key: cord-0058407-9g37kn31 authors: Rathnayake, Chamil; Caliandro, Alessandro title: Repurposing Sentiment Analysis for Social Research Scopes: An Inquiry into Emotion Expression Within Affective Publics on Twitter During the Covid-19 Emergency date: 2021-02-17 journal: Diversity, Divergence, Dialogue DOI: 10.1007/978-3-030-71292-1_30 sha: 2072bcff5c5f6d95be044e28ddd4bff122dc9d6d doc_id: 58407 cord_uid: 9g37kn31

The scope of the article is to discuss and propose some methodological strategies to repurpose sentiment analysis for social research scopes. We argue that sentiment analysis is well suited to study an important topic in digital sociology: affective publics. Specifically, sentiment analysis proves useful to explore two key components of affective publics: a) structure (emergence of dominant emotions); b) dynamics (transformation of affectivity into emotions). To do that, we suggest combining sentiment analysis with emotion detection, text analysis and social media engagement metrics, which help to better understand the semantic and social context in which the sentiment related to a specific issue is situated. To illustrate our methodological point, we draw on the analysis of 33,338 tweets containing two hashtags, #NHSHeroes and #Covidiot, that emerged in response to the global pandemic caused by Covid-19. Drawing on the analysis of the two affective publics aggregating around #NHSHeroes and #Covidiot, we conclude that they reflect a blend of emotions. In some cases, such a generic flow of affect coalesces into a dominant emotion, while in other instances this does not necessarily occur. Affective publics structured around positive emotions and local issues tend to be more consistent and cohesive than those based on general issues and negative emotions. Although negative emotions might attract the attention of digital publics, positively framed messages engage users more.
In this article we propose some methodological reflections on how to repurpose [33] sentiment analysis for social research scopes. In particular, we show how sentiment analysis can be helpful to explore affective publics. Sentiment analysis is widely employed and debated in information science research, but mostly from a computational point of view [1]. In behavioural sciences, sentiment analysis is principally used to measure the reputation of digital entities (e.g. brands or politicians) [2], but it is rarely used to understand collective social phenomena [3]. Surprisingly, sentiment analysis is scarcely employed to explore the affective structures (i.e., dominant emotions) and dynamics (i.e., transformation of affectivity into emotions) of digital publics, which represent crucial objects of study in digital sociology [4]. To support our methodological reflections, we draw on an empirical analysis of Twitter data. Specifically, we focus on Twitter discussions related to Covid-19, a topic that stirred intense emotional reactions within digital publics during the last few months, and analyse two hashtags (#NHSHeroes and #Covidiot), with a special emphasis on the context of the UK. This context is an important 'field' for studying public reactions, particularly due to the intense nature of the sentiments that emerged in social media users' reactions to the pandemic. Arguably, such intensity is related to the fact that the UK was heavily affected by Covid-19, with the total number of fatalities reaching 45,000 by the end of July 2020 [5], as well as to the controversies that arose about the Government's management of the emergency [6].

As we mentioned earlier, sentiment analysis is not particularly popular in sociological research. This is due to some intrinsic limitations of the technique.
Sentiment analysis allows measuring emotions expressed in digital text towards a digital object, such as a product, a brand, an issue, or an individual [7], and its results are mainly expressed through three coding categories: positive, negative, and neutral. These three general categories give the reader the false perception that a certain share of sentiment corresponds to a homogeneous collection of users' opinions towards a given object, whereas it only condenses the emotional tone of a set of keywords contained in the text. Moreover, simply measuring quantities of positive and negative sentiment does not tell us much about the impact that digital affective intensities have on users. All these limitations make sentiment analysis difficult to apply in social research, where understanding the cultural and social context in which a phenomenon is situated is crucial. In our opinion, a privileged sociological field in which to experiment with sentiment analysis is that of affective publics [8], where the focus of research is more on the affective ambience [9] users create around digital content than on the personal opinions they express about it. Papacharissi conceptualizes affective publics as networked publics that primarily mobilize and connect or disconnect through expressions of sentiment [10]. This notion expands boyd's conceptualization of networked publics [11], as it stresses that users who participate in online publics might be materially networked by digital infrastructures (e.g. platforms, hashtags, etc.) but are socially and culturally connected through mutual exchanges of affective intensities. Such affective intensities can have different and unexpected social outcomes, since, as Papacharissi [10] stresses, some affective publics connect through common expressions of sentiment, but others disband because of them.
The notion of affective public offers researchers a useful analytical category to frame collective participation in large and dispersed digital environments (such as social media) as well as to observe the emergence of digital affect cultures [12]. Arguably, affectivity is the property that structures affective publics and keeps them together. Therefore, it is crucial to measure affectivity in order to see which specific emotions dominate in affective forms of engagement or to trace the circulation of affect within digital networks. However, empirical investigations of affective publics are still scarce. In fact, affect is something ephemeral and difficult to capture. Affect is not emotion; it is an initial drive or sense experienced prior to identifying a particular reaction as an emotion [10]. Nevertheless, digital environments allow tracing these two movements, and in particular the materialization of affectivity into specific emotions in user-generated content. Specifically, social media environments provide 'natively digital instruments' [13] to measure emotions (e.g. through digital texts on which sentiment analysis can be run) and to trace their circulation (e.g. through technical features like RTs or like buttons). Thus, it is important that scholars make fuller use of the potential of 'natively digital instruments' to explore affective publics systematically. So far, both quantitative and qualitative research has tended to concentrate on the socio-technical architecture of affective publics, which is conceived as a proxy of affectivity [14]. Quantitative studies focus on massive exchanges of social media metadata (such as RTs or likes) around a given digital content within a short span of time, which they consider as a token of collective manifestation of affectivity [15]. For example, Arvidsson et al. [16] consider teenagers aggregating around the hashtag #onedirection on Twitter as members of an affective public.
This is because One Direction fans use #onedirection not to chat about music, but as a space for exchanging RTs, which they use to express reciprocal emotional support and/or joy regarding specific news (e.g. the announcement of a One Direction concert). Qualitative studies focus more on content, which, nonetheless, is framed as a socio-technical device that channels affectivity. There is an emerging strand of research on affective publics aggregating around visual content [17]. These kinds of studies tend to pay attention to the circulation of repetitive images, showing that they have the capacity to materialize collective affectivity [18]. For example, Döveling et al. [19] show that the affective public emerging around the hashtag #PrayForParis hinges on the circulation of standardized images, which in turn serve to express a common sentiment of grief among a dispersed group of users.

This literature review, however, highlights some methodological gaps. First, there is a scarcity of empirical research on affective publics that takes advantage of sentiment analysis. In fact, the analysis of the textual component of digital content might be strategic for measuring the actual structure and dynamics of affective publics, that is: a) the emergence of dominant emotions; b) the transformation of generic affective intensities into specific emotional forms. Second, few studies try to combine sentiment analysis with engagement metrics (e.g. RTs, favs, likes) in order to understand which kinds of emotions have the power to mobilize affective publics, keep them together or break them apart. Given the gaps highlighted above, we propose some methodological strategies to explore affective publics in a more systematic way. First, we stress the necessity to 'put sentiment into context'. From an analytical point of view, we propose to take into consideration two different kinds of context: semantic and social.
In order to study the semantic context in which a given manifestation of sentiment is situated, we suggest: 1) detecting and distinguishing the actual emotions through which the sentiment manifests (e.g. does a negative sentiment express anger or preoccupation?); and 2) associating emotion analysis with keyword analysis (e.g. how has collective anger been expressed?). In order to study the social context, we suggest analyzing correlations between emotions and social media engagement metrics (e.g. retweets) in order to understand which emotions engage digital publics the most and whether, and to what extent, they are able to mobilize users. In the conclusion we show how these two kinds of analysis turn out to be useful to systematically explore the two key components of affective publics: structure and dynamics. To illustrate our point, we draw on the analysis of 33,338 tweets written in English and marked with the hashtags #NHSHeroes or #Covidiot. These two hashtags were chosen for an empirical study based on sentiment analysis because they allow retrieving texts with clear-cut emotional connotations. Such an approach is appropriate for exploring collective manifestations of sentiments on social media. Specifically, #NHSHeroes has a strong positive connotation, and Twitter users used it to support key workers at the UK National Health Service (NHS) who risked their lives to save lives during the lockdown [20]. Conversely, #Covidiot has a pronounced negative connotation, since it is meant to publicly shame those people who, due to their reckless behaviour and/or opinions (e.g. not respecting social distancing, believing that Covid-19 is a hoax, etc.), represent a threat to public health [21]. We avoided general hashtags like #coronavirus or #WHO, as we expected them to be more neutral in tone and not to concentrate around specific local issues.
Our empirical analysis aims at answering the following research questions: 1) What are the differences in sentiments between the two affective publics aggregating around #NHSHeroes and #Covidiot? Which are the dominant emotions characterizing each affective public? Which are the key terms associated with different manifestations of sentiment for each affective public? 2) How do sentiments correlate with engagement metrics (i.e., number of favourites received, retweets, and quote retweets)? Which kinds of emotions are able to engage and mobilize publics the most? Which have the opposite effect?

We draw on a dataset of 33,338 tweets focusing on the crucial period between 28 March and 4 April 2020, when Covid-19 started spreading rapidly in the UK. Behavioural change was necessary in this period to minimise the pressure on the NHS. The total number of individuals in the UK who tested positive for Covid-19 increased from 17,089 to 41,903 during this period [22]. The dataset was obtained via the Twitter Search API (#Covidiot: 15,391 tweets; #NHSHeroes: 17,947 tweets). These hashtags represented public shaming (#Covidiot) and appreciation of key workers (#NHSHeroes). The dataset did not include retweets, as duplicated text affects sentiment scores. The NRC Emotion Lexicon (EmoLex) [23], included in the R Syuzhet package, was used to detect sentiments in tweets. EmoLex contains word-sense pairs for eight different emotions (i.e., anger, anticipation, disgust, fear, joy, sadness, surprise, and trust), offering detection of sentiments beyond the popular negative-positive polarity. However, the lexicon can also be used to classify content into the two basic sentiment categories. EmoLex includes 14,182 unigrams (words) associated with the above eight emotions. For instance, words such as 'pandemic' and 'abandon' are associated with fear and sadness and classified as negative [24].
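As an illustration of how lexicon-based emotion detection of this kind works in principle, the following minimal Python sketch counts, for each tweet, how many of its words are associated with each category. It is a simplified stand-in and not the Syuzhet implementation used in the study: the two-word `TOY_LEXICON`, the tokenizer and the function names are assumptions made for illustration, and only the associations for 'pandemic' and 'abandon' are taken from the text.

```python
import re

# Hypothetical two-word stand-in for EmoLex. Only the associations reported
# in the text are included; the real lexicon covers 14,182 unigrams.
TOY_LEXICON = {
    "pandemic": {"fear", "sadness", "negative"},
    "abandon": {"fear", "sadness", "negative"},
}

# The eight emotions plus the two polarity categories named in the text.
CATEGORIES = (
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "trust", "negative", "positive",
)


def tokenize(text):
    """Lower-case the text and keep alphabetic runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def score_tweet(text, lexicon=TOY_LEXICON):
    """Return one count per category: the number of word occurrences linked to it."""
    scores = {category: 0 for category in CATEGORIES}
    for token in tokenize(text):
        for category in lexicon.get(token, ()):
            scores[category] += 1
    return scores


example = score_tweet("The pandemic is scary; do not abandon the NHS during a pandemic")
print(example["fear"], example["sadness"], example["trust"])
```

Because a single word can be linked to several categories, one occurrence of 'pandemic' raises the fear, sadness and negative counts at the same time, which is why a tweet can score on more than one emotion.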
The lexicon identifies the word 'prevention' as a positive expression associated with the emotion 'anticipation'. EmoLex was chosen for sentiment detection because it is supported by a wide range of publicly available documentation developed and maintained by an international community of experts, and because it has already gained a reputation in academic research, especially in information science. Several researchers have applied EmoLex to analyse sentiments in Twitter content. For instance, Yu and Wang [25] use this lexicon to analyse temporal changes in sentiments in tweets sent by U.S. sports fans during FIFA World Cup football games. Table 1 shows examples of tweets from each hashtag and sentiment scores calculated using EmoLex. Term frequencies for each sentiment were calculated to examine common words used to express sentiments. Pearson's correlation coefficient was calculated to examine relations between sentiment scores and three engagement metrics (retweet count, quote retweet count, and favourite count). This allows understanding whether hashtags with certain topical orientations mobilise liking or retweeting more than others.

Protecting the NHS was a key focus of the UK government's Covid-19 response strategy, and the slogan "stay at home, protect the NHS, save lives" used in the first phase of the Covid-19 response [26] reflected the need for behavioural control in order to ensure that the health service was not overwhelmed. In general, the Twitter hashtag #NHSHeroes aligned with the government slogan and emerged mainly as an appreciation of NHS staff. The positive framing of the hashtag matched well with offline campaigns, such as Clap for Our Carers [21], that gained nationwide popularity.
Cumulative sentiment scores (CumSS), i.e. the total of sentiment scores within each category, and mean sentiment values (i.e., the cumulative sentiment value divided by the number of tweets in the sample) (Table 2) show that trust is the most prevalent sentiment in #NHSHeroes (CumSS: 15,180). The sentiment with the second highest cumulative score (11,191) was fear. Anticipation also had a high cumulative score (10,477). Other sentiments, such as anger, disgust, and sadness, had considerably lower scores in this hashtag. The most frequent words used to express sentiments in #NHSHeroes (Fig. 1) show that while words such as 'risk', 'pandemic', 'difficult', 'fight', and 'emergency' were used to express fear, words such as 'safe', 'proud', 'team', 'hope', 'lovely', 'brilliant', and 'clap' were used to express trust and anticipation. The above results indicate that #NHSHeroes primarily includes a blend of trust, anticipation, and fear, and that this blend captures the UK public reaction to the NHS. While the above analysis provides an overall perspective, EmoLex identified several words, such as 'hospital', as relevant to multiple sentiments. This is not inaccurate, as words can carry multiple sentiments.

The Twitter hashtag #Covidiot is a marker used to publicly shame those who disregard social distancing measures. This hashtag was used more internationally than #NHSHeroes, which mobilised engagement around the local Covid-19 response in the UK. Cumulative and mean sentiment scores for #Covidiot are shown in Table 3. The results showed that trust, fear, and sadness dominate #Covidiot. However, the top words used to detect sentiments (Fig. 2) indicated that EmoLex identified words such as 'trump' and 'president' as positive. This is not inaccurate in lexical terms, as the word 'trump' connotes victory and 'president' reflects positivity.
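The relation between cumulative and mean sentiment scores defined above can be sketched as follows. The function and variable names are hypothetical. The final lines apply the stated definition (cumulative score divided by number of tweets) to the figures reported in the text for trust in #NHSHeroes; the resulting value is derived here for illustration and is not copied from Table 2.

```python
def cumulative_and_mean(per_tweet_scores):
    """Aggregate per-tweet emotion counts into cumulative and mean scores.

    per_tweet_scores: a list of dicts mapping an emotion name to its count
    in one tweet (for example, the output of a lexicon-based scorer).
    """
    n_tweets = len(per_tweet_scores)
    if n_tweets == 0:
        raise ValueError("at least one tweet is required")
    cumulative = {}
    for scores in per_tweet_scores:
        for emotion, value in scores.items():
            cumulative[emotion] = cumulative.get(emotion, 0) + value
    # Mean sentiment value = cumulative score / number of tweets in the sample.
    mean = {emotion: total / n_tweets for emotion, total in cumulative.items()}
    return cumulative, mean


# Worked arithmetic with the figures given in the text for #NHSHeroes.
reported_trust_cumss = 15180   # cumulative trust score reported in the text
nhsheroes_tweets = 17947       # number of #NHSHeroes tweets in the dataset
implied_mean_trust = reported_trust_cumss / nhsheroes_tweets
print(round(implied_mean_trust, 3))
```

Dividing by the sample size is what makes the two hashtags comparable, since the #NHSHeroes and #Covidiot samples differ in size (17,947 versus 15,391 tweets).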
However, upon manual inspection, we observed that a large number of tweets in #Covidiot were used to criticise President Trump's leadership in the context of the pandemic. The frequent appearance of terms such as 'trumpvirus', 'clustertrump', 'trumpliespeopledie', and 'covidiotinchief' indicates that the hashtag has largely been used as a space for criticising President Trump. When the political use was excluded, fear and sadness were the dominant sentiments in this hashtag. Words such as 'pandemic', 'die', 'death', 'dying', 'risk', 'kill', 'bad', 'sick', and 'late' were used frequently to express fear and sadness. Bigram frequencies were calculated to examine content beyond sentiment analysis. Hashtags included in tweets were retained in the bigram analysis, since hashtags operate on multiple levels of meaning, incorporating both untagged and tagged language, and enact an inward and outward facing metadiscourse within and across posts [27, 28]. Table 4 provides the 10 most frequently used bigrams after removing slight variants. The bigrams clearly show that #NHSHeroes is largely an issue public that emerged in appreciation of the NHS in the UK. The bigram combining 'nhsthankyou' and 'nhsheroes' had the highest frequency (1627) in #NHSHeroes. The results confirm that #Covidiot included a range of frequently used bigrams, such as 'covidiottrumpvirus' (n = 99), 'covidiottrumpgenocide' (n = 61), and 'trumpliespeoplediecovidiot' (n = 93), that directly attacked President Trump.

Results given in Tables 2 and 3 and visualised in Fig. 3(a) show that trust, fear, and anticipation in #NHSHeroes were considerably higher than in #Covidiot. Mean values for joy and surprise were also higher in #NHSHeroes. Moreover, there were lower levels of anger and disgust in #NHSHeroes than in #Covidiot. In general, these results show that while positive sentiments were used frequently to discuss the pandemic in #NHSHeroes, it contained fewer negative sentiments than #Covidiot (see Fig. 3b and c).
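The bigram frequency step described above can be illustrated with a short Python sketch. It is an assumption-laden simplification: the tokenizer, the function names and the two example tweets are invented for illustration, and the sketch does not reproduce the removal of slight variants mentioned in the text. Hashtags are kept as ordinary tokens (with the '#' stripped), in line with the decision to retain them in the analysis.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and keep alphanumeric runs, so '#NHSHeroes' becomes 'nhsheroes'."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bigram_counts(tweets):
    """Count adjacent word pairs within each tweet (pairs never span two tweets)."""
    counts = Counter()
    for tweet in tweets:
        toks = tokens(tweet)
        counts.update(zip(toks, toks[1:]))
    return counts


# Invented example tweets, used only to show the mechanics.
sample = [
    "Thank you #NHSThankYou #NHSHeroes",
    "Proud of you all #nhsthankyou #nhsheroes",
]
for pair, n in bigram_counts(sample).most_common(3):
    print(" ".join(pair), n)
```

Pairs of adjacent hashtags, such as 'nhsthankyou nhsheroes', surface naturally in this kind of count because users often append several hashtags in sequence at the end of a tweet.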
Examining the effects of such sentiments can help understand how affective publics organise around the NHS. Table 5 provides correlation statistics between sentiments and engagement metrics (i.e., retweet count, quote retweet count, and favourite count). We found only minimal correlations in both hashtags. While joy had a low positive correlation with the retweet count in #Covidiot (r = .019, p < .05), sadness correlated positively with the retweet count in #NHSHeroes. Correlations between sentiment scores and quote retweet count were also not noteworthy. While surprise correlated with the quote retweet count in #Covidiot (r = .039, p < .05), joy correlated negatively with the quote retweet count in #NHSHeroes (r = −.032, p < .05). These correlations do not show any convincing mobilisation. However, we observed that all the sentiments except anger and disgust correlated positively with the number of favourites in #NHSHeroes, while there were no significant correlations in #Covidiot for the same metric. This shows that the intensity of sentiments in both hashtags is not associated with engagement via retweeting. However, sentiments in #NHSHeroes triggered substantial engagement via liking.

Enacting a constructive public discourse is crucial for the effectiveness of the UK national response to Covid-19, as it allows behavioural change on a wider level, as opposed to reactive measures such as penalties. Scholars have argued that social media users are primarily organised via affective forms of engagement that ultimately drive their behaviour [8]. A crucial step in understanding the 'health' of such a discourse is to detect the emotions expressed in social media posts that continue to accumulate in networked issue publics. Positive emotions and significant correlations between sentiment scores and favourite count show that #NHSHeroes, as a Twitter public, has mobilised users, at least by encouraging acts of liking to a significant level.
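To make the correlation step concrete, the following Python sketch computes Pearson's correlation coefficient between two equally long lists (for example, per-tweet joy scores and retweet counts) and the associated t statistic. The function names are hypothetical and this is not the authors' code. The final lines illustrate why a very small coefficient can still be statistically significant in a large sample, under the assumption (not stated explicitly in the text) that the correlation was computed over all 15,391 #Covidiot tweets.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n observations (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Illustration with the coefficient reported for joy and retweets in #Covidiot.
# The sample size of 15,391 is an assumption (the full #Covidiot sample).
t_joy = t_statistic(0.019, 15391)
print(round(t_joy, 2), t_joy > 1.96)  # 1.96 is the large-sample two-sided 5% threshold
```

With more than fifteen thousand observations, even a coefficient of .019 exceeds the conventional 5% threshold, which is consistent with reading these correlations as statistically detectable but substantively very weak.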
Conversely, such mobilisation was not present in #Covidiot. This indicates that the Twitter hashtag #NHSHeroes gathers a more intense set of positive affective reactions than #Covidiot and that it has been more successful in mobilising users. This shows that a local focus and the valorisation of key workers are more effective in mobilising engagement in the context of Covid-19. Collective shaming and more general framing of the discourse using markers such as #Covidiot do not mobilise engagement, at least within the limits of our data. It should, however, be noted that the positive effect that we observe in #NHSHeroes is only slight, as correlations were low. In general, the lack of strong correlations between engagement metrics and sentiments shows that expression of sentiment, rather than collective engagement, is the primary function of both hashtags, particularly of #Covidiot. This supports Papacharissi's [10] claim that affective publics facilitate connective rather than collective action. However, the above correlations should not be underestimated, as facilitating emotion expression is itself a significant role that social media can play in the context of the pandemic. As Marwick and boyd pointed out [29], people send tweets for a variety of reasons, from micro-celebrity practices to keeping a diary, and they do not necessarily expect audience engagement. In this case, the dominance of positive sentiments in #NHSHeroes indicates alignment of user emotions with desirable behaviour in response to the pandemic. Positive reactions in #NHSHeroes are consistent to some extent with UK public attitudes towards the NHS. A public satisfaction survey conducted in 2019 showed that overall public satisfaction with the NHS was at 60% [30]. Another report showed that, by 2015, 89% of a sample of the British public supported a publicly funded national healthcare system [31]. Our results discussed above reflect those positive public attitudes towards the NHS.
For instance, the tweet given in Fig. 4a received high trust (value: 5) and anticipation (value: 5) scores in our analysis, and it shows positive attitudes towards NHS staff. A previous survey of public attitudes also indicated that 43% of respondents did not see any improvement in the NHS [31] and that the majority of people would prefer extra funding for the NHS [30]. Therefore, some negative emotions relate to the lack of funding for the NHS and the consumer logic that it has embraced [31]. The tweet that received a high anger score (value: 6) in our analysis (Fig. 4b), for instance, directly shows anger towards the lack of personal protective equipment in hospitals.

Fig. 4. (a) A tweet with positive sentiments that contains high trust; (b) a tweet with negative sentiments that contains high anger.

The basic sentiment analysis indicates, unsurprisingly, that #NHSHeroes is characterized by an overall positive sentiment, as opposed to more negative reactions in #Covidiot. However, a more refined sentiment analysis that combines emotion detection, text analysis and engagement metrics provides a more comprehensive view. Specifically, it allows exploring the emotional structure of the affective publics aggregating around #NHSHeroes and #Covidiot, as well as the dynamics through which affect coalesces into specific emotional forms. Regarding #NHSHeroes, we observe that users do not simply post generic positive opinions about the NHS and its workers. Instead, they collectively express a sentiment of trust towards them. Moreover, text analysis allows reconstructing the narratives users articulate within these collective expressions of sentiment: although users are concerned about the Covid-19 'emergency' and the 'risks' and 'difficulties' brought about by the 'pandemic', they feel 'safe' because of the 'brilliant' work of the NHS 'team', of which they are 'proud'. Conversely, the text analysis indicates that the hashtag #Covidiot is 'hijacked' by some users who make it 'political'.
At least within the limits of our data, #Covidiot is used to a great extent as a pretext for criticizing Donald Trump rather than as a generic tool of public shaming. In general, President Trump is portrayed as the prototypical 'Covidiot'. This narrative stirs a bundle of negative emotions, like fear, sadness, and anger. It should also be noted that the #Covidiot affective public seems to represent a mixed bag of emotions in which no single emotion emerges as dominant. In fact, the public is dominated by two opposite emotions: trust (0.469) and fear (0.452). This ambiguity is probably due to the underlying inconsistency of the #Covidiot hashtag itself: the hashtag is meant to ridicule covidiots, but in practice it is used to criticize President Trump. Finally, we observe that #NHSHeroes, as a locally oriented Twitter public, has mobilized users, at least by encouraging acts of liking. This shows that regionally focused and more positive messages can trigger more reactions from users than more general and negatively framed messages.

In conclusion, affective publics reflect a blend of emotions. In some cases, such a generic flow of affect coalesces into a dominant emotion. Affective publics structured around positive emotions and local issues tend to be more consistent and cohesive than those based on negative emotions. Although negative emotions might attract the attention of networked publics, positively framed messages can engage users more. We acknowledge that these results are just preliminary and that more (and more diverse) affective publics must be investigated to verify our conclusions. Moreover, our analysis has only started investigating the close nexus between the affectivity flowing within networked publics and the specific emotions into which the affective flow fixes itself. To do that, we proposed to use a quantitative technique, sentiment analysis, which allows measuring both affect and emotions.
However, further research needs to be done to fully understand the above-mentioned nexus. For example, researchers might observe changes in sentiment over time, in order to see whether, within an affective flux, specific emotions endure over time and why. As far as the analysis of semantic contexts is concerned, researchers can combine text analysis with co-hashtag analysis in order to better understand the meaning of sentiment and its social use. Lastly, as far as user engagement is concerned, further research might investigate how sentiment 'behaves' in different social formations (like publics, communities or crowds) and to what extent it is crucial to keep them alive, active and proactive. Finally, we believe that understanding the sentiments that dominate the current Covid-19 discourse is crucial for understanding 'collective emotions' as well as for developing intervention strategies as the UK struggles to defeat the pandemic. Moreover, affect-based engagement strategies are necessary for the development of coping strategies in a post-pandemic society, especially given the seriousness of the trauma that the public has had to endure.

Affective computing and sentiment analysis
Twitter brand sentiment analysis: a hybrid system using n-gram analysis and dynamic artificial neural network
How censorship in China allows government criticism but silences collective expression
Digital Sociology: The Reinvention of Social Research
COVID-19) in the UK
The Guardian view of Boris Johnson's crisis: blunder after blunder
Qualitative Research in Digital Environments: A Research Toolkit. Routledge
Affective Publics: Sentiment, Technology, and Politics
Ambient affiliation: a linguistic perspective on Twitter
Affective publics and structures of storytelling: sentiment, events and mediality
A networked self: Identity, Community, and Culture on Social Network Sites
Sensation Networks and the GIF: Toward an Allotropic Account of Affect in Networked Affect
The online crowd: a contradiction in terms? On the potentials of Gustave Le Bon's crowd psychology in an analysis of affective blogging
Brand public
Crowds and value: Italian directioners on Twitter
'Writing' oneself into tragedy: Visual user practices and spectatorship of the Alan Kurdi images on Instagram
#Funeral and Instagram: Death, social media, and platform vernacular
From mediatized emotion to digital affect cultures: new technologies and global flows of emotion
What Does "Covidiot" Mean, and Who Qualifies as One? (2020). www.health
Clap for our carers
Total UK COVID-19 cases update
Crowdsourcing a word-emotion association lexicon
World Cup 2014 in the Twitter world: a big data analysis of sentiments in U.S. sports fans' tweets
Coronavirus: stay at home, protect the NHS, save lives - web version
Searchable talk: the linguistic functions of hashtags
Searchable talk: Hashtags and social media metadiscourse
I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience
Public satisfaction with the NHS and social care in 2019: Results and trends from the British Social Attitudes survey
Public attitudes to the NHS: an analysis of responses to questions in the
The business of the NHS: the rise and rise of consumer culture and commodification in the provision of healthcare services
The end of the virtual: Digital methods