key: cord-0614692-qb0lq5vm
authors: Vegt, Isabelle van der; Kleinberg, Bennett
title: Women worry about family, men about the economy: Gender differences in emotional responses to COVID-19
date: 2020-04-17
journal: nan
DOI: nan
sha: 32bf7ad3fdda71036b48f7dc85cad407674277b7
doc_id: 614692
cord_uid: qb0lq5vm

Among the critical challenges around the COVID-19 pandemic is dealing with potentially detrimental effects on people's mental health. Designing appropriate interventions and identifying the concerns of those most at risk requires methods that can extract worries, concerns and emotional responses from text data. We examine gender differences and the effect of document length on worries about the ongoing COVID-19 situation. Our findings suggest that i) shorter texts do not offer as adequate an insight into psychological processes as longer texts. We further find ii) marked gender differences in topics concerning emotional responses. Women worried more about their loved ones and severe health concerns, while men were more occupied with effects on the economy and society. The findings align with general gender differences in language found elsewhere, but the current unique circumstances likely amplified these effects. We close this paper with a call for more high-quality datasets due to the limitations of Tweet-sized data.

The COVID-19 pandemic is having an enormous effect on the world, with alarming death tolls, strict social distancing measures, and far-reaching consequences for the global economy. In order to mitigate the impact of the virus on mental health, it is crucial to gain an understanding of the emotions, worries and concerns the situation has brought about in people worldwide. Text data are rich sources to help with that task, and computational methods potentially allow us to extract information about people's worries on a larger scale than ever.
In previous work, the COVID-19 Real World Worry Dataset was introduced, consisting of 5,000 texts (2,500 short + 2,500 long) in which participants were asked to express their emotions regarding the virus [1]. In the current study, we delve deeper into the data and explore potential gender differences with regard to real-world worries about COVID-19. Building on a substantial evidence base for gender differences in language, we examine whether linguistic information can reveal whether and how men and women differ in how they respond to the crisis. Importantly, we also examine whether men or women are potentially more affected than the other, which may hold implications for developing mitigation strategies for those who need them the most.

The Real World Worry Dataset (RWWD) is a text dataset collected from 2,500 people, where each participant was asked to report their emotions and worries regarding COVID-19 [1]. Participants selected the most appropriate emotion describing what they were experiencing, choosing from anger, anxiety, desire, disgust, fear, happiness, relaxation, or sadness. They then rated the extent to which they worried about the COVID-19 situation and scored each of the eight emotions on a 9-point scale. Participants, all of whom were social media users, then wrote a long text (avg. 128 tokens) and a short, Tweet-sized text (avg. 28 tokens) about said emotions and worries. Instructions read: "write in a few sentences how you feel about the Corona situation at this very moment. This text should express your feelings at this moment."

A large body of research has studied gender differences in language. Researchers have adopted both closed- and open-vocabulary approaches. A closed vocabulary refers to dictionary approaches, where gender differences are measured through predefined word lists and categories.
The LIWC (Linguistic Inquiry and Word Count) is a prominent example that measures linguistic categories (e.g., pronouns and verbs), psychological processes (e.g., anger and certainty), and personal concerns (e.g., family and money) [2]. The LIWC outputs the percentage of a document that belongs to each category. An open-vocabulary approach is data-driven, in that gender differences are assessed without the use of predefined concepts. For example, n-grams or topics may be used to study gender differences.

In a closed-vocabulary study of 14,324 text samples from different sources (e.g., stream-of-consciousness essays, emotional writing), gender differences for LIWC categories were examined [3]. For example, it was found that women used more pronouns (Cohen's d = 0.36), words referring to emotions (d = 0.11), including anxiety (d = 0.16) and sadness (d = 0.10), as well as social words referring to friends (d = 0.09) and family (d = 0.12), and past-tense (d = 0.12) and present-tense (d = 0.18) verbs. On the other hand, men used more articles (d = -0.24), numbers (d = -0.15), swear words (d = -0.22), and words referring to their occupation (d = -0.12) and money (d = -0.10).

Another approach partially replicated these results, showing that gender differences also emerged for language on social media (Facebook status updates from 75,000 individuals) [6]. In addition to examining differences in LIWC categories, the authors extended their approach to a data-driven, open-vocabulary approach, using both n-grams and topics. For example, unigrams such as 'excited', 'love', and 'wonderful' were used more frequently by women, whereas unigrams such as 'xbox', 'fuck', and 'government' were used more by men. In terms of topics, women more often mentioned family and friends and wished others a happy birthday, whereas men spoke more often about gaming and governmental/economic affairs [6].
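To make the closed-vocabulary idea concrete, the following minimal Python sketch computes the percentage of a document's tokens that fall into each category of a word list. The toy lexicon, the simple tokeniser, and the function names are illustrative assumptions; the actual LIWC dictionary is proprietary and far larger, and this is not the LIWC software's implementation.

```python
import re

# Hypothetical toy lexicon for illustration only (not the LIWC dictionary).
TOY_LEXICON = {
    "family": {"family", "mother", "father", "children", "parents"},
    "money": {"money", "economy", "income", "pay", "job"},
    "anxiety": {"worried", "anxious", "scared", "afraid"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_percentages(text, lexicon):
    """Return, per category, the percentage of tokens found in its word list."""
    tokens = tokenize(text)
    if not tokens:
        return {category: 0.0 for category in lexicon}
    return {
        category: 100.0 * sum(token in words for token in tokens) / len(tokens)
        for category, words in lexicon.items()
    }


scores = category_percentages("I am worried about my family and my job", TOY_LEXICON)
```

In this toy sentence, one of the nine tokens belongs to each category, so each category receives the same percentage. An open-vocabulary approach would instead derive the features (n-grams or topics) from the corpus itself.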
With the current paper, we aim to address the timely need to understand emotional reactions to the "Corona crisis". A step towards finding intervention strategies for individuals in need of mental health support is gaining an understanding of how different groups are affected by the situation. We examine the role of gender in the experienced emotions as well as potential manifestations of the emotional responses in texts. Specifically, we look at gender differences i) in self-reported emotions, ii) in the prevalence of topics using a correlated topic model approach [7], iii) between features derived from an open-vocabulary approach, and iv) in psycholinguistic constructs. We use the COVID-19 Real World Worry Dataset and thereby also test whether the differences emerge similarly in long and Tweet-sized texts.

The RWWD contains 5,000 texts from 2,500 individuals, each of whom wrote a long and a short text expressing their emotions about COVID-19. We applied the same base exclusion criteria as [1] (i.e., excluding nine participants who padded their statements with punctuation), and further excluded those participants who did not report their gender (n = 55). The sample used in this paper consisted of n = 2,436 participants, 65.15% female, 34.85% male.

We examined whether there were gender differences in the self-reported emotion scores (i.e., how people felt about COVID-19) using Bayesian hypothesis testing [8, 9]. In short, we report the Bayes factor, BF10, which expresses the degree to which the data are more likely to occur under the hypothesis that there is a gender difference, compared to the null hypothesis (i.e., no gender difference). For example, BF10 = 20 means that the data are 20 times more likely under the hypothesis that there is a gender difference. Importantly, Bayesian hypothesis testing allows for a quantification of support for the null hypothesis, too.
While a BF10 = 1 implies that the data are equally likely under both hypotheses, a BF10 smaller than 1 indicates support for the null, since $BF_{01} = 1 / BF_{10}$.

Prior to topic modelling, all text data were lower-cased and stemmed, and all punctuation, stopwords, and numbers were removed. First, we assess whether there are differences in topics between long and short texts. We construct a topic model for all texts (long + short) and select the number of topics by examining values for semantic coherence and exclusivity [10, 11]. The topic model assigns each document a probability of belonging to each topic. Here, we assign each document to its most likely topic (i.e., the highest topic probability for the document). We use a Chi-square test to examine whether there is an association between document type (long vs short) and topic occurrence. Standardised residuals (z-scores) are used to assess what drives a potential association.

Gender differences in topic occurrences are assessed in the same way for long and short texts separately. A Chi-square test is applied to test for an association between gender (male vs female) and topic occurrence.

In addition, we look at differences in n-grams (unigrams, bigrams, trigrams) without the assumption of topic membership. Specifically, we calculate Bayes factors and Cohen's d effect sizes for male vs female comparisons on all n-grams to explore how both genders differ. We conduct that analysis for the short and long texts separately.

Since n-grams might not capture higher-order constructs (e.g., analytical thinking, anxiety) and psychological processes, we run the same analysis as in Section 2.4 using the LIWC 2015 [2].

We compared the self-reported emotions (ranging from 1 = very low to 9 = very high) by calculating Bayes factors and Cohen's d effect sizes. Table 1 suggests that there was extreme evidence (based on the Bayes factor [8]) that women were more worried, anxious, afraid and sad than men (ds ranging from 0.35 to 0.46).
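The two quantities reported for each comparison can be illustrated with a short Python sketch: Cohen's d and the conversion of BF10 into support for the null. The pooled-standard-deviation form of Cohen's d, the function names, and the toy ratings are assumptions made for illustration; the text does not state which variant of d was used, and the estimation of the Bayes factor itself (which depends on the prior specification) is not shown.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant).

    A positive value means group_a scored higher than group_b.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def bf01_from_bf10(bf10):
    """Support for the null relative to the alternative: BF01 = 1 / BF10."""
    return 1.0 / bf10


# Hypothetical 9-point ratings, not taken from the dataset.
women_worry = [7, 8, 9]
men_worry = [5, 6, 7]
d = cohens_d(women_worry, men_worry)
bf01 = bf01_from_bf10(20)
```

With women passed as the first group, a positive d corresponds to higher scores for women, which matches the sign convention used for the effect sizes in the text.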
There was strong evidence that women were angrier than men (d = 0.16). Conversely, men reported considerably more desire and more relaxation than women. We also assessed whether gender was associated with the "best fitting" chosen emotion. A Chi-square test, χ²(7) = 43.83, p < .001, indicated an association between the chosen emotion and gender. Standardised residuals showed that this effect was driven by females choosing fear significantly more often than males (z = 2.62) and males choosing relaxation significantly more often than females (z = -4.40). Thus, while anxiety was overall the most chosen emotion (55.36%, see [1]), gender did play a role in the preference for fear and relaxation.

Long vs short texts. For the topic model with long and short texts, 15 topics were selected based on semantic coherence and exclusivity of topic words. A significant association was found between text length (long vs. short) and topic occurrence, χ²(14) = 1776.6, p < .001. The six topics that differed most (i.e., highest standardised residuals) between long and short texts are depicted in Table 2. Long texts were more likely to concern worries about both family and work, as well as the societal impact of the virus. Short texts were more likely to concern lockdown rules, staying home, and negative emotions.

Note. A positive standardised residual indicates that the topic was more likely to occur in long texts.

The topic model for long texts contained 20 topics. For this model, we observed a significant association between gender and topic occurrence, χ²(19) = 140.02, p < .001. Table 3 shows the topics with the largest gender difference and suggests that men were more likely to write about the national impact of the virus, worries about health, and the future outlook regarding COVID-19. Women spoke more about family and friends, as well as sickness and death.

We selected a model with 15 topics to represent the short texts.
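The association tests reported in this section can be sketched in a few lines of Python: each document is assigned to its most likely topic, and a group-by-topic contingency table is then tested with a Chi-square statistic and inspected through standardised residuals. The function names and toy counts are hypothetical, and the use of adjusted standardised residuals (the form returned as `stdres` by R's `chisq.test`) is an assumption, since the text does not specify the formula. The p-value lookup is omitted.

```python
from math import sqrt


def most_likely_topic(topic_probs):
    """Index of the topic with the highest probability for one document."""
    return max(range(len(topic_probs)), key=topic_probs.__getitem__)


def chi_square_with_residuals(table):
    """Chi-square statistic, degrees of freedom, and adjusted standardised
    residuals for a contingency table given as a list of rows of counts.

    Assumes at least two rows and two columns, all with non-zero totals.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjusted residual: approximately standard normal under independence.
            scale = sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            residual_row.append((observed - expected) / scale)
        residuals.append(residual_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, residuals


# Hypothetical counts: rows = long vs short texts, columns = two topics.
toy_table = [[30, 10], [10, 30]]
chi2, dof, residuals = chi_square_with_residuals(toy_table)
```

For a table with two rows, the residuals of the two rows are mirror images of each other, which is why a single signed value per topic suffices to show which group drives the association.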
Here, we also observed a significant association between gender and topic occurrence, χ²(14) = 101.47, p < .001. Women spoke more about missing family and friends, and included more calls for solidarity. In contrast, men spoke more about the international impact of the virus, governmental policy, and returning to normal.

Note. Positive standardised residuals indicate that the topic occurrence was higher for women than for men, and vice versa.

Before extracting unigrams, bigrams and trigrams, the corpus was lower-cased and stopwords and punctuation were removed. Table 4 indicates that, in long texts, women use "anxious" (both in unigram and in bigram/trigram form) markedly more often than men, and mention "family" and "children" more often. The findings are partly corroborated for short texts, which, in addition, include the unigrams "scared" and "scary". Interestingly, unigrams which were more frequently used by men were "hopefully" and "calm". Broadly, these findings reflect the differences found using a topic-based approach, where women expressed more fear and men were more likely to write about a (hopeful) return to normal. We also observe that n-gram-based differences are more pronounced in the longer texts than in the shorter ones.

To capture potential differences in higher-order constructs, we also looked at gender differences for the LIWC variables. Table 5 suggests that men had a higher score on analytical thinking, used more articles and more "big" words (more than six letters). Women, on the other hand, used more pronouns (all, personal pronouns, and first-person pronouns), more verbs, and expressed more anxiety and references to their family. We also observe that women had a substantially higher focus on the present than men. For short texts, we see that the differences are less pronounced (Bayes factors lower than ten only constitute moderate evidence [8] and are ignored here).
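The n-gram extraction step described above (lower-casing, removing stopwords and punctuation, then counting unigrams, bigrams and trigrams) can be sketched as follows. The short stopword list, the regular-expression tokeniser, and the function names are illustrative assumptions, not the preprocessing pipeline actually used for the reported results.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list for illustration.
STOPWORDS = {"a", "an", "and", "the", "i", "am", "is", "of", "to", "my", "about"}


def preprocess(text):
    """Lower-case, drop punctuation and digits, and remove stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=3):
    """Counts of all unigrams up to max_n-grams in one document."""
    tokens = preprocess(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


counts = ngram_counts("I am anxious about my family, very anxious.")
```

Per-document counts of this kind could then be compared between men and women with the same effect size and Bayes factor machinery used for the self-reported emotions.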
The data show that men scored higher on references to power and used more articles, while women used more conjunctions and scored higher on authenticity.

To better understand gender differences in emotional responses to COVID-19 on the psycholinguistic level, we zoom in on three LIWC clusters (Table 6). We look at the clusters "personal concerns", "drives", and "time orientation", each of which consists of sub-categories (e.g., concerns: work, death; drives: risk, achievement; time orientation: future, present). Men scored higher on the variables risk (e.g., cautious, dangerous), work, and money, whereas women had higher values for affiliation (e.g., friend, party, relatives), home (e.g., neighbour, pet) and present. Again, these bottom-up findings seem to align with the topic models from a psycholinguistic angle.

This study elucidated gender differences in emotional responses to COVID-19 in several (linguistic) domains. Analyses were performed for both long and shorter, Tweet-sized texts. We first discuss the implications of using the respective text lengths for understanding emotions and worries. Then, we review the observed gender differences and relate them to previous research on this topic. Lastly, some potential future avenues for understanding worries from text data are identified.

In previous work using the same dataset, it was suggested that important differences emerge when participants are asked to write about their worries in long versus short, Tweet-sized texts. In topic models that were constructed for long and short texts separately, longer texts seemed to shed more light on the worries and concerns of participants. In contrast, shorter texts seemed to function as a call for solidarity [1]. In the current study, we were able to statistically test that idea by constructing a topic model for both text types together.
Indeed, when testing for an association between text length and topic occurrence, we found that topics significantly differed between text types. Our results confirmed that short Tweet-like texts more frequently referred to calls to 'stay at home, protect the NHS, save lives' (a UK government slogan during the crisis). Longer texts more effectively elucidated the specific worries participants had, including those about their family, work, and society.

The apparent differences between long and Tweet-sized texts emphasise that researchers need to exercise caution in relying largely on Twitter datasets to study the COVID-19 crisis and other, more general social phenomena. Indeed, several Twitter datasets have been released [12-16] for the research community to study responses to the Corona crisis. However, the current study shows that such data may not be useful if we are interested in gaining a deep understanding of emotional responses and worries. The exact reasons for that difference (e.g., possibly different functions that each text form serves) merit further research attention. However, the observation that Tweet-sized texts fail to convey emotions and worries about COVID-19 to the extent that longer texts do, on top of the classical limitations of social media data (e.g., demographic biases [17], participation biases [18], or data collection biases [19]), is reason to be more cautious for issues as timely and urgent as mental health during a global pandemic. Ultimately, making use of the convenience of readily available social media data might come at the expense of capturing people's concerns and could lead to skewed perceptions of these worries. We urge the research community to (re-)consider the importance of gathering high-quality, rich text data.

Gender differences emerged in each domain that was studied in this paper.
Reported emotions showed that women were more worried, anxious, afraid, and sad than men, and these results were supported by linguistic differences. For instance, topic models suggested that women discussed sickness and death more than men. N-gram differences showed that women used 'anxious', 'sad', and 'scared' more than men, and LIWC differences showed that women used more words related to anxiety. This is not to say that men did not worry about the crisis, as reported negative emotions were still relatively high (e.g., an average score of 6 out of 9 for worrying). Furthermore, topic models showed that men wrote more frequently about worries related to their health than women.

The results further illustrated differences in what men and women worry about with regard to the crisis. Women's focus on family and friends emerged consistently in the topic models, n-gram differences, and LIWC differences. On the other hand, men more frequently worried about national and international impact (topic models) and wrote more frequently about money and work than women (LIWC). All in all, these results seem to follow previous work on gender differences in language more generally. For example, similar to our results, previous studies have found that women in general score higher than men on the LIWC categories for anxiety, family, and home [3]. Our results also seem to have further replicated that men use more articles, and use the LIWC categories money and work more often than women [3].

In light of these findings, it is of key importance to discern whether the gender differences that emerged in this study are specific to COVID-19 worries, or are simply a reflection of gender differences that emerge regardless of the issue in question. There are some indications in our data to imply that the differences are in line with general gender differences but in a more pronounced way.
For example, previous work [3] reported gender differences in LIWC categories such as anxiety, family, articles, and money. All of these are present in our data as well but often with an effect twice the size. Thus, it is possible that the COVID-19 situation and the accompanying toll on mental health exacerbated the language differences between men and women further. If we follow the line of [3], the intensified linguistic gender differences can be interpreted as serving different functions for men and women. It has been suggested that women predominantly discuss other people and internal, emotional processes, while men focus more on external events and objects [3, 6]. Similar patterns are discoverable in our data, where women were more likely to discuss loved ones and their own negative emotions, whereas men were more likely to write about the external effects of the virus on society. All in all, the current results seem to fall in line with previous empirical work as well as persisting gender stereotypes.

For the social and behavioural sciences during and after the Corona crisis, a principal research question revolves around the mental health implications. The current study leveraged text data to gain an understanding of what men and women worry about. At the same time, it is of vital interest to develop mitigation strategies for those who need them the most. While this paper attempted to shed light on gender differences, other fundamental questions still need answering. First, relatively little is known about the manifestation of emotions in language and the subsequent communication of it in the form of text data (e.g., how good are people at expressing their emotions and, importantly, which emotions are better captured computationally than others?). Second, the type of a text (e.g., stream-of-consciousness essay vs pointed topical text vs Tweet) seems to determine the findings to a great extent (i.e., different topics, different effect sizes).
Ideally, future research can illuminate how the language and aims of an individual change depending on the type of text. Third, to map out the worries on a large scale and use measurement studies to understand the concerns people have, we need to devote attention to prediction models built on high-quality ground truth data.

The current paper examined gender differences in worries related to the COVID-19 pandemic. Gender was related to the reported emotions, topics, n-grams, and psycholinguistic constructs. Women worried mainly about loved ones and expressed negative emotions such as anxiety and fear, whereas men were more concerned about the broader societal impacts of the virus. The results emphasise that longer texts, as opposed to short Tweet-sized texts, more adequately reflect emotions and worries.

[1] Measuring Emotions in the COVID-19 Real World Worry Dataset
[2] The Development and Psychometric Properties of LIWC2015. The University of Texas at Austin
[3] Gender Differences in Language Use: An Analysis of 14,000 Text Samples
[4] Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs
[5] Statistical Power Analysis for the Behavioral Sciences
[6] Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach
[7] A correlated topic model of Science. Annals of Applied Statistics
[8] Bayesian Hypothesis Testing: An Alternative to Null Hypothesis Significance Testing (NHST) in Psychology and Social Sciences
[9] Bayesian hypothesis testing for psychologists: A tutorial on the Savage-Dickey method
[10] stm: R Package for Structural Topic Models
[11] Optimizing Semantic Coherence in Topic Models
[12] A Twitter Dataset of 150+ million tweets related to COVID-19 for open research
[13] #COVID-19: The First Public Coronavirus Twitter Dataset. Python
[14] Corona Virus (COVID-19) Tweets Dataset
[15] Coronada: Tweets about COVID-19. Python
[16] TWITA - Long-term Social Media Collection at the University of Turin
[17] Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose
[18] Crowdsourcing Subjective Perceptions of Neighbourhood Disorder: Interpreting Bias in Open Data
[19] Tampering with Twitter's Sample API