Racism is a Virus: Anti-Asian Hate and Counterhate in Social Media during the COVID-19 Crisis

Caleb Ziems, Bing He, Sandeep Soni, Srijan Kumar
May 25, 2020

The spread of COVID-19 has sparked racism, hate, and xenophobia in social media targeted at Chinese and broader Asian communities. However, little is known about how racial hate spreads during a pandemic and the role of counterhate speech in mitigating the spread. Here we study the evolution and spread of anti-Asian hate speech through the lens of Twitter. We create COVID-HATE, the largest dataset of anti-Asian hate and counterhate spanning three months, containing over 30 million tweets and a social network with over 87 million nodes. By creating a novel hand-labeled dataset of 2,400 tweets, we train a text classifier to identify hate and counterhate tweets that achieves an average AUROC of 0.852. We identify 891,204 hate and 200,198 counterhate tweets in COVID-HATE. Using this data to conduct longitudinal analysis, we find that while hateful users are less engaged in COVID-19 discussions prior to their first anti-Asian tweet, they become more vocal and engaged afterwards compared to counterhate users. We find that bots comprise 10.4% of hateful users and are more vocal and hateful compared to non-bot users. Comparing bot accounts, we show that hateful bots are more successful in attracting followers compared to counterhate bots. Analysis of the social network reveals that hateful and counterhate users interact and engage extensively with one another, instead of living in isolated, polarized communities. Furthermore, we find that hate is contagious and that nodes are highly likely to become hateful after being exposed to hateful content. Importantly, our analysis reveals that counterhate messages can discourage users from turning hateful in the first place. Overall, this work presents a comprehensive overview of anti-Asian hate and counterhate content during a pandemic. The COVID-HATE dataset is available at http://claws.cc.gatech.edu/covid.

The global outbreak of coronavirus disease 2019 (COVID-19) has caused widespread disruption in the personal, social, and economic lives of people. The upheaval has resulted in increased levels of fear, anxiety, and outbursts of strong emotions (Ahorsu et al. 2020; Zandifar and Badrfam 2020; Miller 2020; Montemurro 2020). This has led to hateful incidents throughout the world, such as acts of microaggression, physical and verbal abuse, and online harassment (Arnold 2020). Following the origin of COVID-19 in China, these incidents have increasingly been targeted towards Chinese and broader Asian communities (Joubin 2020; Coates 2020), resulting in 1,497 racially-motivated hateful incidents in less than a month, according to a recent survey conducted by the A3PCON and CAA (Jeung and Nham 2020). Furthermore, the FBI has warned of a potential surge in anti-Asian hate crimes motivated by COVID-19 (Margolin 2020). While there is mounting evidence of offline discriminatory acts, racism, and xenophobia during COVID-19, the extent of such overtly hateful content on the web and social media is not known. At the same time, while efforts to educate about, curb, and counter hate have been made via social media campaigns (e.g., the #RacismIsAVirus campaign), their success, effectiveness, and reach remain unclear.
Moreover, online hate speech has a severe negative impact on its victims, often deteriorating their mental health and causing anxiety (Saha, Chandrasekharan, and De Choudhury 2019). Thus, it is critical to study the prevalence and impact of online hate and counterhate speech in the COVID-19 discourse. Most existing research on online hate speech and harassment does not focus on anti-Asian hate and is not situated in the context of a pandemic (ElSherief et al. 2018a; Liu et al. 2018; Mathew et al. 2019a). A few very recent and concurrent works have released datasets on COVID-19-related hate and xenophobic writing against Asians (Vidgen et al. 2020; Schild et al. 2020). The dataset by Schild et al. (2020) consists of multi-platform hate, but does not involve counterhate. Vidgen et al. (2020) include extensive hand labels for hate and counterhate tweets, but they do not include any analysis of the hate ecosystem and its evolution. Here we bridge these gaps.

Our contributions. In this paper, we present COVID-HATE, the largest dataset of anti-Asian hate and counterhate speech on Twitter in the context of the COVID-19 pandemic, along with a long-term longitudinal study. We make the following key contributions:
• We create a dataset of COVID-19-related tweets, containing over 30 million tweets made between January 15 and April 17, 2020. We also crawl the social network of users, containing over 87 million nodes and 717 million edges. A hate network subgraph is shown in Figure 1.
• We annotate 2,400 tweets based on their hatefulness towards Asians as hate, counterhate, or neutral tweets. We build a highly accurate text classifier to automatically identify hate and counterhate tweets. This classifier identifies 891,204 hate and 200,198 counterhate tweets.
• We conduct statistical, linguistic, geographic, and network analysis of hate and counterhate tweets and users to reveal characteristic patterns of the origin, evolution, and spread of anti-Asian hate throughout the world. We analyze the role of bots in this ecosystem.
• We perform a social contagion study to show that anti-Asian hate begets hate, and that counterhate is slightly effective in preventing hate speech.
The COVID-HATE dataset is available on the project website: http://claws.cc.gatech.edu/covid.

In this section, we describe COVID-HATE, a Twitter dataset containing COVID-19-related anti-Asian hate and counterhate tweets and the associated network. Table 1 reports the statistics of our dataset. We adopted a keyword-based approach to collect relevant COVID-19 tweets through Twitter's official APIs. The complete list of keywords is provided in the Appendix. We used a collection of 42 keywords and hashtags belonging to three sets: (a) covid-19 keywords: popular terms related to COVID-19 (e.g., #coronavirus); (b) hate keywords: hateful terms and hashtags targeting Asians in the context of COVID-19 (e.g., #chinavirus); these hashtags were selected based on their popularity and tweet volume, and the list also includes Asian slurs obtained from HateBase; and finally, (c) counterhate keywords: hashtags that were popularly used to organize efforts to counter hate speech (e.g., #RacismIsAVirus). Using these keywords, we collected 30,929,269 English-language tweets made by 7,833,194 users between January 15, 2020 and April 17, 2020. We also removed all retweets to focus our analysis only on original content.
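To make the collection criteria concrete, below is a minimal sketch of how tweets could be bucketed by keyword set. The keyword sets shown are illustrative placeholders, since the paper's complete 42-term list appears only in its Appendix.

```python
# Minimal sketch of keyword-based bucketing. The keyword sets below are
# illustrative placeholders; the paper's complete list is in its Appendix.
COVID_KEYWORDS = {"#coronavirus", "#covid19"}        # placeholder examples
HATE_KEYWORDS = {"#chinavirus", "#kungflu"}          # placeholder examples
COUNTERHATE_KEYWORDS = {"#racismisavirus"}           # placeholder example

def keyword_buckets(tweet_text: str) -> set:
    """Return the keyword sets that a tweet matches (case-insensitive)."""
    tokens = set(tweet_text.lower().split())
    buckets = set()
    if tokens & COVID_KEYWORDS:
        buckets.add("covid")
    if tokens & HATE_KEYWORDS:
        buckets.add("hate")
    if tokens & COUNTERHATE_KEYWORDS:
        buckets.add("counterhate")
    return buckets

# Example: keyword_buckets("Stay safe #coronavirus") -> {"covid"}
```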
Twitter Network Construction: In addition to the tweets, we crawled the ego-network (i.e., the followers and followees) of a randomly-sampled subset of users who have made at least one COVID-19 tweet. A total of 489,011 users' neighborhoods were obtained. The resulting network has 87,851,137 users and 717,087,317 edges.

Since there are no ground-truth labels of anti-Asian hate or counterhate for tweets, we hand-label a subset of tweets and create a textual classifier to label the rest. Even though tweets may have explicitly hateful hashtags, categorizing tweets simply based on the presence or absence of these hashtags is insufficient, because hashtags are often added to gain visibility and to promote tweets. Conversely, a tweet can be hateful even without having a hateful hashtag. The same is true for counterhate tweets. Thus, we developed a rigorous labeling process to establish the ground-truth categories of tweets based on the tweet content. We labeled the tweets into the following three broad categories.

Anti-Asian COVID-19 Hate Tweets: We build on previous studies of racial hate from the social and information science literature to define anti-Asian hate. Specifically, Parekh and others (2012) and Fortuna and Nunes (2018) established that hate speech is directed at an individual or group based on "an arbitrary or normatively irrelevant feature," and that it casts the target as an "undesirable presence and a legitimate object of hostility." Building on this, we define anti-Asian COVID-19 hate as antagonistic speech that: (a) has one or more covid-19 keywords, (b) is directed towards an individual or a group of Asian people, or an Asian organization, country, or government, and (c) is abusive, derogatory, or assigns blame for the creation, spread, misrepresentation, or mismanagement of COVID-19. One example of a tweet labeled as anti-Asian hate reads:

"It's the Chinese virus, from China, caused by your disgusting eating habits, your cruelty. Boycott anything Chinese #kungflu #chinaliedpeopledied #covid"

Pro-Asian COVID-19 Counterhate Tweets: Tweets in this category are COVID-19-related (they have at least one of the covid-19 keywords) and explicitly either: (a) identify, criticize, and actively oppose racism, hate speech, or violence towards Asian people, communities, countries, or governments, or (b) support and defend Asian people, communities, countries, or governments. These tweets can either be direct replies to hateful tweets or be stand-alone tweets. An example of a tweet in this category is as follows (censorship ours):

"The virus did inherently come from China but you cant just call it the Chinese virus because thats racist. or KungFlu because 1. Its not a f*****g flu it is a Coronavirus which is a type of virus. And 2. Thats also racist."

Hate-Neutral Tweets: These are tweets that are neither explicitly nor implicitly hateful, nor counterhateful, yet contain content related to COVID-19. Many tweets in this category are news-related, advertisements, or outright spam.

Labeling process: Two authors labeled 2,400 tweets in total, in multiple rounds. The tweets were randomly sampled from the set of tweets collected in the data collection process. Since the majority of tweets were expected to be neutral, we over-sampled tweets that contain Asian, hate, and counterhate terms. This ensured our labeling process yielded sufficient hate and counterhate tweets to train a classifier. The first round of labeling was done using a preliminary category definition. Two authors, i.e., annotators, independently labeled a set of 100 tweets. After independent labeling, both annotators resolved disagreements and updated the labeling guidelines and tweet category definitions based on the discussions. This resulted in the final definitions presented above. Next, 300 tweets were labeled independently by both annotators. This round led to near-perfect inter-rater agreement, ensuring the applicability and soundness of the labeling criteria. Finally, another 2,000 tweets were labeled. Overall, after removing some duplicates, we were left with 2,319 labeled tweets, containing 678 hateful, 359 counterhate, and 961 neutral tweets. The remaining 321 tweets contained aggression directed towards non-Asian groups. We focus only on hate, counterhate, and neutral tweets in the remainder of this paper.
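The paper does not name the statistic behind "near-perfect inter-rater agreement"; as one standard way to quantify it, Cohen's kappa over the two annotators' labels could be computed as in the minimal sketch below (the label arrays are hypothetical).

```python
# Sketch: quantifying inter-rater agreement with Cohen's kappa. The paper
# does not specify its agreement metric; kappa is one common choice.
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels from the two annotators for the same tweets.
annotator_1 = ["hate", "neutral", "counterhate", "neutral", "hate"]
annotator_2 = ["hate", "neutral", "counterhate", "hate", "hate"]

kappa = cohen_kappa_score(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 indicates perfect agreement
```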
We create a text-based classifier to label tweets as hate, counterhate, or neutral. We use the annotated dataset to train text-based machine learning models with the following three sets of features.

• Linguistic Features [Ling]. This set contains a total of 90 features spanning stylistic, syntactic, and psycholinguistic categories, together representing the linguistic properties of the text. These features have previously been very effective in identifying hate speech, cyberbullying, and online deception (Fortuna and Nunes 2018; Salawu, He, and Lumsden 2017; Schmidt and Wiegand 2017). We incorporate 20 stylistic features, such as the number of characters and words, the fraction of unique words and stopwords, and the distribution of numeric characters, uppercase characters, and punctuation in the tweet. In addition, we use the tweet sentiment (Hutto and Gilbert 2014). Syntactic features include the number of hashtags, URLs, and mentions in the tweet. Finally, we include the distribution of the words across the 65 LIWC categories, which accounts for the psycholinguistic properties of the text (Tausczik and Pennebaker 2010).

• Hashtags [Hash]. This category counts the number of occurrences of each hashtag and keyword presented in the Tweet Dataset section. Although not absolutely reliable, these features can be strong indicators of the final category of the tweet. For example, a tweet containing '#RacismIsAVirus' is more likely to be a counterhate tweet than a hate tweet.

• Tweet Embeddings [Emb]. The above two feature sets apply a bag-of-words-style approach and ignore semantic meaning. Thus, to incorporate word- and sentence-level semantics, we embed each tweet using two popular text-embedding models: GloVe (Pennington, Socher, and Manning 2014) and BERT (Devlin et al. 2018). We used the GloVe model pre-trained on the 27B-token Twitter dataset to generate 200-dimensional word embeddings, and we took the average across all content words in the tweet. For BERT, we first removed all Twitter- and web-specific content such as URLs, usernames, hashtags, and emojis, and then used the hidden-layer representation of the tweet to generate 768-dimensional tweet BERT embeddings.

Model creation and performance. Given the three-class classification task (hate vs. counterhate vs. neutral), we trained three separate one-vs-all Logistic Regression classifiers. Each classifier is trained with all three categories of features. We conducted five-fold cross-validation on the hand-annotated dataset. Performance of the models was measured using precision, recall, F1 scores, and AUROC, as shown in Table 2. Standard deviations are measured across the five classification folds.
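A minimal sketch of this training setup is shown below, assuming a precomputed feature matrix X (e.g., BERT embeddings concatenated with hashtag counts and linguistic features) and a label array y; the variable and function names are ours, not the paper's.

```python
# Sketch: one-vs-all logistic regression with five-fold cross-validation,
# assuming X is a NumPy feature matrix (n_tweets x d) and y is a NumPy array
# of string labels in {"hate", "counterhate", "neutral"}.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def one_vs_all_auroc(X, y, target, n_folds=5, seed=0):
    """Train a one-vs-all classifier for `target`; return mean/std AUROC."""
    y_bin = (y == target).astype(int)      # 1 for the target class, 0 otherwise
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y_bin):
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X[train_idx], y_bin[train_idx])
        probs = clf.predict_proba(X[test_idx])[:, 1]
        scores.append(roc_auc_score(y_bin[test_idx], probs))
    return np.mean(scores), np.std(scores)

# Usage (with real X, y):
# for target in ("hate", "counterhate", "neutral"):
#     mean_auc, std_auc = one_vs_all_auroc(X, y, target)
```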
The performance of other machine learning models, such as Random Forest classifiers and SVMs, was similar to that of Logistic Regression. We do not train deep learning models, as the training data points are limited and these models would tend to overfit. In the rest of this section, we discuss the performance of the Logistic Regression classifier. First, we evaluate a model trained with GloVe embeddings against one trained with BERT embeddings, in terms of the AUROC score. The BERT model outperforms the GloVe model in two out of three tasks and performs nearly the same in the third task. Thus, we use BERT embeddings in our further experiments.

The model performances on the three tasks are compared in Table 2, which shows an ablation study over the three feature sets. First, we note that among models trained with a single category of features, embedding features perform better than linguistic and hashtag features. This is true for all three one-vs-all classification tasks. Next, we observe that adding hashtag features to either of the other two feature sets increases performance slightly in all cases, and of these two combinations, embeddings plus hashtags performs better in all cases. Finally, we note that the combination of all features leads to performance that is similar, within error bounds, to the embedding-plus-hashtag model. We use the simpler of the two models (as it has less chance of overfitting) to label the rest of the tweets in our dataset. In our entire dataset of 30,963,337 tweets, the models labeled 891,204 tweets as hateful (2.88%), 200,198 tweets as counterhate (0.65%), and 26,837,429 tweets as neutral (86.77%). The remaining tweets triggered more than one of the one-vs-all classifiers and thus fell into more than one category; these were marked as 'other' and are not used in the rest of the paper. This makes the COVID-HATE dataset the largest COVID-19-related anti-Asian hate and counterhate dataset on Twitter.

In this section, we use the COVID-HATE dataset to analyze the patterns of hate and counterhate in the Twitter ecosystem. We focus our analysis on the evolution and spread of hate and counterhate and the characteristics of the users involved. Here we consider the longitudinal and geographical spread of hate and counterhate tweets in the Twitter ecosystem.

Hate tweets are more frequent than counterhate tweets. Figure 2 shows the daily distribution of hate and counterhate tweets. First, we note that hate tweets outnumber counterhate tweets throughout the timeline. Next, the number of hate and counterhate tweets was negligible to low in January and February. We observe a synchronized spike in both time series between March 16 and March 19. By the week following the surge, the hate volume per day settled at a value 134.0% greater than that before the peak, whereas counterhate volume returned to a steady state that was only 15.6% greater than in the week before the peak. This suggests that hate lingers longer than counterhate.

Nationally-relevant activity sparks countrywide hate. We dig deeper by further breaking down the hate trend by geographic regions. We used the OpenStreetMaps API to locate every tweet (Haklay and Weber 2008). When tweet location was unavailable, the user's profile location was used. A total of 37.3% of tweets were located and used in this analysis.
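The paper names the OpenStreetMaps API without specifying a client; below is a minimal sketch of one way to resolve free-text locations using the geopy package's Nominatim (OpenStreetMap) geocoder. The user_agent string is a placeholder, and the rate limiter reflects Nominatim's usage policy rather than anything the paper specifies.

```python
# Sketch: resolving free-text tweet/profile locations with the OpenStreetMap
# Nominatim geocoder via geopy. The user_agent value is a placeholder.
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

geolocator = Nominatim(user_agent="covid-hate-analysis")           # placeholder
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)     # be polite

def locate(raw_location: str):
    """Return (latitude, longitude, resolved address) or None."""
    if not raw_location:
        return None
    place = geocode(raw_location)
    if place is None:
        return None
    return place.latitude, place.longitude, place.address

# Example: locate("Atlanta, GA") -> (33.74..., -84.38..., "Atlanta, ...")
```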
Figure 3 shows the trend for the five countries with the largest number of hate tweets: USA (which generated 21.0% of all hate tweets), India (6.1%), China (4.0%), UK (3.1%), and Canada (2.1%). Anecdotally, we see that local events lead to spikes in counts of hate tweets in a country: the spike in the USA followed closely after President Trump's "Chinese Virus" tweet, and the spike in India happened after countrywide shelter-in-place orders were enforced. Furthermore, the rise of COVID-19 cases in a country does not affect that country's hatefulness, as measured by the Spearman's correlation coefficient between the number of countrywide COVID-19 cases and the number of hate tweets in the country. The values are very low, at 1.1 × 10⁻⁹ and 6.5 × 10⁻¹³ for the USA and India, respectively.

Now we turn to analyzing the properties of users who produce hate and counterhate tweets. Following the tweet categorization labels, we divide users, based on their tweets, into one of the following groups: hate, counterhate, neutral, or dual. Hate and counterhate users make at least one tweet from their respective tweet categories and none from the other. Users who tweet from both categories are dual. Finally, users who make at least one COVID-19 tweet, but no hate or counterhate tweets, are labeled as neutral. Among the 7,841,130 users who tweet about COVID-19, most are neutral. Figure 4 shows the distribution of the number of hate tweets (respectively, counterhate tweets) made by hate users (respectively, counterhate users). We observe that both distributions exhibit a long tail, showing that most users make few relevant tweets and only a handful of users are responsible for spreading hate propaganda and counterhate messages.

Hate users are less engaged in COVID-19 discussions prior to activation. When a user sends her first hate or counterhate tweet, she is said to be 'activated.' Recall that hate users never make a counterhate tweet and, similarly, counterhate users never make a hate tweet. Here we evaluate the pre-activation behavior of hate users and counterhate users. First, we find that hateful users are slightly less active before activation, with an average of 6.23 COVID-related tweets, compared with the average of 6.53 COVID-related tweets made by counterhate users (p < 0.001). Next, we compare the sentiment and psycholinguistic properties of hate users and counterhate users prior to activation. On average, hate users write shorter tweets (90.85 vs. 99.31 characters; p < 0.001), while using more URLs (0.636 vs. 0.546; p < 0.001) and tagging others more often (0.789 vs. 0.767; p < 0.001). Additionally, their overall sentiment scores are more neutral (0.772 vs. 0.756; p < 0.001). Altogether, this shows that hate users are less active and more neutral in the COVID-19 discussion compared to counterhate users, prior to activation.

After activation, hate users become more active and hateful than counterhate users. Here we contrast the post-activation behavior with the pre-activation behavior of users. We find that after activation, hateful users become more active in the COVID-19 discussion compared to counterhate users (16.05 vs. 7.66 COVID-related tweets per user; p < 0.001). Moreover, hateful users make 2.26 hateful tweets on average, compared to only 1.32 counterhate tweets per counterhate user (p < 0.001). Overall, our analysis shows that while hateful users are less engaged in COVID-related discussions prior to activation, they become far more vocal, engaged, and hateful once they start participating in the discussion.
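The paper reports p-values for these group comparisons without naming the test it uses; purely as an illustration, a nonparametric Mann-Whitney U test on per-user tweet counts could be run as below (the arrays are hypothetical).

```python
# Sketch: comparing per-user activity between two user groups. The paper does
# not name its significance test; Mann-Whitney U is one standard
# nonparametric choice for skewed, long-tailed count data.
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical per-user counts of COVID-related tweets before activation.
hate_user_counts = np.array([3, 7, 5, 6, 9, 4])
counterhate_user_counts = np.array([5, 8, 6, 7, 10, 6])

stat, p_value = mannwhitneyu(hate_user_counts, counterhate_user_counts,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```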
In this section, we examine the user-user social connectivity in the hate ecosystem. As described in the dataset section, we crawl the social network, containing over 43 million nodes and 372 million edges. Out of these, 708,166 nodes have made at least one COVID-19-related tweet. The rest of the nodes are part of the network because they are neighbors of these nodes. Figure 1 shows a subgraph of this network. To understand the differences in how hateful and counterhate users behave, we compare their ego-networks. We find that, on average, hate users are better connected than counterhate users: hate users follow more users than counterhate users do (870.65 vs. 785.41; p < 0.001) and are followed more (789.17 vs. 697.37; p < 0.001).

Intragroup and intergroup connectivity. Next, we analyze the connectivity of users within and across the different groups to establish whether nodes express homophily or form echo chambers. Simply comparing their probability of creating edges to nodes of a certain group is not sufficient, as it is confounded by the node degrees and the node distribution across categories. Thus, we create a network baseline to model the expected behavior of nodes and compare the observed behavior against this baseline (Leskovec, Huttenlocher, and Kleinberg 2010). The baseline networks are created by randomly shuffling the edges while keeping the set of nodes the same. The node degrees are preserved, and each node is ensured to have the same number of COVID-19 neighbors as it did in the original network, though the neighbors change during shuffling. We compute the aggregate ego-network statistics across several versions of the baseline networks (100 in our experiments). We compare the observed and the baseline behavior using the probability of connecting to hate, counterhate, and neutral nodes. Figure 5 presents the results.
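A minimal sketch of such a shuffled baseline is below, using networkx and simplifying the follow graph to an undirected graph. Note that the paper's baseline additionally preserves each node's number of COVID-19 neighbors, which this simplified version does not enforce; the function names are ours.

```python
# Sketch: degree-preserving shuffled baseline for homophily tests, assuming
# an undirected simplification of the follow graph. `labels` maps each node
# to "hate", "counterhate", or "neutral".
import networkx as nx

def group_link_prob(G, labels, src, dst):
    """Fraction of edge endpoints in `src` whose partner is in `dst`."""
    hits = total = 0
    for u, v in G.edges():
        for a, b in ((u, v), (v, u)):   # treat the undirected edge both ways
            if labels.get(a) == src:
                total += 1
                hits += labels.get(b) == dst
    return hits / max(total, 1)

def shuffled_ratio(G, labels, src, dst, n_baselines=100, seed=0):
    """Observed link probability divided by the mean over shuffled baselines."""
    observed = group_link_prob(G, labels, src, dst)
    baseline = 0.0
    for i in range(n_baselines):
        H = G.copy()
        # Degree-preserving rewiring: each swap exchanges endpoints of
        # two randomly chosen edges.
        nx.double_edge_swap(H, nswap=2 * H.number_of_edges(),
                            max_tries=50 * H.number_of_edges(), seed=seed + i)
        baseline += group_link_prob(H, labels, src, dst)
    return observed / (baseline / n_baselines)

# A ratio well above 1 (e.g., ~3.7x for hate-to-hate edges in the paper)
# indicates connections beyond what degree structure alone explains.
```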
Nodes exhibit homophily. First, we examine the propensity for hate and counterhate nodes to connect with nodes within their own group. In Figure 5 (top), we show that counterhate users are 3.1× more likely to connect to other counterhate users compared to the baseline behavior. Similarly, the bottom figure shows that hateful users connect with other hateful users 3.7× more than expected. Thus, nodes are preferentially connected to other nodes in the same group.

Do hateful and counterhate users form polarized communities? Echo chambers and polarization are commonly observed phenomena in social media and are responsible for the spread of propaganda and misinformation (Garimella et al. 2018; Del Vicario et al. 2016; Quattrociocchi, Scala, and Sunstein 2016). However, it is not known whether echo chambers exist in the hate network too. Given that nodes preferentially connect to similar nodes, four scenarios are possible: (1) hate and counterhate users live in isolated echo chambers, where the groups do not interact with one another; (2) at the other extreme, the two groups interact heavily with each other, possibly exhibiting conflict; or, in the remaining two cases, (3) and (4), the out-group connections are one-sided. Figure 5 illustrates the empirically-observed behavior. Both hate and counterhate nodes are more likely to connect with one another than expected. Precisely, hateful users follow counterhate users 2.3× more than expected, and counterhate users are 2.2× more likely to follow hateful users compared to the baseline. Furthermore, both hateful and counterhate users are, on average, 18.2% less likely than expected to follow neutral users. Altogether, these results indicate that hateful and counterhate users are highly engaged and closely interact with each other. Future work includes linguistic analysis of the direct replies between the two groups, which will reveal whether their interactions are pleasant and respectful or whether they engage in altercations and conflict.

Here we explore whether bots are responsible for spreading and countering anti-Asian hate and propaganda. We used the Botometer API (Davis et al. 2016) to assign a bot score to each user. Due to API rate limitations, we restrict our analysis to 138,706 users sampled randomly from the set of hate and counterhate users. Following Shao et al. (2018), we use a bot-score threshold of 0.5, labeling 8.9% of all users as bots. Digging deeper, we find that 10.4% of hate users and 9.7% of counterhate users are bots. Moreover, hate users are more likely to be bots: the mean bot score of hate users is 0.195, compared to 0.177 for counterhate users (p < 0.001). In the following, we contrast the behavior of hateful bots with that of hateful non-bot users.

How does bot activity compare to that of non-bots? As expected, hateful bots are highly active broadcasters of COVID-19 content (hate, counterhate, and neutral alike). They post 1.8× more COVID-19-related content than hateful non-bot users (46.6 vs. 26.5 tweets; p < 0.01). Hateful bots also share 1.2× more hateful content than non-bot users (3.35 vs. 2.79 hateful tweets; p < 0.01).

Do bots connect differently from real users? Surprisingly, hateful bots have fewer followers than hateful real users (667.31 vs. 797.87; p < 0.001). Despite this, hateful users make up a similar proportion of all followers of hateful bots and of hateful non-bots (8.6% vs. 8.4%; p = 0.46). This shows that hateful bots are able to attract real users as followers as effectively as real hateful users. On the other hand, counterhate bots are not successful in attracting a relevant audience. They have fewer followers than hateful bots (300.60 vs. 667.31; p < 0.001), and even real counterhate users follow hateful bots 4.32× more often than counterhate bots. This illustrates that counterhate bots are unable to attract the relevant users who could amplify counterhate narratives and together reduce hate speech on the platform.
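A sketch of bot scoring with the botometer Python client is below. The credentials are placeholders, and the response layout varies across Botometer API versions, so the score field shown is indicative rather than a statement of what the paper used.

```python
# Sketch: scoring an account with the Botometer API via the botometer
# package. All keys are placeholders; the response field layout differs
# across Botometer API versions, so treat the field access as indicative.
import botometer

twitter_app_auth = {
    "consumer_key": "...",
    "consumer_secret": "...",
    "access_token": "...",
    "access_token_secret": "...",
}
bom = botometer.Botometer(wait_on_ratelimit=True,
                          rapidapi_key="...",
                          **twitter_app_auth)

result = bom.check_account("@example_user")   # hypothetical account
# 'cap' is the complete automation probability; a 0.5 threshold for labeling
# bots follows the Shao et al. (2018) convention the paper adopts.
is_bot = result["cap"]["english"] >= 0.5
```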
Antisocial behavior, such as hate speech, abuse, and trolling, has been shown to exhibit social contagion (Mathew et al. 2019a; Burnap and Williams 2015). However, whether hate speech spreads as a social contagion during the COVID-19 pandemic remains unknown. Moreover, the effect of the simultaneous presence and spread of counterhate speech on the spread of anti-Asian hate, and vice versa, is not known. Thus, in this section, we investigate the within-group and across-group influence on the diffusion of hate and counterhate speech. We quantify influence as the likelihood of a user becoming hateful (resorting to hate speech for the first time) after being exposed to hate or counterhate tweets from her neighbors. Similarly, we also explore the effect of neighborhood messages on a node's likelihood to start sending counterhate tweets for the first time.

We refer to a user's change of state from neutral to hateful or counterhateful because of neighbors as an infection. We model the dynamics of hate and counterhate infection as an event cascade (Goel et al. 2016; Soni, Ramirez, and Eisenstein 2019). The cascade is a temporally-ordered sequence of events recording the nodes that transition from neutral and whether they become hateful or counterhateful. Each cascade is associated with a function Risk_{s'→s}(n) that quantifies the probability that a user transitions from neutral to category s ∈ {hate, counterhate} after exposure to n neighbors of category s' ∈ {hate, counterhate}. Neighbors are available from the social network. The infection risk function is calculated as:

Risk_{s'→s}(n) = |Infected_s ∩ Exposed_{s'}(n)| / |Exposed_{s'}(n)|,

where Infected_s is the set of users already infected with type s and Exposed_{s'}(n) is the set of users with at least n neighbors of type s'.

The infection risk function is determined not only by users' influence on one another, but also by homophily, the tendency of similar users to cluster in the network. To tease apart these effects, we create a null model that measures the baseline risk of infection due solely to homophily: the order of the cascade events is randomly shuffled and the infection risk is calculated on this random cascade (Anagnostopoulos, Kumar, and Mahdian 2008). The mean baseline infection risk observed over 100 shuffled cascades is compared to the empirically observed infection risk. If the empirical infection risk exceeds the baseline risk, then social contagion is responsible for the spread of the infection (hate or counterhate).

In Figure 6, we compare the empirical infection risk and the mean baseline risk, along with its 95% confidence interval, focusing on the differences between the two curves. Figure 6(a) shows that exposure to hate speech increases the likelihood of adopting hate speech, compared to the baseline. Moreover, the likelihood of adoption increases as the number of exposures increases. On the other hand, Figure 6(b) illustrates that exposure to hate leads to a moderate increase in the likelihood of counterhate infection, though this increase is within the bounds expected from homophily. Next, we look at the impact of exposure to counterhate speech. In Figure 6(c), we see that exposure to counterhate neighbors results in a reduced likelihood of a node turning hateful, compared to the baseline. The difference is small but statistically significant. This suggests that exposure to counterspeech discourages users from turning malicious. Finally, Figure 6(d) shows no difference between the expected and observed rates of counterhate leading to more counterhate. Overall, our analysis in this section shows that hate begets hate and that counterspeech can discourage users from becoming hateful. Counterhate, on the other hand, does not exhibit network contagion.

Figure 6: The probability of a node becoming hateful or counterhateful after exposure to neighbors' tweets, in the observed data and in the baseline case. (a) Exposure to hate tweets influences users to become hateful, several times more than expected, while (c) exposure to counterhate tweets reduces the likelihood of nodes turning hateful. Exposure to neighborhood hate (b) or counterhate (d) does not result in nodes adopting counterhate more than expected.
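A minimal sketch of this computation, under our reading of the setup, is below: events is a time-ordered list of (user, category) activations, neighbors maps users to their network neighbors, exposure counts are frozen once a user activates, and the shuffled baseline follows the Anagnostopoulos, Kumar, and Mahdian (2008)-style test the section cites. All names are ours.

```python
# Sketch: empirical infection risk from an exposure cascade, plus a shuffled
# null model that isolates homophily. `events` is time-ordered; each entry is
# (user, category) for a user's first hate or counterhate tweet.
import random
from collections import defaultdict

def infection_risk(events, neighbors, adopt_cat, expose_cat, max_n=10):
    """Risk(n) = |Infected ∩ Exposed(n)| / |Exposed(n)| for n = 1..max_n."""
    exposures = defaultdict(int)   # exposures received while still neutral
    final_exposure = {}            # exposure count frozen at activation time
    infected, activated = set(), set()
    for user, category in events:
        if user not in activated:
            final_exposure[user] = exposures[user]
            activated.add(user)
            if category == adopt_cat:
                infected.add(user)
        if category == expose_cat:          # this activation exposes neighbors
            for v in neighbors.get(user, ()):
                if v not in activated:
                    exposures[v] += 1
    for v, n in exposures.items():          # never-activated users still count
        final_exposure.setdefault(v, n)     # toward the exposed denominator
    risk = {}
    for n in range(1, max_n + 1):
        exposed = [u for u, e in final_exposure.items() if e >= n]
        if exposed:
            risk[n] = sum(u in infected for u in exposed) / len(exposed)
    return risk

def baseline_risk(events, neighbors, adopt_cat, expose_cat, trials=100, seed=0):
    """Homophily-only null model: shuffle event order, average the risk."""
    rng = random.Random(seed)
    curves = defaultdict(list)
    for _ in range(trials):
        shuffled = list(events)
        rng.shuffle(shuffled)
        for n, r in infection_risk(shuffled, neighbors,
                                   adopt_cat, expose_cat).items():
            curves[n].append(r)
    return {n: sum(v) / len(v) for n, v in curves.items()}
```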
Here we discuss the works that we have not covered so far.

Hate speech on social media. Detecting hate speech has been shown to be a challenging task, even for humans, often resulting in low inter-rater agreement (Ross et al. 2017; Waseem and Hovy 2016). Automatic hate speech detection methods are primarily text-based, often built on n-grams (Xu et al. 2012; Hosseinmardi et al. 2016; Mehdad and Tetreault 2016; Van Hee et al. 2018), word embeddings (Zhao, Zhou, and Mao 2016; Djuric et al. 2015), and other features engineered from hate dictionaries (ElSherief et al. 2018a; Liu et al. 2018). More recent methods use deep neural architectures (Chen, McKeever, and Delany 2019; Badjatiya et al. 2017), and BERT has been shown to outperform common models without overfitting (Nikolov and Radivchev 2019). Studies have also focused on the targets of hate (ElSherief et al. 2018b; Mondal, Silva, and Benevenuto 2017). Though most online hate speech research has focused on Twitter and Reddit, recent research has also spanned multiple platforms and fringe communities (Phadke and Mitra 2020; Mariconti et al. 2018; Mathew et al. 2019a; Finkelstein et al. 2018; Mittos et al. 2019) and groups (Velásquez et al. 2020; Phadke and Mitra 2020; Phadke et al. 2018). However, none of the above-mentioned studies are placed in the context of COVID-19, which is the gap we address.

COVID-19 and social media research. The COVID-19 outbreak has motivated a growing collection of datasets on social media discourse surrounding COVID-19 (Chen, Lerman, and Ferrara 2020; Schild et al. 2020; Vidgen et al. 2020). The discourse also contains a substantial amount of misinformation (Kouzy et al. 2020; Singh et al. 2020) and conspiratorial content (Ferrara 2020) that is being shared and actively spread by social bots (Ferrara 2020; Gallotti et al. 2020), but these works do not focus on studying hate. Few studies to date have specifically addressed the spread of anti-Asian hate on social media. Schild et al. (2020) proposed a multi-platform dataset of hate, but did not address counterhate speech, which is simultaneously prevalent on the platform. Our study analyzes both hate and counterhate in a unified framework. Moreover, we create a hand-labeled dataset and classifier for hate and counterhate detection, which goes beyond the keyword-style approach adopted by Schild et al. (2020). Recently, Chen et al. (2020) collected tweets containing controversial hashtags like #ChineseVirus. However, as we have shown, the presence of hashtags is insufficient to label a tweet as hate. We overcome this shortcoming by developing a hand-labeled dataset and classifier. Finally, contemporaneous work by Vidgen et al. (2020) released a large hand-labeled dataset of hate and counterhate speech. However, they do not conduct any analysis of the hate ecosystem, which we present in this work, in addition to a complementary hand-labeled dataset. Due to the recency of Vidgen et al. (2020), we leave a comparison between the two frameworks for future work.

Counterhate speech. Counterhate speech has qualitatively been shown to be the most effective and the least intrusive solution to hate speech, though quantitative studies are limited (Bartlett and Krasodomski-Jones 2015; Gagliardone et al. 2015; Benesch 2014; Briggs and Feve 2013; Wright et al. 2017; Miškolci, Kováčová, and Rigová 2020; Munger 2017). More recent work has focused on developing novel counterspeech datasets (Chung et al. 2019; Mathew et al. 2018) and detection classifiers (Mathew et al. 2019b). Counterhate chatbots have shown initial success (Hoekstra et al. 2019).
However, none of the previous works have studied counterhate in the context of COVID-19, which is the gap that we address in this work.

Our findings shed light on an ongoing societal problem caused by the COVID-19 pandemic. Importantly, by studying online hate speech on Twitter, we have shown that online hate is more prevalent than counterhate. We show that hate begets hate, making it crucial to detect hate speech as early as possible and to reduce its exposure to others. The current efforts to counter hate speech are limited, exposing the need to create effective, scalable countermeasures.

Our work has limitations. Our analysis is limited to English tweets and one platform (Twitter). It will be useful to extend the analysis to multiple languages and to span the multiple platforms where COVID-19 discussion happens. Next, augmenting our dataset with the recently released hand-labeled dataset of COVID-19-related hateful and counterhate tweets can bolster the experiments (Vidgen et al. 2020), though the findings are not expected to change. Moreover, our labeling scheme is coarse-grained (hate vs. counterhate vs. neutral), which limits the ability to study nuanced forms of hate and counterspeech (Salminen et al. 2018). While this work is limited to anti-Asian hate speech, it will be worth studying hate and harassment towards other minority groups during the pandemic.

This work lays the groundwork for many important directions of future research. First, while we focused on text-based hate speech, hate is also spread using images, memes, and videos. One open research direction is to understand multimodal hate speech. Next, the interrelation between the hate ecosystem and the misinformation ecosystem is important to study. Finally, curbing the spread of hate speech is crucial, and more research is needed to develop (semi-)automated ways to counter hate speech.

References

Spike in burglaries, assaults and domestic violence seen in Houston-area during coronavirus pandemic (Click2Houston).
The use of deep learning distributed representations in the identification of abusive text.
Botornot: A system to evaluate social bots.
BERT: Pre-training of deep bidirectional transformers for language understanding.
Countering online hate speech.
Assessing the risks of "infodemics" in response to COVID-19 epidemics.
The social dynamics of language change in online networks.
Prediction of cyberbullying incidents in a media-based social network.
VADER: A parsimonious rule-based model for sentiment analysis of social media text.
An army of me: Sockpuppets in online discussion communities.
Signed networks in social media.
Forecasting the presence and intensity of hostility on Instagram using linguistic and social features.
"You know what to do": Proactive detection of YouTube videos targeted by coordinated hate attacks.
"And we will fight for our race!": A measurement study of genetic testing conversations on Reddit and 4chan.
Tweetment effects on the tweeted: Experimentally reducing racist harassment.
Is there a case for banning hate speech?
The content and context of hate speech: Rethinking regulation and responses, 37-56.
Anatomy of online hate: Developing a taxonomy and machine learning models for identifying and classifying hate in online news media.
The spread of low-credibility content by social bots.
Analyzing the targets of hate in online social media.
A first look at COVID-19 information and misinformation sharing on Twitter.
Detecting social influence in event cascades by comparing discriminative rankers.
The psychological meaning of words: LIWC and computerized text analysis methods.
Hate multiverse spreads malicious COVID-19 content online beyond individual platform control.
Learning from bullying traces in social media.
Automatic detection of cyberbullying on social networks based on bullying features.

Table 2: Performance of the one-vs-all classifiers. Columns report precision, recall, F1, and AUROC (mean ± standard deviation across the five folds).

Hate tweet detection
Ling       60.8 ± 2.4%   39.9 ± 3.6%   48.1 ± 3.5%   0.760 ± 0.012
Hash       71.0 ± 3.4%   40.4 ± 5.1%   51.4 ± 5.0%   0.798 ± 0.031
Emb        71.1 ± 3.3%   60.7 ± 3.8%   65.5 ± 3.2%   0.864 ± 0.017
Ling+Hash  69.4 ± 0.5%   57.2 ± 3.9%   62.7 ± 2.5%   0.851 ± 0.017
Emb+Hash   72.3 ± 2.8%   63.8 ± 2.6%   67.8 ± 2.4%   0.876 ± 0.019
All        68.9 ± 3.1%   64.4 ± 4.8%   66.5 ± 3.7%   0.867 ± 0.017

Counterhate tweet detection
Ling       44.6 ± 7.3%   13.1 ± 3.7%   20.1 ± 5.0%   0.728 ± 0.033
Hash       52.0 ± 21.6%  3.6 ± 1.1%    6.7 ± 2.1%    0.793 ± 0.016
Emb        57.6 ± 9.0%   38.9 ± 6.0%   46.0 ± 5.7%   0.833 ± 0.019
Ling+Hash  54.2 ± 8.9%   29.7 ± 2.4%   38.2 ± 3.4%   0.807 ± 0.032
Emb+Hash   58.0 ± 7.1%   42.8 ± 6.3%   49.0 ± 5.9%   0.852 ± 0.019
All        52.8 ± 4.8%   41.

Below is the list of keywords and hashtags used to collect tweets.