Anger makes fake news viral online
Yuwei Chuai, Jichang Zhao
2020-04-22

Fake news that manipulates political elections, strikes financial systems and even incites riots is more viral than real news online, resulting in unstable societies and buffeted democracies. The easier contagion of fake news online can be causally explained by the greater anger it carries. Offline questionnaires reveal that anger leads to more incentivized audiences in terms of anxiety management and information sharing and accordingly makes fake news more contagious than real news online. Our results suggest that the digital contagion of emotions, in particular anger, should be comprehensively considered in profiling the spread of online information. Cures such as tagging anger in social media could be implemented to slow or prevent the contagion of fake news at the source.

The impact of fake news on social media can be disproportionate (3) and profound (4), especially in the political (2, 4-7) and economic fields (8). In the first few months of the 2016 U.S. presidential election, on average, each adult was exposed to more than one fake news item that was not only widely spread but also deliberately biased (6). Furthermore, fake news is more likely to appear in the highly uncertain conditions of emergencies, such as disease epidemics and outbreaks (9, 10) and accidents and conflicts (11), which makes the spread of fake news a byproduct of the natural response that people have to disastrous events, and social media can be fertile ground for this spread online (43). Fake news is more viral than real (true) news online (2). The mechanism underlying its fast spread, though critical, remains unresolved.
Unique structural features in the circulation of fake news, such as long diameters of penetration, have been revealed and found to be platform independent (13-16). However, fake news is generally verified to be false only after explosive circulation (17); thus, in the early spread, it is essentially not known to be fake, so the structural uniqueness is a manifestation of its fast spread rather than a cause that can fundamentally explain its viral proliferation. Individuals, either humans or bots (18), posting and reposting fake news on social media are an alternative cause, in particular humans, who occupy the dominant portion (19). The spread of news is associated with the friends and followers of the author. Nevertheless, user characteristics fail to sufficiently explain the easy contagion of fake news because they exert even greater effects on the dissemination of real news (2). The content of fake news, which has also been found to be entangled with spread (2, 20), could offer promising directions in probing the mechanism of its fast spread. More importantly, instead of examining spreading structures (2, 13) and reposter demographics (21) after circulation has been ignited, revealing the mechanism at the source, independent of user demographics, would be powerful in inspiring new cures with minimal invasion of privacy. Hence, we would rather differentiate fake news from real news at the very beginning of their spread through scrutiny of content, to figure out new treatments against fake news that can be implemented without delay. Online news content not only delivers factual information but also carries sophisticated emotional signals. Being embedded in information spread, the digital contagion of emotions, in which individuals experience the same feelings on social media, is similar to face-to-face emotion exchange offline (22, 23).
Emotions further impact the spread of information, e.g., by promoting sharing (24) or shaping diffusion paths (25). When the relevance between content quality and popularity is not strong (26), the emotions involved and their influence on psychological arousal may be key (2, 27-29). Moreover, the spread of different emotions can inherently be distinguished (29), implying that emotions conveyed by both fake and real news could offer comparative proxy measurements to examine the mechanisms underlying their circulation. In fact, fake news is preferentially injected with emotions such as anger for political attacks (30). However, extant efforts rarely differentiate fake news from real news based on the emotions delivered in the content and the incentives behind reposting. The discrepancy in user perceptions between fake news and real news was unraveled through the emotions of replies (2), while the emotions inherently carried by the news itself were not considered in explaining the circulation. In fact, it has been found that negative emotions in content might cause positive responses (e.g., sympathy) (31), meaning that emotions, in particular negative ones, should be directly examined in the spread of fake news. At the same time, although content on social media can be short, simplifying the emotions it carries into a single emotion might miss the emotional richness (2, 32, 33) and lead to failures of emotion recognition and inconsistent results (23, 28, 29, 34). In this study, by combining digital traces on social media and offline questionnaires, we aim to unravel the mechanism underlying the fast spread of fake news by answering three key questions: What are the differences in the emotional distributions of real and fake news? Can these differences explain why fake news is more infectious than real news? How do they affect the incentives behind news reposting?
We collected a large dataset of both fake news and real news from Weibo, the most popular Twitter-like service in China, which includes 10,000 true news items posted by credibly verified users and 22,479 fake news items identified by an official committee of Weibo after wide dissemination (see SM S1 for more details). On the basis of the number of followers, representing the broadcasting potential of authors, and the number of retweets, representing the spreading capability of news (16), we assemble both categories of news into treatment and control groups. For example, taking fake news with low numbers of followers and high numbers of retweets (LHF news) as the treatment group, the controlled counterparts consist of either fake news with high numbers of followers and low numbers of retweets (HLF news) or true news with high numbers of followers and low numbers of retweets (HLT news) (see SM S2 for details). By intentionally selecting news that is weakly retweeted but posted by highly followed authors, the possible effects from users can be controlled to amplify the spread promotion resulting from the particular emotional content the news carries. Moreover, although fake news is statistically more contagious than real news (longer paths, higher speed, longer duration, and more retweets; see SM S2.3 and S3), not every fake news item is necessarily more viral than any real news item. For instance, the diffusion capability of highly retweeted true news is clearly more powerful than that of lowly circulated fake news. Therefore, we first compare LHF news with HLF news and HLT news and then extend the comparison to the full spectrum of discrepancies between true (T) news and fake (F) news in terms of emotions. Emotional signals carried in either fake or real news can be sophisticated, i.e., a combination of elementary compounds rather than a single one (33).
The distribution of five emotions that represent basic human feelings (2, 35, 36), namely, anger, disgust, joy, sadness and fear, is inferred for each news item in our data through a lexicon that is manually labeled and covers 87.1% of news items, with the remainder considered neutral (see SM S4). Emotions with a strong presence in the distribution are the feelings that the sender of the news wishes the receivers to experience (37). The proportion of anger (Fig. 1A) in LHF news is expected to be significantly higher than that in both HLF and HLT news, while joy is expected to be lower (Fig. 1E). The comparison is then extended to a full spectrum between all fake news and real news, and consistent results, though with shrinking gaps for anger and joy, as expected, are obtained (Figs. 1B, F). Furthermore, the dominance of anger in fake news (especially highly retweeted news) and joy in real news (even lowly retweeted news) is further confirmed with better resolution in the distribution of emotions of keywords that precisely separate the treatment groups from the control groups (see SM S6 and Fig. S10). These observations persistently suggest that fake news carries more anger yet less joy than real news and imply the possibility that anger might promote the fast spread of fake news online. The divergence in anger and joy between fake news and real news is robust and independent of emotion inference models and emotion distribution measures (see SM S7). Even in specific events such as COVID-19, the dominance of anger over joy in highly retweeted fake news consistently suggests that anger promotes the fast spread of fake news (see SM S7). In contrast, the near overlap in disgust between different types of news (Fig. 1C, D), the smaller proportion of fake news with sadness above 0.5 (Fig. 1H), and the more dominant position of fear in HLF news (Fig. 1I) indicate their less positive roles in the virality of fake news (24, 34, 38).
However, significant gaps across news groups could also be independent of circulation; well-controlled causal inference is accordingly necessary for anger and joy. To causally infer and quantify the promotion of anger and the prevention of joy in the spread of fake news, internal factors related to content (39) and users (2), together with external shocks such as disaster events (9), should be comprehensively controlled. Specifically, internal factors, including mention, hashtag, location, date, URL, length, topic, other emotions, follower (number of followers) and friend (number of reciprocal followers), and external shocks, including emergency (a disaster event), constitute the control variables in the logit and linear inference models (see SM S8). The results of the logit model (see SM S9) for lowly retweeted true (LT) news (control group) and highly retweeted fake (HF) news (treatment group) show that the coefficient of anger is significantly positive and the coefficient of joy is negative, implying that anger causally promotes the fast spread of fake news online. Other emotions are omitted (Table 1 (1)) due to multicollinearity and their trivial impact on circulation. Moreover, for the logit model used to estimate all true and fake news, anger is positively associated with fake news, though with a smaller coefficient and narrower deviation, as anticipated (Table 1 (2)). Recalling the gaps observed in emotion distributions across news groups, all the results consistently suggest a positive promotion effect of anger in the circulation of fake news, particularly for news that is highly retweeted. The causally negative relationship between joy and fake news contrarily indicates its prevention of dissemination. To further quantify the influence of both anger and joy in the spread of fake news, a linear regression model with the number of retweets as the dependent variable is established (see SM S9).
It is consistently found for fake news and for all news that the coefficients of anger are significantly positive while the coefficients of joy are negative (Table 1 (3) and (4)), suggesting that anger can promote circulation and joy can prevent the spread. Specifically, supposing that other factors are fixed, increasing the occupation of anger by 0.1 and reducing that of joy by 0.1 in fake news leads to 5.8 more retweets, and 2.2 more retweets occur if anger is increased by 0.1 in place of other negative emotions but joy is fixed. The above causal relationships between emotions and circulation are robust to alternative emotion detection approaches such as competent machine learning models (see Table S17). For other significant factors, although mention can promote the spread of news (Table 1 (3) and (4)), its coefficient is not significant for LHF news (Table 1 (1)) and even prevents the spread of fake news (Table 1 (2)); emergency is significantly positive in the logit models (Table 1 (1) and (2)) but inconsistently negative in the linear models (Table 1 (3) and (4)) (see SM S8 for more details). Therefore, carrying more anger and less joy is the mechanism behind the fast spread of fake news that makes it more viral than real news online. More importantly, additional evidence from extensive datasets of English news on both Twitter and mainstream media further confirms that this mechanism is independent of culture and platform (see SM S10). Negative stimuli such as anger elicit stronger and quicker emotional reactions and even behavioral responses than positive stimuli such as joy (40, 41). The odds of being forwarded through e-mails are also causally impacted by the physiological arousal caused by emotional articles, and articles evoking high-arousal positive or negative emotions can be more viral (34).
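The two reported effect sizes pin down the implied linear coefficients for anger and joy. The betas below are back-solved from those deltas by our own arithmetic, not values copied from Table 1:

```python
# Back-solve the implied linear coefficients from the two reported effects:
#   +0.1 anger with joy fixed            -> +2.2 retweets
#   +0.1 anger together with -0.1 joy    -> +5.8 retweets
# These betas are derived arithmetic, not numbers taken from Table 1.
beta_anger = 2.2 / 0.1               # ~ 22 retweets per unit of anger occupation
beta_joy = beta_anger - 5.8 / 0.1    # ~ -36 retweets per unit of joy occupation

def delta_retweets(d_anger: float, d_joy: float) -> float:
    """Predicted change in retweets under the linear model, other factors fixed."""
    return beta_anger * d_anger + beta_joy * d_joy
```

Under this reading, shifting even a modest share of a post's emotional mass from joy to anger changes its predicted circulation noticeably, which is consistent with the text's interpretation of Table 1.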
In the spread of fake news, the incentives behind the act of reposting that reignites circulation are therefore hypothesized to be associated with the anger and joy the news carries. Taking LHF news as the treatment group and HLF news and HLT news as the control groups, the possible associations between reposting incentives and emotions are examined through offline questionnaires. By selecting 15 typical news items with keywords from these groups (see SM S11), questionnaires are implemented to investigate four motivations for news reposting on social media (42): anxiety management, information sharing, relationship management and self-enhancement. The subjects of the surveys are Weibo users, and the overlap between offline subjects and online users is ensured (see SM S12). Preliminary results indicate that the motivation of anxiety management for LHF news is significantly higher than that for the control groups (Fig. 2A). Moreover, compared to HLT news, subjects are more intensively incentivized to share information when reposting HLF news and LHF news (Fig. 2B). Thus, fake news can stimulate strong motivation for information sharing; in particular, news that is widely disseminated can also strengthen the motivation for anxiety management. There is no significant variation in the motivation for relationship management across news groups (Fig. 2C), and the motivation for self-enhancement for HLT news is stronger than that for fake news (Fig. 2D). More interestingly, in questionnaires with keywords highlighted with marks, the unique stimulus of widely circulated fake news for anxiety management is strengthened (see Fig. S23A). The incentive of information sharing is similarly enhanced for fake news (see Fig. S23B). All these results imply that the responses to the anger carried by fake news are sharing information and even managing anxiety.
To validate this finding, the news in the questionnaires is further split into anger-dominated news and joy-dominated news (see SM S13.2) to directly probe the impact of emotions. Compared to the retweeting motivations of joy-dominated news, anger-dominated news stimulates stronger incentives for anxiety management (Fig. 2E) and information sharing (Fig. 2F). Joy-dominated news instead excites stronger self-enhancement (Fig. 2H) than anger-dominated news. Meanwhile, no significant difference is observed between anger and joy in terms of relationship management motivation (Fig. 2G). Randomly shuffling emotions further confirms the significance of these observations (see SM S13.2). Therefore, the greater anger delivered in fake news leads to more incentivized audiences with respect to anxiety management and information sharing, resulting in a greater likelihood of retweets and, thus, more viral contagion. Our findings emphasize the necessity of considering emotions, particularly anger, in understanding the spread of information online. On social media, the associations between information diffusion and embedded emotions have long been noted; however, the profiles of the roles of both positive and negative emotions are inconsistent and even contradictory across diverse contexts (23). Considering the heterogeneous influence on spreading of negative emotions such as anger and sadness (24, 34, 38), the causal impact on information diffusion should be examined with respect to well-resolved negative emotions. Instead of simplifying emotions into binary positive and negative categories, more elementary emotions are considered in this study, and the distribution of five emotions is inferred to reflect the complete emotional spectrum of news online. This more detailed spectrum of emotions identifies anger's unique role in provoking strong incentives of anxiety management and information sharing, which results in the virality of fake news online.
From this perspective, emotions could be the genes of fake news circulation: similar to small mutations, they could make the virus go viral. Mutations that increase anger or reduce joy in fake news enhance its likelihood of being retweeted. The distinguishing structures in the circulation of fake news, which have been pervasively revealed on both Twitter (?, 2) and Weibo (14), could also be deciphered based on the anger such news predominantly carries, since anger prefers weak ties in social networks (?) and may inherently forge the diffusion structure of fake news (see SM S2.3). Meanwhile, the role of joy in preventing spread, especially in fake news, underlines the fundamental need to consider negative emotions at fine granularity to control and deepen future explorations. Therefore, it is anticipated that insights from emotions will improve the extant understanding of online information spread. The vigorous promotion of circulation by anger implies new weapons against fake news. Although structural signals can be sensed at an early stage to target fake news (14), fake news spreads rapidly and reaches its peak of new retweets in less than one hour (see Fig. S7), so its negative impact has already been exposed to a large audience before identification. Moreover, it can take more than three days for a post to be rated as false by outside fact-checkers on Facebook (44). Worse, in a cat-and-mouse game between manipulation and detection, features derived from content or users that were found helpful in machine learning approaches to targeting fake news (45) can easily be exploited to fabricate more sophisticated false news (11). In particular, fake news related to emergencies is widely disseminated because of its clever combination with anger, which may explain why efforts to counter misperceptions about diseases during epidemics and outbreaks are not always effective (10).
The inefficiency and ineffectiveness of detecting fake news and debunking misinformation by correction call for new treatments, and preventing the spread of anger could be a profound and promising direction. The early deviation in dissemination paths between fake news and real news suggests the rapid effect of anger in shaping retweeting (25). For example, platforms such as Facebook, Twitter and Weibo should warn and discourage users as they try to retweet news that delivers too much anger and persuade them to assess the credibility of the information more critically. The trade-off between free speech and fake news prevention is the prime principle; however, a better balance would be achieved by tagging angry news (e.g., news with an occupation of anger of more than 20%; see SM S14 for more details) at the very beginning to make audiences and potential spreaders less emotional and more rational (46).

Figs. S1 to S25
Tables S1 to S26

The fake news and real news in this study were collected from Weibo, the most popular Twitter-like service in China, which had 200 million daily active users and generated over 100 million daily tweets (news) at the end of 2018. Here, news refers to tweets including news-related content on Weibo. The users of Weibo are dominated by young people. Through the open API of Weibo, we collected fake news rated and exposed by the official committee. Considering that fake news always draws attention from the committee after being widely disseminated, the digital traces of the spread of such news on Weibo can be completely traversed. Further probes of the timelines of all news items confirm this fact (see S3). Real news, also termed true news in this study, refers to information that was not tagged as false by the committee and was posted by verified users with credibility, such as mainstream media, elites, or public authorities.
In total, we collected 22,479 fake news items (with 1,189,186 users) and 10,000 real news items (with 409,865 users) from 2011 to 2016. For each news item on Weibo, we also collected its attributes, namely, text, posting time, author profile (number of followers, number of reciprocal followers, etc.), retweets, and reposting times. A subset of the fake news and real news used in this study was employed in a previous study (14) on the structural uniqueness of fake news, in which equivalent results were derived from both Weibo and Twitter, implying the reliability and universality of our data. In addition, authentic tweets from credible nonverified authors of Weibo further testify to the representativeness of our real news data (14). We have made the data publicly available at https://doi.org/10.6084/m9.figshare.12163569.v2. The number of followers intuitively represents the influence of users on social media, i.e., more followers mean the news will be broadcast to a larger audience and accordingly result in more retweets. Additionally, the number of retweets can represent the spreading capability of a given news item. Fake news might be widely retweeted because of the influence of its author; however, the broadcasting potential of authors does not sufficiently explain the fast spread of fake news (2), e.g., fake news posted by lowly followed authors might be massively retweeted. To examine the causal impact of emotions on the circulation of fake news, treatment groups and control groups are established to control for variables and infer the significant roles of emotions underlying the spread.
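The follower/retweet grouping used for treatments and controls can be sketched as a small labeling function. The cut-off values and record layout here are illustrative assumptions, not the exact values derived in SM S2:

```python
# Label a news item into one of the eight follower/retweet groups, e.g.,
# LHF = low-follower, highly retweeted fake news (the treatment group in
# the main text). The cut-off values are illustrative assumptions.
FOLLOWER_CUT = 1_000  # x: author-influence threshold (assumed)
RETWEET_CUT = 1_000   # y: spreading-capability threshold (assumed)

def group_label(followers: int, retweets: int, is_fake: bool) -> str:
    f = "L" if followers < FOLLOWER_CUT else "H"
    r = "L" if retweets < RETWEET_CUT else "H"
    return f + r + ("F" if is_fake else "T")

# A widely retweeted fake item from a weakly followed author is LHF;
# a weakly retweeted true item from a highly followed author is HLT.
treatment = group_label(followers=120, retweets=5_000, is_fake=True)
control = group_label(followers=50_000, retweets=10, is_fake=False)
```

Pairing such labels (e.g., LHF against HLF or HLT) is what isolates content-driven spread from author-driven spread in the analyses that follow.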
Considering that the role of emotions in information spreading might be subtle and easily interfered with by other variables, such as the influence of authors, we aim to split news, either fake or real, into a treatment group (e.g., highly retweeted news posted by authors with a low number of followers) and a control group (e.g., lowly retweeted news posted by authors with a high number of followers), through which the possible influence of authors can be controlled and the effects of emotions are amplified. Intuitively, for highly retweeted news posted by authors with a low number of followers, promotion from the content, in particular the emotions carried, would be more powerful and thus easier to detect. Therefore, we group the news according to the number of its author's followers (x) and the number of retweets (y) (16). For example, based on real news with a high number of followers but a low number of retweets and fake news with a low number of followers but a high number of retweets, a division model maximizing the difference between true and fake news is defined to determine the splitting interface:

D = N_HLT / N_T + N_LHF / N_F,

where
• N_T is the number of true (T) news items.
• N_F is the number of fake (F) news items.
• N_LLT is the number of true news items with a low number of followers (< x) and a low number of retweets (< y).
• N_LHT is the number of true news items with a low number of followers (< x) and a high number of retweets (≥ y).
• N_HHT is the number of true news items with a high number of followers (≥ x) and a high number of retweets (≥ y).
• N_HLT is the number of true news items with a high number of followers (≥ x) and a low number of retweets (< y).
• N_LLF is the number of fake news items with a low number of followers (< x) and a low number of retweets (< y).
• N_LHF is the number of fake news items with a low number of followers (< x) and a high number of retweets (≥ y).
• N_HHF is the number of fake news items with a high number of followers (≥ x) and a high number of retweets (≥ y).
• N_HLF is the number of fake news items with a high number of followers (≥ x) and a low number of retweets (< y).

Table S1: Numbers and proportions of all groups of both fake and real news.

We let the number of followers (from 10 to 10^4) and the number of retweets (from 10 to 10^8) grow exponentially with an exponent step of 1 to maximize the value of D and find the optimal partition line. As shown in Fig. S1, the best tuple is (x*, y*) = (10^3, 10^3), i.e., 1,000 followers and 1,000 retweets. According to this tuple, we divide the news into low number of followers and lowly retweeted true (LLT) news, low number of followers and highly retweeted true (LHT) news, high number of followers and highly retweeted true (HHT) news, high number of followers and lowly retweeted true (HLT) news, low number of followers and lowly retweeted fake (LLF) news, low number of followers and highly retweeted fake (LHF) news, high number of followers and highly retweeted fake (HHF) news and high number of followers and lowly retweeted fake (HLF) news (Fig. S2). Lowly retweeted true (LT) news includes LLT and HLT news, highly retweeted true (HT) news includes LHT and HHT news, lowly retweeted fake (LF) news includes LLF and HLF news and highly retweeted fake (HF) news includes LHF and HHF news. Additionally, ignoring the label of fake or true, lowly retweeted news is categorized as L news, and highly retweeted news is categorized as H news. By pairing various groups, diverse assemblies of treatments and controls can be established to examine the causal impact of emotions on circulation. Specifically, HLT news accounts for the largest proportion of true news, and LLF news accounts for the largest proportion of fake news (Table S1). To verify the rationality of the partition strategy in S2.1, we first examine the information dominance between different author groups.
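A minimal sketch of this grid search, assuming the objective D is the sum of the HLT share of true news and the LHF share of fake news; the candidate thresholds and toy data are illustrative assumptions:

```python
# Sketch of the division model: scan candidate (x, y) thresholds and keep
# the pair maximizing D = N_HLT/N_T + N_LHF/N_F, i.e., the share of true
# news that is highly followed but lowly retweeted plus the share of fake
# news that is lowly followed but highly retweeted. Toy data, assumed values.

def best_partition(true_news, fake_news, xs, ys):
    """true_news/fake_news: lists of (followers, retweets) tuples."""
    best, best_d = None, -1.0
    for x in xs:
        for y in ys:
            n_hlt = sum(1 for f, r in true_news if f >= x and r < y)
            n_lhf = sum(1 for f, r in fake_news if f < x and r >= y)
            d = n_hlt / len(true_news) + n_lhf / len(fake_news)
            if d > best_d:
                best, best_d = (x, y), d
    return best, best_d

# Exponentially spaced candidate thresholds, as in the SM's search.
xs = [10, 100, 1_000, 10_000]
ys = [10, 100, 1_000, 10_000]
true_news = [(5_000, 20), (20_000, 300), (8, 5)]
fake_news = [(50, 4_000), (200, 2_000), (5_000, 8)]
(x_star, y_star), d_star = best_partition(true_news, fake_news, xs, ys)
```

On this toy data the search recovers (1,000, 1,000) as the splitting interface, since that pair cleanly isolates HLT true news from LHF fake news.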
Here, information dominance measures the extent to which the authors of news items in one group dominate the spread in other spreader groups. According to their number of followers (x), all users are divided into eight groups: G_0 (follower counts in [0, 10)), G_1 (in [10, 10^2)), G_2 (in [10^2, 10^3)), G_3 (in [10^3, 10^4)), G_4 (in [10^4, 10^5)), G_5 (in [10^5, 10^6)), G_6 (in [10^6, 10^7)), and G_7 (in [10^7, ∞)). For a news item m whose author belongs to G_i, the information transmitted from m to G_j is

T_{i,m,j} = N_{i,m,j} / Σ_{j'} N_{i,m,j'},

where N_{i,m,j} is the number of spreaders belonging to G_j in the retweets of m and the sum runs over all G groups. Meanwhile, the coverage of m to G_j is defined as

C_{i,m,j} = N_{i,m,j} / N_j,

where N_j is the number of users belonging to G_j. According to T_{i,m,j} and C_{i,m,j}, the transmission strength from G_i to G_j is aggregated as

F_{i,j} = (1 / M_i) Σ_m T_{i,m,j} · C_{i,m,j},

where M_i is the number of news items in G_i. Then, the information dominance of G_i over G_j is

D_{i,j} = F_{i,j} / (F_{i,j} + F_{j,i});

if D_{i,j} > 0.5, G_i is defined to have more information influence than G_j. As shown in Fig. S3, from G_2 onward, the information dominance of G_out over G_in is constantly larger than 0.5, implying that authors with more than 10^3 followers indeed possess more information influence. Hence, it is reasonable to divide L users (with low influence) and H users (with high influence) at 10^3, in line with our partition strategy. The spreading capability of news may not be comprehensively represented by the number of retweets alone; the diffusion structure can also reflect the viral nature of news. Therefore, we further examine the rationality of the partition strategy according to retweet number (y) in S2.1 from the perspective of circulation structure.
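One plausible reading of these quantities can be sketched as follows; the exact aggregation in SM S2.2 may differ, so treat this as an assumption-laden illustration rather than the paper's definition:

```python
# Sketch (assumption): per-item transmission from author group G_i to
# spreader group G_j combines the spreader fraction T with the coverage C;
# pairwise normalization then turns aggregated strengths into a dominance
# score in [0, 1], where values above 0.5 mean G_i out-influences G_j.

def transmission(spreader_counts, group_sizes, j):
    """T*C for one news item: spreader_counts[j] is the number of its
    retweeters in G_j; group_sizes[j] is the population of G_j."""
    total = sum(spreader_counts)
    t = spreader_counts[j] / total           # T: fraction of spreaders in G_j
    c = spreader_counts[j] / group_sizes[j]  # C: coverage of G_j
    return t * c

def dominance(f_ij: float, f_ji: float) -> float:
    """Normalize two opposing aggregated strengths into a dominance score."""
    return f_ij / (f_ij + f_ji)
```

The normalization makes the 0.5 threshold meaningful by construction: dominance exceeds 0.5 exactly when the forward strength exceeds the reverse one.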
Structural virality is the average distance between all pairs of nodes in a diffusion (48), which can measure the diversity of diffusion:

v = (1 / (n(n − 1))) Σ_{i≠j} d_{i,j},

where d_{i,j} denotes the length of the shortest path between nodes i and j and n is the number of nodes. When v ∼ 2, the diffusion can be considered an approximately pure broadcast (48). The average structural virality of news diffusion versus the number of retweets is shown in Fig. S4. For all true and fake news, approximately 97% of structural virality values are lower than 2 when the number of retweets is less than 10^3, which is exactly the same as the cutting point previously obtained; this verifies the reliability of the division in S2.1 and again consolidates our partition strategy of news groups for treatment and control. Meanwhile, fake news is more viral (longer average path) than true news (K-S test ∼ 0.159, P ∼ 0) in terms of structural virality, which is consistent with previous results on Twitter (2), implying the universality of our dataset from Weibo. Six typical diffusion networks of both fake and real news are shown in Fig. S5 to further illustrate this. As mentioned in S1, both fake news and real news were collected before 2017 (our commercial access to the Weibo API expired in 2017), and the news in our dataset was posted from 2011 to 2016 (Fig. S6). Specifically, fake news still obtains 26% of its retweets after 48 hours, while that proportion for true news is 20%. More importantly, the stronger vitality of fake news is consistently observed in the groups LT news vs. LF news (K-S test ∼ 0.114, P ∼ 0.0) (Fig. S7D) and HT news vs. HF news (K-S test ∼ 0.138, P ∼ 0.0) (Fig. S7E). In addition, we compared the distributions of the number of retweets within 48 hours of posting and found that the propagation speed of fake news is significantly higher than that of true news (K-S test ∼ 0.195, P ∼ 0.0) (Fig. S7F). All this evidence suggests findings similar to those for Twitter (2): fake news is more viral than real news online.
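Structural virality is straightforward to compute from a diffusion tree's edge list by running a breadth-first search from every node; the star-shaped toy cascade below is an assumption for illustration:

```python
# Structural virality of a diffusion tree: the mean shortest-path distance
# over all ordered pairs of nodes (Goel et al.), computed by BFS from every
# node. The edge list is a toy assumption.
from collections import deque

def structural_virality(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = list(adj)
    n = len(nodes)
    total = 0
    for s in nodes:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        total += sum(dist.values())
    return total / (n * (n - 1))

# A star (pure broadcast) with n nodes has virality 2 - 2/n, approaching 2:
star = [(0, i) for i in range(1, 6)]  # author retweeted directly 5 times
v = structural_virality(star)
```

A long chain of retweets-of-retweets yields a much larger value than a star, which is why the measure distinguishes broadcast-like true news diffusion from the deeper cascades of fake news.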
Compared to real news, its circulation lasts longer, proceeds at higher speed, and ultimately produces more retweets. In this study, the emotional texts of news on social media, both fake and true, are assumed to carry sophisticated signals that cannot be fully represented by binary values such as positive or negative. Instead, emotions, in particular negative emotions, are split into elementary compounds, including anger, disgust, sadness, and fear (35, 51). Then, together with joy, which is used to reflect positive emotion, the distributions of the five emotions are derived to fully represent the emotional spectrum of each news item. An emotion lexicon must be established to obtain the emotional distribution of the text of both fake and true news intuitively and accurately; the occupation of a certain emotion can then be calculated as the fraction of terms with this emotion among all emotional terms of the news text. We first segment all the texts into terms, filter by parts of speech, and keep nouns, verbs, adverbs, gerunds, adjectives, adjectives directly used as adverbials and adjectives with noun function to compose a candidate set. As a result, 34,227 preselected terms are obtained. Note that there might also be nonemotional terms in the candidate set. We then hire human coders to manually label the terms; those without emotions are marked as neutral. A WeChat applet, named Word Emotion (Fig. S8), was built to make the labeling convenient. The whole labeling task was completed by nine well-instructed coders, all active users of Weibo aged between 18 and 30 years old, and each term was labeled three times by randomly selected coders. Finally, terms with more than two identical emotional labels are screened out to build the lexicon. The emotion distributions of news in the different groups are derived utilizing the established emotion lexicon.
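The lexicon-based occupation measure can be sketched as follows; the tiny lexicon and pre-segmented terms are toy assumptions standing in for the Chinese word segmentation and the manually labeled lexicon:

```python
# Lexicon-based emotion distribution: the occupation of each emotion is the
# fraction of its terms among all emotional (non-neutral) terms of a news
# text. Lexicon and terms below are toy assumptions.
EMOTIONS = ("anger", "disgust", "joy", "sadness", "fear")

def emotion_distribution(terms, lexicon):
    counts = {e: 0 for e in EMOTIONS}
    for t in terms:
        label = lexicon.get(t, "neutral")
        if label in counts:
            counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return None  # no emotional terms: treat the news item as neutral
    return {e: c / total for e, c in counts.items()}

lexicon = {"outrage": "anger", "scandal": "anger", "celebrate": "joy"}
terms = ["outrage", "scandal", "celebrate", "city"]
dist = emotion_distribution(terms, lexicon)  # anger 2/3, joy 1/3
```

Items whose terms all fall outside the lexicon are the "neutral" remainder mentioned above, which is why coverage (87.1% here) matters for this measure.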
After the inference of emotion distributions, possible differences between the treatment and control groups of news are comprehensively examined. These differences are expected to help reveal the mechanism underlying the circulation of fake news. In particular, more insights might be derived by splitting negative emotion into more elementary emotions. In the main text, we discussed that the amount of anger in fake news is significantly higher than that in true news, and the amount of joy in true news is significantly higher than that in fake news. This phenomenon is more obvious in HLT news and LHF news after excluding the influence of the author. Moreover, to further examine the difference between anger and joy and its possible association with the fast spread of fake news, we compare the emotional differences between HLF news and LHF news. The results show that the amount of anger in LHF news is significantly higher than that in HLF news (Fig. 1A in the main text), and the amount of joy is significantly lower than that in HLF news (Fig. 1E in the main text), which is consistent with the comparison between L news and H news (Fig. S9A, S9C). That is, the amount of anger in widely circulated news is significantly higher than that in less widely circulated news. The statistics of the emotional distributions and the results of the K-S tests are shown in Tables S2-S5. All these observations consistently suggest an association between anger and the virality of fake news and inspire the later causal inference through regression models. The existence of highly retweeted tweets posted by authors with low volumes of followers in both fake news and real news implies the potential influence of content on circulation. Besides, emotions are carried by words in the text. The distinct distributions of emotions, in particular anger and joy, between fake news and real news inspire us to pinpoint keywords that could split news groups.
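The group comparisons above rely on two-sample Kolmogorov-Smirnov tests; a minimal pure-Python sketch of the test statistic (the anger occupations below are fabricated toy values, not the paper's data):

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples."""
    a, b = sorted(a), sorted(b)

    def ecdf(sample, x):
        # Fraction of sample values <= x (sample is sorted).
        return bisect.bisect_right(sample, x) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Toy anger occupations for two hypothetical news groups.
fake_anger = [0.30, 0.25, 0.40, 0.35, 0.20]
true_anger = [0.10, 0.15, 0.05, 0.20, 0.12]
print(ks_statistic(fake_anger, true_anger))  # 0.8
```

The K-S statistic is distribution-free, which is why the same test can be applied to emotion occupations, retweet counts, and text lengths throughout the supplement.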
Additionally, these keywords could help in later offline questionnaires to strengthen the stimuli of anger and joy on the reposting incentives of the audience (see S13). Specifically, for LHF news, HLT news, and HLF news, we train an SVM (52) to separate the news groups and extract the keywords that best discriminate them. The emotional keywords in HLT news are all joyful (Fig. S10B), and those in HLF news are mainly joyful (Fig. S10F), followed by fearful. However, negative emotions, especially anger, dominate the keywords in LHF news (Fig. S10D). These observations support the initial assumption that emotions carried by news, in particular the dominant emotions of anger and joy, can be reflected by keywords that effectively separate different groups of news; therefore, these keywords will affect the incentives underlying retweets. Meanwhile, the exact same difference in the emotion distribution at the keyword level further confirms the consistency and robustness of the emotional divergence between fake news and true news revealed at the collective level (see S5). The corresponding results are shown in Tables S6-S11. All the results support our conclusions obtained from the emotion lexicon, in particular the difference in emotion distributions between anger and joy, suggesting the robustness of our understanding of the emotion divergence between fake news and real news. In the previous analysis and the additional test on emotion divergence, the emotion distribution of each news item is inferred exclusively by one method, i.e., lexicon-based, Bayes, or BP1. To further verify the emotion divergence between fake and true news, a new text-level measure is presented to represent the emotion distribution by ranks. Specifically, for each news item text, a batch of models is employed separately to infer the probabilities of belonging to the five emotions, which are then ranked according to these probabilities: lower rank values represent higher probabilities of the text belonging to the corresponding emotions. Note that emotions with the same probability are ranked randomly.
By aggregating the ranks of a certain emotion over all models, a distribution of ranks can be obtained. The ranks of anger in LHF news, F news, and H news are significantly lower than those in HLT news, T news, and L news (Fig. S11A, B, C), while the ranks of joy show the opposite trends (Fig. S11G, H, I). Note that a lower rank represents a higher probability of belonging to the corresponding emotion. This result is consistent with all previous results, indicating that the divergence in anger and joy between fake news and real news is robust and independent of the emotion inference model and emotion distribution measure (the classic machine learning models are built with scikit-learn, and BP2 is built with PyTorch). However, the differences in the other negative emotions across news groups, though significant, are inconsistent and varying. The ranks of sadness in LHF news, F news, and H news are significantly higher than those in HLT news, T news, and L news (Fig. S11J, K, L), which is inconsistent with the previous results (see Fig. 1 in the main text). The ranks of disgust fluctuate inconsistently across different assemblies of news groups. Although the rank of fear in LHF news is significantly lower than that in HLT news for ranks smaller than 4, it becomes higher than that of HLT news at rank 5 (Fig. S11M). Therefore, in the following causal inference on the impact of emotions on circulation, negative emotions other than anger are not considered separately. Emergent events, in particular disastrous ones, always spur fake news items; we therefore examine anger and joy in the emotional distributions of such fake news to test our findings in the circumstance of specific emergency events. Using the emotion lexicon built in this paper, the emotional distributions of 200 fake news items are inferred. It is consistently found that HF news carries more anger and less joy than LF news.
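The rank-based measure above can be sketched as follows; the per-model probabilities are hypothetical values invented for illustration:

```python
import random
random.seed(7)

EMOTIONS = ("anger", "disgust", "joy", "sadness", "fear")

def emotion_ranks(probs):
    """Rank the five emotions by predicted probability (rank 1 =
    most probable); ties are broken randomly, as in the text."""
    order = list(EMOTIONS)
    random.shuffle(order)          # random order so that ties...
    order.sort(key=lambda e: -probs[e])  # ...survive the stable sort
    return {e: order.index(e) + 1 for e in EMOTIONS}

# Hypothetical outputs of three inference models for one news item.
models = [
    {"anger": 0.5, "disgust": 0.1, "joy": 0.1, "sadness": 0.2, "fear": 0.1},
    {"anger": 0.4, "disgust": 0.2, "joy": 0.1, "sadness": 0.2, "fear": 0.1},
    {"anger": 0.3, "disgust": 0.1, "joy": 0.2, "sadness": 0.2, "fear": 0.2},
]

# Aggregate: mean rank of each emotion over all models.
mean_rank = {
    e: sum(emotion_ranks(m)[e] for m in models) / len(models)
    for e in EMOTIONS
}
print(mean_rank["anger"])  # 1.0: anger ranks first in every model
```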
The dominance of anger over joy (the occupation of anger minus that of joy) is significantly larger in the group of HF news (T ∼ 2.851, P ∼ 0.006) (Fig. S12 and Table S15). However, it should be noted that here we only present a case study on such fake news. Carrying more anger but less joy is significantly associated with the fast spread of fake news. To further examine the causal impact of anger and joy on the circulation of news online, variables that might be correlated with the spread should be comprehensively considered and controlled. In addition to emotions inferred from texts, other factors such as content (39), user profiles (2), and external shocks such as disaster events (9) should be considered. Evidence from previous efforts on the impact of age on spread is inconsistent (2, 21). In the meantime, according to the annual report (https://data.weibo.com/report/index), most Weibo users' ages are concentrated in a narrow range between 18 and 30 years old, so the impact of age could be trivial because of context dependence. Also, according to recent results in (21), the ages of users are associated with the topics of the content, e.g., users aged over 60 are more likely to post or repost tweets about politics; hence, in our model, the factor of user age can be indirectly controlled through the topics that are comprehensively considered. Thus, age can be omitted without significant disturbance to the results. In total, the following variables will be derived and controlled:
• Mention: Whether the text contains an @ mention.
• Hashtag: Whether the text contains a hashtag.
• Location: Whether the text contains location information.
• Date: Whether the text contains date information.
• URL: Whether the text contains a URL.
• Length: The length of the text.
• Emergency: Whether the text content is related to a disaster event. The emergency event in this study refers to the explosion accident in the Tianjin Binhai New Area on August 12, 2015, which occurred within the sampling period.
• Topic: The topic discussed in the text.
• Follower: The number of followers of the author.
• Friend: The number of friends of the author.
We calculated the length distribution of the text as the number of characters and letters. The length of LHF news has a more concentrated distribution than that of HLT news (K-S test ∼ 0.145, P ∼ 0) (Fig. S13A), and the difference is also significant between fake news and true news (K-S test ∼ 0.134, P ∼ 0) (Fig. S13B). Therefore, fake news may be more deliberate and planned in terms of linguistic organization, while real news is more casually narrated. However, the text length is more concentrated in HLF news (compared with LHF news, K-S test ∼ 0.073, P ∼ 0) (Fig. S13A) and L news (compared with H news, K-S test ∼ 0.095, P ∼ 0) (Fig. S13C), indicating that this factor might have little effect on promoting the spread of fake news. The topics discussed in the news are also important features of the text. We used a naïve Bayesian topic classifier (57) to analyze the topic distributions of the different types of news. The classifier was trained on more than 410,000 Weibo tweets, which were grouped into seven categories that fit the news taxonomy of Weibo: entertainment, finance, international, military, society, sports, and technology. The accuracy and F-measure are greater than 0.84, indicating good performance in topic classification. Besides, incremental training of this classifier can help solve the problem of new words. News that cannot be classified into the above seven categories is omitted from the analysis. As shown in Fig. S14, significant differences are observed in the distribution of topics among the different groups of news. Specifically, the topic of society accounts for the largest proportion in HLF news, LHF news, and F news, suggesting that fake news focuses on social issues that are closely related to people's daily lives.
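Several of the binary content controls listed above (mention, hashtag, URL, date) can be extracted with simple pattern checks; a minimal sketch in which the patterns are simplified assumptions, not the exact rules used in the paper:

```python
import re

def content_controls(text):
    """Extract simplified content-level control variables from a raw
    tweet text. The date and hashtag patterns are illustrative
    placeholders, not the paper's extraction rules."""
    return {
        "mention": int("@" in text),
        "hashtag": int(bool(re.search(r"#\S+#?", text))),
        "url":     int(bool(re.search(r"https?://\S+", text))),
        "date":    int(bool(re.search(r"\d{4}-\d{2}-\d{2}", text))),
        "length":  len(text),
    }

tweet = "@newswire Explosion reported on 2015-08-12 #Tianjin# http://t.cn/xyz"
print(content_controls(tweet))
```

The output feeds directly into the regression design: the first four entries are the dummy variables, and length enters as a continuous control.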
Hot social topics would make fake news more likely to spread but do not necessarily make fake news widely spread, because the proportion of the society topic in H news is lower than that in L news. Through the analysis of the above eight variables derived from content, the differences between true and fake news are examined, but most of these variables do not promote the spread of fake news. Two factors, mention and emergency, may play promoting roles in the spread of fake news; however, they only occupy small proportions of all news items, which might undermine their effect on fast circulation. We also examine the variables from the author profiles. Interestingly, whether true or fake, news with more retweets was posted by authors with more followers (Fig. S15) and friends (Fig. S16). However, the greater numbers of followers and friends associated with true news (as compared with fake news, which is consistent with the finding on Twitter (2)) suggest that these factors might not be the key factors making fake news more viral than true news online. By controlling all these variables, we establish both logit and linear regression models to causally examine the impact of anger and joy on the spread of fake news. Note that for the emotion variables, we focus primarily on anger and joy and combine the remaining emotions into a single "other emotions" variable. Note also that there is a linear relationship among the emotion-related variables because the ratios of the five emotions sum to 1. All the control variables from content, user profiles, and the external shock, as presented above, are comprehensively introduced into both models. The logit model is defined as

$\mathrm{logit}(p_{fake}) = \beta_0 + \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{13} v_{13},$

where
• $p_{fake}$ is the probability of being fake news.
• $\beta_0$ is the intercept.
• $\beta_1, \beta_2, ..., \beta_{13}$ are the coefficients of the variables.
• $v_1, v_2, ..., v_{13}$ represent anger, joy, other emotions, follower, friend, mention, hashtag, location, date, URL, length, emergency, and topic.
• Mention, hashtag, location, date, URL, emergency, and topic are dummy variables.
The emotion variables derived from the emotion distributions in the logit model are calculated for all methods, namely, the emotion lexicon, Bayes, and BP1. The results of the model based on the emotion lexicon are shown in Table 1 of the main text. We hereby supplement the estimation results for the remaining two methods (Table S17). In all the results, the coefficients of anger are uniformly and significantly positive after controlling for all other variables, indicating that anger is causally associated with fake news, particularly news that is highly retweeted. By contrast, the coefficients of joy are significantly negative in all results, especially for HF news and H news, indicating that joy prevents the spread of news, particularly fake news. The coefficients of emergency and of the topics of military and society are significantly positive, while the coefficients of mention are positive but nonsignificant (Table 1 in the main text and Table S17), which is consistent with our analysis in S8. Then, a linear regression model is established to further quantify the influence of anger and joy on the spread of fake news. The model is defined as

$\ln(\mathrm{Num\_retweet}) = \beta_0 + \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_{13} v_{13},$

where
• The dependent variable Num_retweet is the number of retweets within 48 hours of the news release. Note that over 70% of the retweets of fake news and 80% of the retweets of real news occurred within 48 hours after posting (see S3). Other settings, e.g., longer than 48 hours, do not influence the results.
• The independent variables are consistent with the explanatory variables of the logit model.
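As a toy illustration of how the sign of the anger coefficient in the logit specification can be estimated, the sketch below fits a one-feature logistic regression by plain gradient ascent on fabricated data (both the data and the fitting routine are illustrative assumptions, not the paper's estimation procedure):

```python
import math

def fit_logit(x, y, lr=0.5, iters=5000):
    """Minimal one-feature logistic regression fitted by gradient
    ascent on the log-likelihood: logit(p_fake) = b0 + b1 * anger."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(iters):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p          # gradient w.r.t. the intercept
            g1 += (yi - p) * xi   # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Fabricated data in which fake news (y=1) tends to carry more anger.
anger = [0.05, 0.10, 0.12, 0.15, 0.30, 0.35, 0.40, 0.45]
fake  = [0,    0,    0,    1,    1,    1,    1,    1   ]
b0, b1 = fit_logit(anger, fake)
print(b1 > 0)  # True: anger raises the odds of being fake
```

In practice a statistics package with standard errors (e.g., statsmodels or R) would be used; this sketch only shows the direction of the estimated effect.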
We first estimate the linear model on fake news and then on all news, neglecting the labels of true or fake; the results can be found in Table 1 (3, 4) of the main text, in which the emotion distributions are inferred through the method based on the emotion lexicon. We also apply the linear model to emotion distributions from the other two methods, and consistent results are obtained, as shown in Table S17 (3, 6). Table S17: (1, 4) The logit model for LT news and HF news. (2, 5) The logit model for T news and F news. (3, 6) The linear model for L news and H news. *P < 0.1, **P < 0.05, ***P < 0.01. It has been stated that emotion expression is culture dependent (58). Though the previous results on diffusion networks (see S2) and the timeline analysis (see S3) demonstrate consistency with English tweets on Twitter and suggest the universality of our data from Weibo, more evidence on the roles of anger and joy in circulation through regression models of causal inference is still necessary. Accordingly, six datasets publicly available online are utilized to ensure that our results apply to English news (tweets) from Twitter and even other mainstream news media like WASHINGTON (Reuters). These datasets are described below. The emotion lexicon from the National Research Council of Canada (NRC) is employed to infer the emotional distributions of all the English news. It contains 14,182 words with eight emotions: anger, disgust, joy, sadness, fear, surprise, anticipation, and trust (59, 60). The coverage of this emotion lexicon is 73.3% on the dataset used in the logit model. Though the emotions carried by the English news here are expanded to eight, the promoting effect of anger is still significant, and joy shows the opposite effect, as expected (Table S18). These results suggest that the promoting effect of anger on spread is independent of cultural differences and that our results can be confidently extended to English news.
Other emotions, such as disgust and anticipation, are also found to be significant but with negative coefficients, implying a preventive effect on spread. It should also be noted that the linear model is not examined here due to the absence of retweeting time in these datasets; i.e., whether the spread of news is sufficiently sampled cannot be assured, and consequently it would be problematic to treat the number of retweets directly as the dependent variable. Since whether the news in Datasets S1-S4 is true or fake is not labeled, Dataset S5, containing 21,417 true news items (with 11,264 political news and 10,133 world news) and 23,481 fake news items (with 9,050 news, 6,718 political news, 1,548 government news, 4,415 left-news, 781 U.S. news, and 776 Middle-east news), is further utilized to verify the divergence of anger between true and fake news. Note that the true news may come from sources such as WASHINGTON (Reuters) and Twitter; hence, the title and body texts of the news are joined together to perform the emotion inference (the coverage of the emotion lexicon is nearly 100%). As expected, the anger occupation in fake news is higher than that in true news (true news ∼ 0.110, fake news ∼ 0.123, K-S test ∼ 0.108, P ∼ 0). There is also a very small dataset (Dataset S6) of fake news containing 117 LF news items (tweets) with emotions and 361 HF news items (tweets) with emotions. Consistent with the results from Weibo (see Table 1), HF news on Twitter carries more anger than its LF counterpart (LF news ∼ 0.020, HF news ∼ 0.142, K-S test ∼ 0.416, P ∼ 0). To sum up, the results from these supplementary datasets of English news consolidate our conclusions derived from Weibo and show that, independent of cultures and platforms, fake news carries more anger than real news and anger promotes the circulation of news online. Emotions of high arousal, such as anger and joy, are associated with information diffusion, particularly information sharing (24).
To further investigate how the anger and joy carried in news influence the incentives underlying retweeting, which reignites the circulation of news on social media, offline questionnaires are conducted to bind the emotion divergence between fake news and real news to retweeting incentives. Due to the time and intensive labor costs, it is challenging for questionnaires to cover all the fake news and true news in our data. Therefore, five typical news items from each of the groups of HLT news, LHF news, and HLF news are sampled (Tables S19-S22), and their positions in the group can be found in Fig. S17. The sampled texts and the keywords in these texts are distributed evenly in the embedding space of the different groups of news, suggesting that they are indeed typical and representative. Notably, the selected keywords that help separate the groups of news in sampling the texts are anticipated to help strengthen the stimuli of reposting incentives, which would further enhance the impact of anger and joy. We employ a carefully designed questionnaire that is commonly used for rumor-sharing motivation surveys on social media (42), which comprehensively measures four motivations of the subjects: anxiety management, information sharing, relationship management, and self-enhancement. There are six items for anxiety management (Fig. S18), six items for information sharing (Fig. S19), five items for relationship management (Fig. S20), and four items for self-enhancement (Fig. S21). Each item is measured on a four-point scale (1-strongly disagree, 2-disagree, 3-agree, 4-strongly agree). There are six questionnaires in total. For each group of news items, we implement two online questionnaires, one showing the original text and one showing the text with keywords marked in red squares (Fig. S22). Meanwhile, the five news items from each group appear in each questionnaire in random order.
Except for the news presented, all other circumstances in the questionnaires, e.g., author profile, posting time, and posting source, are carefully controlled to be consistent. Specifically, the only difference in the stimuli to the incentives of the subjects is the news itself. For the presentation of the text, we attempted to simulate the real Weibo interface by adding the background of the mobile version of the Weibo App to each news item (Fig. S22). We required the subjects who completed the questionnaires to be Weibo users aged between 18 and 30 years old (according to the 2018 Weibo user development report, https://data.weibo.com/report/index, this age group accounts for 75% of all users), matching the users in the online data as much as possible. Note that subjects are not specifically targeted based on occupation or income level because we want to probe the general effect of the emotion divergence on the retweeting incentives of the majority of Weibo users. More importantly, considering the widespread global impact of fake news online, revealing a mechanism that is independent of user demographics would be powerful in inspiring new cures. We hired a well-reputed online survey company to collect the responses; the statistics are shown in Table S23. The collected responses are also publicly available at https://doi.org/10.6084/m9.figshare.12163569.v2. Since subjective bias may exist, that is, the response degree might vary across different subjects, the following method is adopted to eliminate the subjective bias:

$M_{i\text{-}avg} = m_i - \frac{1}{4}\sum_{j=1}^{4} m_j,$

where $m_i$ is the average score of all the items in motivation $M_i$ and $M_{i\text{-}avg}$ is the debiased average score for $M_i$. The main text showed that the motivation of information sharing for fake news is stronger than that for real news, and the motivation of anxiety management for LHF news is significantly stronger than that for news in both HLF and HLT.
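The debiasing step, which subtracts each subject's own mean over the four motivations, can be sketched as follows (the scores are fabricated for illustration):

```python
def debias(scores):
    """Remove subject-level response bias: subtract the subject's
    mean over the four motivations from each motivation's average
    item score, mirroring the correction described above."""
    overall = sum(scores.values()) / len(scores)
    return {m: s - overall for m, s in scores.items()}

# One hypothetical subject's average item scores per motivation.
subject = {"anxiety": 3.2, "sharing": 3.0,
           "relationship": 2.1, "self_enhancement": 1.7}
print(debias(subject)["anxiety"])  # ~0.7 above this subject's own mean
```

After debiasing, the scores of different subjects are comparable even when one subject systematically answers higher on the four-point scale than another.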
For responses with keywords outlined, these differences are significant and even augmented; interestingly, the differences between LHF news and the other two groups of news are more significant in M1 (Fig. S23A), indicating that audiences of highly retweeted fake news are more incentivized in terms of anxiety management. The statistics and K-S tests are shown in Table S24 and Table S25. Next, we divide the news in the questionnaires according to the emotion it carries with the largest occupation. News1 and News5 in LHF news are dominated by anger. Joy dominates News2 in LHF news and News1, 3, 4, and 5 in HLT news. The rest of the news is dominated by other emotions. In the analysis in S13.1, we found that the marked keywords play a role in widening the differences. Hence, we directly combine the responses without keywords and those with keywords according to their dominant emotions to further examine the emotional stimuli with respect to retweeting motivation. The results are analyzed in the main text, and the K-S test results are shown in Table S26. Furthermore, neglecting emotion dominance, all the questionnaire data are divided into two groups randomly to analyze the difference in motivations. Surprisingly, no significant differences are observed in the four motivations between the random groups. Carrying more anger makes fake news more viral than real news online. According to this conclusion, instead of determining new features for fake news detection, developing new cures of tagging anger on social media is a promising approach to restrain the spread of fake news at the source. Because the intervention can be implemented immediately after posting, there will be no lag in the fight against fake news. More importantly, the principle of guaranteeing the freedom of speech will be respected, and an acceptable trade-off between free sharing and fake news prevention can be achieved.
By alerting users to angry tweets, audiences can be persuaded to assess them more critically before emotionally retweeting, consequently leading to fewer emotional and more rational retweeters. Specifically, for tweets (news) that deliver too much anger, e.g., whose occupation of anger surpasses a predetermined threshold (θ), a retweeting warning could be provided on platforms such as Twitter, Facebook, and Weibo. According to a report of Facebook on battling misinformation about COVID-19 (https://about.fb.com/news/2020/04/covid-19-misinfo-update/), warning labels can effectively prevent 95% of users from further access. In accordance with this, it is very optimistically assumed here that no angry tweets with warning tags from the platform will be retweeted. To determine the value of θ, we focus on news with high volumes of retweets (HT news and HF news in our data) and define a measure to optimize θ, i.e., preventing fake news that will be highly retweeted but not real news that will be popular. The measure is denoted as β and is defined as

$\beta = \frac{N_{HF(\geq\theta)}}{N_{HF}} - \frac{N_{HT(\geq\theta)}}{N_{HT}},$

where
• $N_{HF}$ is the number of HF news items.
• $N_{HF(\geq\theta)}$ is the number of HF news items with an occupation of anger greater than θ.
• $N_{HT}$ is the number of HT news items.
• $N_{HT(\geq\theta)}$ is the number of HT news items with an occupation of anger greater than θ.
The values of β for θ increasing with step sizes of 0.1 and 0.05 are shown in Fig. S24 and Fig. S25, and they peak when θ = 0.2. In our dataset from Weibo, warning about news in which anger occupies more than 20% will efficiently and effectively prevent 46% of highly retweeted fake news while only influencing the circulation of 22% of popular real news. Moreover, among all the highly retweeted news in our dataset (i.e., HF+HT) with an occupation of anger higher than 0.2, HF news accounts for 89%, further implying that our treatment can predominantly target highly retweeted fake news.
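The grid search over θ can be sketched as follows; the anger occupations are fabricated toy values (chosen so that the toy optimum also lands at θ = 0.2), not the paper's data:

```python
def beta(theta, hf_anger, ht_anger):
    """beta(theta) = fraction of highly retweeted fake news caught
    minus fraction of popular real news caught, when warning on
    items whose anger occupation is at least theta."""
    caught_hf = sum(a >= theta for a in hf_anger) / len(hf_anger)
    caught_ht = sum(a >= theta for a in ht_anger) / len(ht_anger)
    return caught_hf - caught_ht

# Hypothetical anger occupations of highly retweeted fake (HF)
# and true (HT) news items.
hf = [0.05, 0.15, 0.22, 0.25, 0.30, 0.35, 0.41, 0.50, 0.21, 0.28]
ht = [0.02, 0.05, 0.08, 0.12, 0.15, 0.18, 0.22, 0.25, 0.10, 0.07]

# Sweep theta with step size 0.05 and keep the maximizer.
grid = [round(0.05 * k, 2) for k in range(1, 13)]
best = max(grid, key=lambda t: beta(t, hf, ht))
print(best)  # 0.2 on this toy data
```

The trade-off built into β mirrors the text: raising θ catches less fake news, while lowering it disturbs more popular real news.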
Though the fraction of prevented fake news that would otherwise be widely circulated is not as high as expected, considering the intrinsic characteristics of very low cost and timely intervention, the newly presented treatment should be weighted with high priority in the toolbox of cures against fake news. Hence, it is worth trying on social media platforms such as Weibo, Twitter or Facebook to prevent the spread of fake news online at the source through this new approach.

References (titles as cited in the text):
• The science of fake news
• The spread of true and false news online
• Evaluating the fake news problem at the scale of the information ecosystem
• Protecting elections from social media manipulation
• Influence of fake news in Twitter during the 2016 US presidential election
• Social media and fake news in the 2016 election
• Can 'Fake News' Impact The Stock Market
• The effects of corrective information about disease epidemics and outbreaks: evidence from Zika and yellow fever in Brazil
• Fake news: the narrative battle over the Ukrainian conflict
• Researchers are tracking another pandemic, too: of coronavirus misinformation
• The spreading of misinformation online
• Fake news propagates differently from real news even at early stages of spreading
• Hidden resilience and adaptive dynamics of the global online hate ecology
• Anomalous structure and dynamics in news diffusion among heterogeneous individuals
• Scientific communication in a post-truth society
• The spread of low-credibility content by social bots
• Fake news spreads faster than true news on Twitter, thanks to people, not bots
• Science audiences, misinformation, and fake news
• Less than you think: prevalence and predictors of fake news dissemination on Facebook
• Experimental evidence of massive-scale emotional contagion through social networks
• Digital emotion contagion
• Emotions and information diffusion in social media: sentiment of microblogs and sharing behavior
• Emotion shapes the diffusion of moralized content in social networks
• Cognitive attraction and online misinformation
• Emotions promote social interaction by synchronizing brain activity across individuals
• Talking about others: emotionality and the dissemination of social information
• Emotions, partisanship, and misperceptions: how anger and anxiety moderate the effect of partisan bias on susceptibility to political misinformation
• Mediated populism, culture and media form
• An emotion-based independent cascade model for sentiment spreading. Knowl.-Based Syst.
• Compound facial expressions of emotion
• The role of mixed emotions in consumer behaviour
• What makes online content viral?
• Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations
• An argument for basic emotions
• Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena
• The structural virality of online diffusion
• Fast unfolding of communities in large networks
• Random walks, Markov processes and the multiscale modular organization of complex networks
• Social sharing of emotion following exposure to a negatively valenced situation
• Least squares support vector machine classifiers
• Using tf-idf to determine word relevance in document queries
• An introduction to variable and feature selection
• Moodlens: an emoticon-based sentiment analysis system for Chinese tweets
• A word2vec model for sentiment analysis of Weibo
• Topic dynamics in Weibo: a comprehensive study
• The role of culture and gender in the relationship between positive and negative affect
• Emotions evoked by common words and phrases: using Mechanical Turk to create an emotion lexicon. Workshop on Computational Approaches to Analysis and Generation of Emotion in Text (ACL, 2010)
• Crowdsourcing a word-emotion association lexicon

Datasets:
• Datasets S1-S3: coronavirus (COVID-19) tweets posted from …
• Dataset S4: 397,629 election day tweets scraped on the day of the 2016 United States election on Twitter
• Dataset S5: 23,481 fake news and 21,417 true news items
• Dataset S6: 478 fake news items (tweets) posted during breaking news related to events including Prince Toronto, Charlie Hebdo, Germanwings-crash, Sydney siege, etc.

Each English tweet contains text, retweet counts, follower counts, friend counts, etc. Though there are no labels of whether these tweets are fake or real, the promoting effect of anger on retweeting can still be verified. We randomly extract 2,000 news items from each file (one file per day) in Datasets S1-S3 and obtain 90,000 news items (57,508 with retweets) related to COVID-19 in total. The regression model for these English tweets is defined as

$\ln(\mathrm{Num\_retweet}_{72h\text{-}news}) = \beta_0 + \beta_1 w_1 + \beta_2 w_2 + \cdots + \beta_{17} w_{17},$

where $w_1, w_2, ..., w_{17}$ represent the variables of anger, disgust, joy, sadness, fear, surprise, anticipation, trust, follower, friend, mention, hashtag, location, date, URL, length, and topic; topic indicates politics or COVID-19.

Sampled news texts (translated from Chinese):
The fat man behind the Chinese table tennis team [sneers], yes! Guoliang Liu is definitely an all-rounder. The devil trains the team members, provides shouting, cheering, and wake-up services, water and towels, kisses the team members [kiss], and even has to cook noodles to reward the three troops.
China Cultural Relics Conservation Foundation held a special fund work symposium; Tongling in Anhui made an emergency rescue of the ancient mining site of Jinniu Cave at Fenghuangshan Copper Mine; Guobo held the "Four Medical Books" Thangka Art Inheritance Achievement Exhibition; Hubei implemented a "three-level joint review" model to accelerate the review of cultural relics census data; Xinjiang held the first national mobile cultural relics census training class.
HLT-News4-5 selected in HLT news. Keywords are highlighted in red.