key: cord-284851-gtdyexp1 authors: Green, Jon; Edgerton, Jared; Naftel, Daniel; Shoub, Kelsey; Cranmer, Skyler J. title: Elusive consensus: Polarization in elite communication on the COVID-19 pandemic date: 2020-07-10 journal: Sci Adv DOI: 10.1126/sciadv.abc2717 sha: doc_id: 284851 cord_uid: gtdyexp1 Cues sent by political elites are known to influence public attitudes and behavior. Polarization in elite rhetoric may hinder effective responses to public health crises, when accurate information and rapid behavioral change can save lives. We examine polarization in cues sent to the public by current members of the U.S. House and Senate during the onset of the COVID-19 pandemic, measuring polarization as the ability to correctly classify the partisanship of tweets’ authors based solely on the text and the dates they were sent. We find that Democrats discussed the crisis more frequently–emphasizing threats to public health and American workers–while Republicans placed greater emphasis on China and businesses. Polarization in elite discussion of the COVID-19 pandemic peaked in mid-February—weeks after the first confirmed case in the United States—and continued into March. These divergent cues correspond with a partisan divide in the public’s early reaction to the crisis. In democratic countries, the public is highly responsive to cues sent by political elites (1, 2) whose messages can encourage unity or deepen social cleavages. Because the public relies on these cues for reliable information, it is especially important that elites present a unified message during a crisis. Elites sent such a unified message after the September 11th terrorist attacks, when Republican and Democratic lawmakers issued joint statements reassuring Americans that they were safe and promising rapid retaliation (3) . 
However, the high levels of partisan polarization observed today among both elites and the mass public in the United States (4-6) can lead to a fractured national response, as elites send conflicting cues to citizens who are inclined to be receptive only to the messages of co-partisans (7). In addition, once initial opinions based on these messages are formed, they may be difficult to update with subsequent factual information (8). Coronavirus disease 2019 (COVID-19) presents the greatest public health threat and economic challenge in modern history, with the United States having the fastest rate of growth in cases among industrialized nations as of this writing (9). The severity of this crisis is particularly sensitive to public opinion, given that behavioral change at the individual level is integral to successfully slowing the spread of the virus. Given the high levels of polarization in the American electorate, citizens are less likely to change their behavior in ways that correspond to the consensus of public health experts if there is not a political consensus that these changes are necessary (10). Here, we investigate polarization in elite cues on COVID-19 with a comprehensive dataset of tweets about the virus sent by members of the (current) 116th U.S. Congress between 17 January 2020 (the date of the first mention of the virus by a member) and 31 March 2020. Members of Congress frequently use Twitter to communicate their political positions (11) and directly engage with voters (12). Following recent work on discourse polarization in Congress (13), we operationalize polarization as the degree to which one can correctly classify the partisanship of a speaker based on a "unit of speech," in this case the text contained in a tweet about the issue. The greater one's ability to identify the partisanship of a tweet's author based only on what the tweet says, the greater the polarization.
To systematically monitor the polarization of the political elite's response to the COVID-19 pandemic, we begin with a list of Twitter handles associated with members of the 116th U.S. Congress. This includes all official member accounts from a list maintained by the public affairs cable network (C-SPAN), as well as any verified account associated with a member that averaged at least one tweet per day at the time of collection and did not indicate in the profile information that the account was for campaign purposes or managed by staff. These additions are necessary because some members operate multiple accounts and the unofficial account is often more prominent than the official one in these cases. We then merged these handles with data on members' partisanship and ideology and collected their timelines via REST API (Twitter's Representational State Transfer Application Programming Interface). To identify tweets about the COVID-19 pandemic, we developed a set of dictionaries regarding subtopics related to the crisis (details in the Supplementary Materials). After flagging COVID-19-related tweets, we apply a set of preprocessing steps (outlined in the Supplementary Materials); tokenize to unigrams, bigrams, and trigrams; omit tokens that appear extremely frequently or infrequently; and train a random forest machine learning algorithm on a randomly sampled majority of the tweets using the text features and the dates they were sent. Random forests are advantageous in this context because they can account for nonlinear interactions in token usage that are difficult to capture using traditional parametric approaches (see the Supplementary Materials for a comparison). We then use this model to predict the partisan affiliation of the remaining tweets. This out-of-sample classification accuracy is taken as our measure of discourse polarization, as is established in the literature (13) . 
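The feature-construction steps described above (tokenizing to unigrams, bigrams, and trigrams, dropping very rare and very common tokens, and appending the send date) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names, the whitespace tokenizer, the `min_df` and `max_df_share` thresholds, and the toy tweets are all assumptions, and the actual preprocessing is described in the Supplementary Materials.

```python
from collections import Counter

def ngrams(text, max_n=3):
    """Return unigrams, bigrams, and trigrams of a lowercased, whitespace-split text."""
    words = text.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams

def build_vocabulary(docs, min_df=2, max_df_share=0.9):
    """Keep tokens whose document frequency is neither very low nor very high.

    The thresholds are hypothetical; the paper does not state its cutoffs here.
    """
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(ngrams(doc)))  # count each token once per document
    limit = max_df_share * len(docs)
    return sorted(t for t, df in doc_freq.items() if min_df <= df <= limit)

def featurize(doc, vocab, day):
    """Token counts over the vocabulary, with the send date appended as the last feature."""
    counts = Counter(ngrams(doc))
    return [counts[t] for t in vocab] + [day]

# Toy tweets, invented for illustration only.
tweets = [
    "testing and paid leave for workers",
    "paid leave for workers now",
    "support small businesses together",
    "small businesses need support",
]
vocab = build_vocabulary(tweets, min_df=2, max_df_share=0.9)
row = featurize(tweets[0], vocab, day=45)  # day = days since January 1
```

Rows built this way form the document-feature matrix on which a classifier such as a random forest could then be trained; the classifier itself is omitted here to keep the sketch dependency-free.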
We find that members of Congress quickly polarized along party lines in their communications regarding the crisis. Members of each party not only differed in how they discussed the issue; Democrats also tended to discuss it earlier and more frequently. From 17 January to 31 March, Democrats sent 19,803 tweets about COVID-19, while Republican members issued only 11,084, or roughly 71 tweets per Democratic member versus 45 tweets per Republican member. We see these trends clearly in Fig. 1A, which shows cumulative tweets from Democratic and Republican members of Congress about COVID-19. The difference in cumulative tweets between Democratic and Republican politicians became more pronounced after the U.S. Centers for Disease Control and Prevention identified the first case of community spread in California, and this gulf continued to increase following the declaration of a national emergency. The differential emphasis on the issue itself, independent of differences in word usage, suggests that Democratic members were sending earlier and stronger signals to their constituents that they should be concerned about the crisis. In addition to differences in the volume of communications about the crisis, there are also meaningful differences in the substantive content of the signals that members of each party were sending to constituents. Figure 1B shows the words with the greatest absolute difference in usage between Democratic and Republican members of Congress (e.g., the word "health" was used in 26% of Democratic and 15% of Republican tweets). The words most frequently used by Democrats concern public health and direct aid to workers (e.g., health, leave, and testing), while the words most frequently used by Republicans concern national unity, China, and business (e.g., together, United States, China, and businesses).
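The comparison behind Fig. 1B, the share of each party's tweets that use a given word and the gap between those shares, can be illustrated with a short sketch. The function names and the toy tweets below are hypothetical, and matching on whitespace tokens is a simplifying assumption rather than the paper's preprocessing.

```python
def usage_share(tweets, word):
    """Share of tweets containing `word` at least once (case-insensitive, whitespace tokens)."""
    if not tweets:
        return 0.0
    hits = sum(1 for t in tweets if word in t.lower().split())
    return hits / len(tweets)

def usage_gaps(dem_tweets, rep_tweets, words):
    """Return (word, Democratic share - Republican share) pairs, largest absolute gap first."""
    gaps = {w: usage_share(dem_tweets, w) - usage_share(rep_tweets, w) for w in words}
    return sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Invented example tweets, for illustration only.
dem = ["health care for workers", "testing and health", "paid leave now", "we stand together"]
rep = ["china must answer", "help businesses together", "health of businesses", "stand together"]

ranked = usage_gaps(dem, rep, ["health", "china", "together", "businesses"])
for word, gap in ranked:
    print(f"{word}: {gap:+.2f}")
```

A positive gap marks a word used in a larger share of Democratic tweets, a negative gap one used in a larger share of Republican tweets. On the figures reported in the text, "health" would have a gap of 0.26 - 0.15 = 0.11.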
We find further evidence of polarization in elite communication regarding the COVID-19 pandemic using a combination of natural language processing and machine learning techniques, which enable us to correctly classify the partisanship of 76% of tweets based solely on the text features they contain and the dates they were sent. After creating a document-feature matrix of the text and appending the date each tweet was sent (coded as the number of days since January 1), we hold out a random sample of 30% of the data as a test set and train a random forest on the remaining 70%. This process is repeated over 15 folds of cross-validation to account for sensitivity to observations falling on either side of the training/testing split. Following recent work in this area (13), the proportion of tweets in the test sets that are correctly labeled as being sent by either a Democrat or a Republican is used to derive our estimates of language-based polarization. However, partisan polarization is not constant over time. Initially low, it quickly and sharply increased and only slightly lessened as the severity of the crisis has become undeniable. Figure 2A shows that in the first full week after the first mention of COVID-19, accuracy is relatively low, indicating little polarization. However, polarization quickly rises, peaking during the week beginning February 9, roughly 2 weeks after the first reported case in the United States and well after the virus had begun to have devastating effects in multiple peer democracies. From there, polarization declines slightly in early-to-mid March before rising again later in the month as the parties debated the various relief packages designed to mitigate the economic damage caused by the pandemic. Overall, the tweets that members of Congress have sent about COVID-19 are highly informative of their partisanship and ideology. This is seen in Fig.
2B, which identifies the median tweet for each member in the test set in terms of predicted probability that the tweet was authored by a Republican and plots that against that member's DW-NOMINATE score, a standard left/right scaling of congressional voting behavior (14). The band in the figure marks the range of partisan overlap, from the Democrat with the highest median predicted probability of having authored a tweet sent by a Republican to the Republican with the lowest median probability. In total, 69% of members fall outside this range, suggesting that their tweets are more partisan than those of the most similar member of the other party. Were there full political consensus throughout the entire period, such that text features were in expectation uninformative for partisanship, this band would encompass all or nearly all members. Much of the variation in polarization is attributable to changes in how Republican members discuss the issue: Beginning in early March, they adopt less distinctive language and become more difficult to classify. We see this pattern in Fig. 2C, which plots weekly rates of recall (the proportion of Democratic and Republican members who are correctly identified as such) against the "no-information" rate that one would expect to achieve by flipping a coin weighted to the share of tweets in the test sets sent by Republicans or Democrats, respectively. Rates of recall for Republican members are higher above the no-information rate earlier in the period, while Democratic rates of recall are high throughout. To be clear, it is not the case that Republicans are sending no meaningful partisan signals in later weeks such that the model predicts every tweet was sent by a Democrat; in that case, the rate of recall for Republican members would be zero. This does mean, however, that tweets sent by Republicans in earlier weeks are more distinctive (i.e., easier to separate from tweets sent by Democrats) than those sent in later weeks.
Fig. 2. Classification accuracy, partisan COVID-19 language by roll call voting, and recall above no-information rate. Plot (A) shows k-fold out-of-sample prediction accuracy by week. Classification accuracy increases over time. This suggests that Democratic and Republican members of Congress are becoming more polarized over time. Plot (B) shows the political ideology of members of Congress against the median predicted probability of their test set tweets being authored by a Republican. Plot (C) shows rates of recall (recovery of true cases) by party. The lower bound is the naive probability of correctly classifying a Republican or Democratic member as such based solely on prevalence in the test sets, the upper bound displays the observed rate of recall, and the shaded area represents the increase in recall above the no-information rate.
These results highlight the degree to which a political consensus regarding the COVID-19 pandemic failed to quickly materialize in the United States. A society's ability to effectively mobilize in response to a crisis of the nature and scale of COVID-19 depends in large part on its political leadership. This is apparent on two levels: First, the scale of the governmental response required to mitigate the impacts of this pandemic makes this as much a political crisis as a public health one; second, the public's reliance on elite cues and the necessity of widespread changes in individual behavior to slow the spread of disease puts abnormally high pressure on elected officials to send consistent and accurate cues regarding how citizens should think about and react to the crisis. The set of elected officials we analyze here, members of Congress, has not signaled consensus.
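The comparison of recall against the no-information rate described for Fig. 2C can be made concrete with a small sketch. It assumes, following the text, that the no-information rate for a party equals that party's share of the test set (the expected recall of a coin weighted to prevalence). The function name and the toy labels are hypothetical.

```python
def recall_above_baseline(true_labels, predicted_labels, party):
    """Return (recall, no-information rate, lift) for one party.

    Recall is the share of the party's tweets that were classified correctly.
    The no-information rate is taken to be the party's prevalence in the test
    set, i.e. the expected recall of a coin weighted to that prevalence.
    """
    total = sum(1 for y in true_labels if y == party)
    if total == 0:
        raise ValueError(f"no test observations for party {party!r}")
    correct = sum(
        1 for y, p in zip(true_labels, predicted_labels) if y == party and p == party
    )
    recall = correct / total
    baseline = total / len(true_labels)
    return recall, baseline, recall - baseline

# Invented labels for one hypothetical week of test-set tweets.
true = ["D"] * 6 + ["R"] * 4
pred = ["D", "D", "D", "D", "D", "R", "R", "R", "D", "D"]

dem_recall, dem_base, dem_lift = recall_above_baseline(true, pred, "D")
rep_recall, rep_base, rep_lift = recall_above_baseline(true, pred, "R")
```

The lift corresponds to the shaded area in Fig. 2C. A lift near zero would indicate that the model does no better for that party than guessing from prevalence alone.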
Our analysis of tweets sent by members of Congress during the early months of the outbreak indicates that members quickly polarized around the issue, with Democrats discussing the issue earlier, more frequently, and with more emphasis on public health and direct aid to affected workers. By contrast, Republicans placed more emphasis on generalized national unity, China, and businesses. Our overall classification accuracy of 76%, with 69% of members falling outside the range of partisan overlap we identify on the issue, provides further evidence of a substantial partisan divide in how COVID-19 is discussed. This rate is similar to the results derived from an analysis of recent floor speeches in Congress (13) and is considerable given the relative brevity of each unit of speech represented in a tweet. This suggests that the response to the current crisis has followed recent patterns of polarization seen in political communication more generally. Party elites have become polarized on an increasing number of issue areas (15) , including topics that lack a clear ideological dimension (16) . In addition, while policy debates in Congress are often driven by the most extreme legislators (17) , the speed with which polarization occurred around COVID-19 is notable, particularly in the absence of obvious pressure from party activists (15) . The divergent cues sent by Congressional Democrats and Republicans correspond with a partisan divide in the public's early reaction to the crisis, with self-identified Democrats reporting significantly more behavioral change than independents and Republicans during the initial wave of the pandemic (18) . 
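The range of partisan overlap referred to above can be sketched as follows: take each member's median predicted probability of Republican authorship, find the highest such median among Democrats and the lowest among Republicans, and count the members whose medians fall outside that interval. The names and toy probabilities are hypothetical, and treating members exactly on the boundary as inside the band is an assumption, since the text does not specify this.

```python
from statistics import median

def overlap_band(member_probs, member_party):
    """Return per-member medians plus the (low, high) bounds of the partisan overlap band.

    low  = lowest median among Republicans
    high = highest median among Democrats
    """
    medians = {m: median(p) for m, p in member_probs.items()}
    high = max(v for m, v in medians.items() if member_party[m] == "D")
    low = min(v for m, v in medians.items() if member_party[m] == "R")
    return medians, low, high

def share_outside(medians, low, high):
    """Share of members whose median lies outside [low, high].

    If low > high there is no overlap and every member counts as outside.
    """
    outside = [m for m, v in medians.items() if not (low <= v <= high)]
    return len(outside) / len(medians)

# Invented predicted probabilities that each tweet was authored by a Republican.
probs = {
    "dem_a": [0.1, 0.2, 0.3],
    "dem_b": [0.4, 0.6, 0.7],
    "dem_c": [0.05, 0.1, 0.2],
    "rep_a": [0.5, 0.55, 0.9],
    "rep_b": [0.8, 0.9, 0.95],
    "rep_c": [0.7, 0.85, 0.9],
}
party = {m: ("D" if m.startswith("dem") else "R") for m in probs}

medians, low, high = overlap_band(probs, party)
outside = share_outside(medians, low, high)
```

In this toy example only `dem_b` and `rep_a` fall inside the band, so four of six members lie outside it. The paper reports the analogous share as 69%.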
While directly identifying linkages between the two phenomena is beyond the scope of this work, we note the vast literature in political science highlighting the importance of partisan and elite cues for anchoring citizens' political attitudes and behaviors (1, 2, 19), as well as citizens' particular attentiveness to trusted elites during times of crisis (20). The counterfactual state in which partisan elites formed a consensus regarding the public health crisis and sent clear, consistent cues to that effect would almost certainly have led to more consistent changes in behavior on the part of the public and, in turn, a slower spread of the disease. This underscores the urgency with which political leaders must develop a bipartisan consensus consistent with public health recommendations if they intend to effectively respond to the COVID-19 pandemic.
Assuming the costs of war: Events, elites, and american public support for military conflict
Follow the Leader? How Voters Respond to Politicians' Policies and Performance
Expressing the sense of the Senate and House of Representatives regarding the terrorist attacks launched against the United States on
Affect, not ideology: A social identity perspective on polarization
The ideological foundations of affective polarization in the U.S. electorate
Uncivil Agreement: How Politics Became our Identity
What is the "science of science communication
Framing, motivated reasoning, and opinions about emergent technologies
An interactive web-based dashboard to track COVID-19 in real time
Cultural cognition of scientific consensus
Proceedings of the 2013 Conference on Computer Supported Cooperative Work
Twitter use by the U.S. congress
Measuring group differences in high-dimensional choices: Method and application to congressional speech
The Dance of Ideology and Unequal Riches
Party polarization and "conflict extension" in the american electorate
Beyond Ideology: Politics, Principles, and Partisanship in the US Senate
Representational Style in Congress: What Legislators Say and Why it Matters
New coronavirus polling shows americans are responding to the threat unevenly
Shortcuts versus encyclopedias: Information and voting behavior in california insurance reform elections
Anxious Politics: Democratic Citizenship in a Threatening World
Package 'rtweet
Sonnet, Package "rvoteview
Tweets by members of congress tell the story of an escalating covid-19 crisis
Multinomial inverse regression for text analysis
Permutation importance: A corrected feature importance measure
led data visualization. All authors contributed to the interpretation of results and the writing of the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data
We would like to thank the Ohio Supercomputer Center for use of the machine and for giving this COVID-19-related research queue priority. We would also like to thank the Ohio State University Political Science workshop and the Lazer Lab at Northeastern University for helpful feedback and suggestions. Funding: S.J.C.
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/6/28/eabc2717/DC1