Fines and progressive ideology promote social distancing
Edoardo Gallo, Darija Halatova, Alastair Langtry
December 22, 2020

Abstract. Governments have used social distancing to stem the spread of COVID-19, but lack evidence on the most effective policy to ensure compliance. We examine the effectiveness of fines and informational messages (nudges) in promoting social distancing in a web-based interactive experiment conducted during the pandemic on a near-representative sample of the US population. Fines promote distancing, but nudges only have a marginal impact. Individuals do more social distancing when they are aware they are a superspreader. Political ideology also has a causal impact: progressives are more likely to practice distancing, and they are marginally more responsive to fines.

The COVID-19 pandemic is causing the most significant global disruption in the post-war period, and it will impact the global economy at least until a vaccine is widely available (Chudik et al., 2020; Eichenbaum et al., 2020; Kissler et al., 2020; Polyakova et al., 2020). In the meantime, governments worldwide are acting to prevent their healthcare systems from being overwhelmed and to reduce the number of deaths (Brauner et al., 2021; Murray, 2020; Verelst et al., 2020). Regulations to minimize social interactions by limiting exposure to others (social distancing) or self-isolating have emerged as a primary policy tool to manage infection levels in the short- and medium-term (Hale et al., 2020; IMF, 2020). Widespread adoption of, and compliance with, social distancing guidelines is unprecedented, and governments have used different policy tools to promote social distancing behavior. Some countries impose heavy fines on anyone found breaching social distancing measures.
For instance, Singapore punishes first-time offenders with a fine of up to 10,000 SGD (approx. 7,000 USD), and repeated violators can be jailed (Singapore Ministry of Health, 2020). At the other end of the spectrum, India and the UK relied heavily on informational messages (nudges hereafter) early on, but then introduced fines after violations turned out to be common (Hunter, 2020; Sibony, 2020). The reality is that the novelty and scale of social distancing policies are uncharted territory, and governments lack robust scientific evidence on the relative effectiveness of different policies.

We conduct an interactive online experiment to investigate the effectiveness of fines and nudges as policy interventions in promoting social distancing, and the role of network position and the contagiousness of the disease. In the fine treatment, subjects pay a fine if they do not practice social distancing. In the nudge treatment, subjects watch a 3-minute video that explains the harm to others of not distancing. To investigate the role of network position, we assign participants to either a 5-node complete network (where everyone is connected to everyone else) or a 5-node star network (where peripheral participants are only connected to a central "superspreader"). Finally, we vary the contagiousness of the disease by assigning participants to either a high or a low contagion treatment.

The subject pool, recruited through Amazon Mechanical Turk (henceforth, MTurk), is a near-representative sample of the US population in terms of age, gender, and geographical location. All sessions took place in May 2020 at the height of the first wave of the COVID-19 pandemic.

Our main finding is that fines significantly increase the level of social distancing by participants, while the impact of nudges is marginal.
This supports the increased use of financial penalties by many governments, rather than just relying on informational messages as in the early stages of the pandemic (Hunter, 2020; Murray, 2020).

Second, network position matters. Superspreaders practice distancing more than both poorly connected peripheral individuals and individuals with the same number of interactions in a homogeneous group. Superspreaders have played a pivotal role in the COVID-19 pandemic, but, to our knowledge, research has been largely limited to the biological dimension (Adam et al., 2020; Lau et al., 2020). There is, however, also a social dimension to being a superspreader: individuals with many social interactions are more likely to spread the virus. Individuals are aware of being a superspreader in terms of social interactions, and our findings show that this leads to behavioral responses that are absent in the biological realm.

Third, political conservatives are less likely to practice distancing. The magnitude of the effect is significant and comparable to the impact of introducing a fine. We use an instrumental variable approach to provide evidence that this relationship is causal. This is consistent with the well-documented partisan divisions that characterize the COVID-19 response in the US (Allcott et al., 2020; Gollwitzer et al., 2020).

Related literature. This study has immediate methodological implications for policymakers considering non-pharmaceutical interventions to contain the COVID-19 pandemic. It also contributes to both the economics and epidemiology literatures. Below, we briefly summarize our contributions to these areas.

Field experiments have become a prominent methodology to test policies because they provide clean evidence of causality in real settings (Duflo, 2020). The COVID-19 pandemic has largely deprived policymakers of the ability to deploy field experiments due to movement restrictions including self-isolation, social distancing, and travel bans.
In the last decade, several studies have run interactive web-based experiments in other disciplines (Rand et al., 2011; Shirado and Christakis, 2017) and, in a more limited fashion, within economics (Gallo and Yan, 2015b; Jackson and Xing, 2014). A primary contribution of our work is to show how interactive web-based experiments are a novel methodology to test policies that complements existing ones and has distinct advantages. The final section of this paper contains further discussion of how interactive web-based experiments can complement and enhance studies based on sociomobility data and surveys.

The paper proposes a game that is novel in the experimental economics literature to investigate social distancing decisions. It is related to the standard public good game because it explores a trade-off between a costless self-interested decision and a costly one that benefits everyone in the group (Ledyard, 1994). The increase in social distancing in the fine intervention is analogous to the increase in contributions when there is a punishment mechanism in a public good game (Fehr and Gächter, 2000). A crucial additional component of the social distancing game is that decisions are affected by stochastic factors in the environment: the contagion process and the partial effectiveness of social distancing. This relates our paper to experimental work on public goods with stochastic elements. In particular, Fudenberg et al. (2012) look at repeated play in a prisoner's dilemma where intended actions are implemented with noise. Further, Charness et al. (2014) investigate how uncertainty about network structure affects subjects' ability to coordinate on efficient outcomes in games of strategic substitutes and complements. Unlike these works, in our social distancing game stochasticity originates from a probabilistic contagion process through which the disease spreads across the network.
In political economy, recent studies have investigated the impact of political ideology on social distancing decisions. For instance, both Gollwitzer et al. (2020) and Allcott et al. (2020) use sociomobility data from smartphones to measure physical distancing. Looking at county-level data, they find that Republicans do significantly less social distancing than Democrats. In contrast, we look at decision-making at the individual level, which allows us to relate individuals' political leanings to their behavior. Our findings are strongly complementary to the existing literature: we also find that conservatives do significantly less distancing. The experimental setting also allows us to use an instrumental variables approach to argue that the effect of political ideology is causal.

The epidemiology literature about social distancing focuses on its macro-level impacts, i.e. how it might affect the evolution of the pandemic. Consequently, studies are mostly simulation-based. For example, Fenichel et al. (2011) explicitly incorporate optimizing behavior by a representative agent into the classic SIR model and use simulations to examine how this affects disease dynamics. More recently, there has been a profusion of studies focused on the COVID-19 pandemic. Kissler et al. (2020) and Chang et al. (2020) both simulate disease dynamics for COVID-19, and explore the effects of a variety of non-pharmaceutical interventions on the predictions. In contrast, we elicit actual behavioral responses to counterfactual policies. Aside from their direct relevance, our findings can also be used to refine simulation-based epidemiological models by informing the choice of behavioral rules.

A further contribution of our work is to show how awareness of being a superspreader affects behavioral responses.
The role of superspreaders has received a lot of attention in the epidemiological literature, but studies are limited to the biological component, where the individual is unaware of being a superspreader (Adam et al., 2020; Lau et al., 2020), and/or empirical studies that correlate network position with health outcomes (Chen et al., 2021). In our work, instead, we show that position in the network causes changes in social distancing behavior, and we can investigate how these behavioral responses vary with policy interventions.

The remainder of this paper is structured as follows. Section 1 describes our experimental design. Section 2 summarizes our methods of data collection and presents the resulting sample. Section 3 presents the results. Finally, Section 4 contains a discussion of our results and the role of interactive online experiments in informing policy.

This section describes the game, the workflow of the experiment, and the treatments.

Game. Figure 1 illustrates the social distancing game, which corresponds to a round of the experiment. At the beginning of a round, participants are randomly allocated to one of the positions in an unweighted and undirected network (top left). The structure of the network is common knowledge. Subjects must simultaneously decide whether to practice social distancing at a known cost c > 0 (top center). After the decisions are made, one and only one participant, patient zero, is randomly picked to contract COVID-19 directly. In the example in Figure 1, patient zero is the participant highlighted in red in the top right panel. There are two benefits to practicing social distancing. First, if the individual is randomly picked to be patient zero, then she becomes infected with 50% probability, rather than for sure. Second, any individual who practices distancing cannot transmit COVID-19 to others or contract it from infected individuals.
In the example in Figure 1, patient zero decided not to practice social distancing, hence she becomes infected. COVID-19 then spreads through the network by contagion from infected individuals to healthy individuals who do not practice social distancing, with a known probability α ∈ (0, 1). The bottom panels of Figure 1 show a possible spread in the example, with three individuals infected and two healthy ones at the end of the contagion process. At the end of the round, healthy individuals receive a benefit of b = 100 points, whereas those infected receive zero benefit. Any individual who chose to practice social distancing pays the cost c = 35 points, irrespective of their final infection status.

Assuming self-interested fully rational agents, the model predicts that the pure strategy Nash equilibrium outcomes will be inefficient for a wide range of parameter values. In fact, given an arbitrary network, it is possible to fully characterize both equilibrium and efficient outcomes. 1 In particular, we show that, depending on the network structure, the equilibrium number of agents practicing social distancing can be below the efficient number. In this model, social distancing is essentially a merit good, so it exhibits positive externalities: welfare loss may therefore occur in the absence of intervention.

Workflow. In the experiment participants first play 20 rounds of the baseline game above in fixed groups of five subjects, with positions on the network being randomly reallocated in each round. Notice that the participants are not informed about the decisions and outcomes of other members of their group at any point during or after the experiment. After the initial 20 rounds, their group is treated with either the fine or the behavioral nudge policy intervention. In the fine treatment, whenever a participant decides not to practice social distancing she receives a fine of f = 15 points, independent of her infection status at the end of the round.
In the nudge treatment, every participant must watch a 3-minute video explaining how failing to practice social distancing may harm others. The same group of 5 participants then plays 20 rounds of the game under one of the policy interventions. Hence we investigate the impact of each policy on social distancing behavior using a within-subjects design, and the relative effectiveness of the two policies with a between-subjects design.

Once participants have completed the interactive part of the experiment, they proceed to the post-experimental questionnaire, with basic demographics questions and a set of knowledge and attitudes questions on a range of topics including the COVID-19 pandemic, social distancing, religion, global warming, ideology, and political affiliation. Finally, participants complete the Bomb Risk Elicitation Task (BRET) (Crosetto and Filippin, 2013) to elicit risk preferences. Subjects are explicitly primed to think about COVID-19 in the instructions by naming the disease and describing its main symptoms according to the guidelines by the Centers for Disease Control and Prevention.

Treatments. The focus of our experiment is on three treatment dimensions. First, we consider two 5-node network architectures for the structure of interactions among participants: (1) a complete network, where everyone is connected to everyone else, and (2) a star network, where one node is connected to all other nodes, which are not connected to each other. Next, we vary the level of contagiousness of COVID-19 to be either low (α = 15%) or high (α = 65%). Finally, we focus on two intervention methods: (1) a fine of f = 15 points for not practicing social distancing, or (2) a behavioral nudge in the form of an informational video highlighting the harm caused to others by not practicing social distancing. Consequently, we have a 2 × 2 × 2 full-factorial design, with a total of 8 treatments.
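The treatment grid is simply the Cartesian product of the three dimensions; a minimal sketch (variable names are ours, not the paper's):

```python
from itertools import product

networks = ("complete", "star")
alphas = (0.15, 0.65)        # low / high contagion probability
policies = ("fine", "nudge")

# 2 x 2 x 2 full-factorial design: every combination is one treatment cell.
treatments = list(product(networks, alphas, policies))
assert len(treatments) == 8
```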
In this section, we summarize our data collection methods and present the resulting dataset. The section also includes a description of the key demographics of the final sample and a convergence analysis of the decision data from the experiment.

Following standard practices in interactive online experiments (Gallo and Yan, 2015b; Suri and Watts, 2011), we first recruit a standing panel of subjects using a short survey on MTurk. During recruitment, we use the 2018 US Census data to generate a representative panel of the adult US population in terms of age, gender, and geographical location (U.S. Census Bureau, 2019). As part of recruitment, we also collect subjects' social preferences using the Social Value Orientation (SVO) task (Murphy et al., 2011). The SVO task classifies individuals into four categories (prosocial, individualistic, competitive, and altruistic), although a typical sample is predominantly composed of individualistic and prosocial individuals only. For each of the experimental sessions, we invite a random representative sample of subjects from our standing panel. 2 Despite the possibility of selection bias in participating in the experimental session, we obtain a near-representative sample of n = 400 participants completing the experiment. To check there are no biases in assigning subjects to treatments, we run a chi-squared test on participants' assignment to treatment (a categorical variable with 8 options) with respect to gender, age category, and geographical location variables. Results indicate that subjects in all eight experimental treatments are not significantly different from each other (two-sided χ², p > 0.05).

The experiment took place between May 4th and 29th, 2020. Each session had between one and five independent groups of five subjects playing simultaneously, with the exact numbers depending on turnout.
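A balance check of this kind can be run with a chi-squared test of independence on the treatment-by-demographic contingency table. The sketch below uses synthetic counts, since the paper's actual tables are not reproduced here:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: 8 treatments (rows) x gender (F, M).
rng = np.random.default_rng(0)
counts = rng.multinomial(50, [0.47, 0.53], size=8)  # ~50 subjects per cell

chi2, p, dof, expected = chi2_contingency(counts)
# With random assignment we expect not to reject independence (p > 0.05).
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```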
The experiment lasted on average 33 minutes, and subjects earned an average of 5.81 USD (including a fixed participation fee of 1 USD). Subjects remained completely anonymous throughout the experiment, and informed consent was obtained from subjects before participation.

All 400 subjects in our sample are residents of the US, 47% are female, and the mean age is just under 44 years. The average subject in our sample has 15.3 years of education, and 70% are employed (either full- or part-time). In terms of religious attitudes, 51% of our participants consider themselves religious, with 39% reporting Christianity as their primary religion. The average subject in the sample is moderately risk-averse, with a CRRA score of 0.53-0.55. With respect to social values, our sample mostly consists of prosocials (58%) and individualists (41.75%), and only one subject is classified as competitive. 3

Overall, the final dataset contains 40 decisions for each of the 400 subjects, for a total of 16,000 observations. Figure 2A shows the evolution of the average individual propensity to social distance, aggregated over the contagiousness and network treatments. The figure shows a downward trend in social distancing behavior in the first rounds. Such behavior is typical of public goods experiments, whereby contributions to the public good decline as the session progresses (Ledyard, 1994). In this paper, we are interested in the limiting outcomes of this convergence process, and therefore we restrict our attention to those rounds where the majority of subjects have settled on some stable strategy. In particular, we say that a participant has converged to a stable strategy by a certain round if she used this strategy for the previous four rounds and, in all subsequent rounds, the number of consecutive deviations from the chosen strategy does not exceed two.
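This criterion is straightforward to operationalize. The sketch below checks only the simplest stable strategy, a constant action (the paper also allows position-dependent strategies on the star network), and interprets "the previous four rounds" as the window ending at the candidate round; the function name is ours.

```python
def converged_round(actions, window=4, max_dev=2):
    """Earliest 1-indexed round by which a constant-action strategy is stable.

    Converged by round t: the same action was played in the `window` rounds
    ending at t, and in all later rounds no run of consecutive deviations
    from that action exceeds `max_dev`.  Returns None if never converged.
    """
    for t in range(window, len(actions) + 1):
        ref = actions[t - 1]
        if any(a != ref for a in actions[t - window:t]):
            continue  # not yet four identical consecutive rounds
        run, stable = 0, True
        for a in actions[t:]:
            run = run + 1 if a != ref else 0
            if run > max_dev:  # three consecutive deviations: not stable
                stable = False
                break
        if stable:
            return t
    return None
```

For example, a player who distances in rounds 1-4, deviates once in round 5, and distances thereafter counts as converged by round 4, while a player who alternates every round never converges under this definition.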
Given the convergence definition above, we find that at least 80% of participants converge to a stable strategy in every treatment after the first 10 rounds. Our main analysis, therefore, focuses on the last 10 rounds of the baseline and intervention in order to control for the time trend due to learning and experimentation by subjects. The results are robust to including all of the data. 5

This section investigates the determinants of social distancing decisions using both non-parametric and parametric methods. It includes a discussion of the impact of political ideology and the instrumental variable approach we use to identify its causal effect.

The non-parametric analysis uses the Mann-Whitney U-test (MW hereafter) (Mann and Whitney, 1947) for unmatched samples and the Wilcoxon Signed-Rank test (WSR hereafter) for matched samples. Samples are aggregated at the group level; they are matched when they are generated by the same groups, and unmatched when making comparisons across groups from different treatments. 6

We use a Random Effects Logit (henceforth, REL) to model the binary choice of whether to practice social distancing. Formally, it is a random utility model with individual-specific random effects:

    y_it = 1 if β·x_it + v_i + ε_it > 0, and y_it = 0 otherwise,

where y_it is the decision to practice social distancing, x_it is the set of controls, v_i is an individual-specific random effect, and ε_it is an error term with a logistic distribution. The model allows an individual-specific propensity to practice social distancing, but assumes that the impact on utility of a variable x is not individual-specific. We cluster errors at the group level.

Our main specification (column 5 in Table 1) includes the following five categories of independent variables x_it: (1) dummy variables for the experimental treatments (fine intervention, nudge intervention, contagion level, and node types), (2) demographics controls, (3) location and institutional controls, (4) preference controls, and (5) ideology controls. P-values throughout this section are presented for a single specification, but, as Table 1 shows, results are robust across all specifications. 7

A preliminary question is whether the policy intervention has an impact on the aggregate level of social distancing. We conduct a Wald test to identify a structural break. This test assumes that there is at most one structural break in the data, but is agnostic as to when it occurs and whether it occurs at all. In the fine intervention, a change in the evolution of social distancing occurs at round 21, immediately after the implementation of the fine (Wald test, p < 0.001 for all data and subdivided by the other treatments). The same break at round 21 occurs in the nudge intervention in the aggregate data (Wald, p < 0.001), but the result is not robust once we subdivide by the other treatments. The introduction of the fine therefore has an immediate impact on social distancing behavior, while the impact of the nudge is less clear-cut.

3 See Online Appendix for a more detailed description of the demographics of the sample.
4 We consider three types of convergence strategies. In both networks, we look at the strategy where the subject always chooses the same action. For the star network, we also consider two extra strategies. In one strategy the participant always chooses the same action when she is in the superspreader position and the complement action when she is peripheral. In the other strategy, she always chooses the same action when she is the superspreader and alternates between the actions when peripheral.
5 Our analysis is robust to using a more/less conservative definition with one or three consecutive deviations allowed. See Online Appendix for a complete description of our robustness checks.
6 All results using these non-parametric tests are robust to using their parametric analogs, i.e. t-test for unpaired and paired samples.

Results.
Fines increase the level of social distancing, and the effect is significant both in the non-parametric (WSR, p = 0.0008) and in the regression (Specification 1, S1 hereafter, p = 0.001) analyses. The increase in social distancing behavior is present both in low (WSR, p = 0.04) and high (WSR, p = 0.003) contagion environments. In the last 10 rounds of the fine intervention, there is an 8% increase in social distancing compared to the last 10 rounds of the pooled baseline, so the effect of the fine is sizeable and long-lasting. In contrast, the impact of the nudge intervention is marginal and not robust to all specifications (WSR, p = 0.008; S1, p = 0.03). Interestingly, narrowing the focus to a specific contagion environment, the nudge significantly increases distancing with high contagion (WSR, p = 0.0003) but has no effect with low contagion (WSR, p = 0.5). Comparing the two policies, the impact of the fine is marginally higher than that of the nudge (MW, p = 0.08). Overall, the increase in social distancing in the last 10 rounds due to the nudge intervention is only 2%.

A crucial driver in the spread of COVID-19 is the presence of superspreaders: individuals who go on to infect a much larger number of others than the average (Adam et al., 2020; Lau et al., 2020). A primary determinant of being a superspreader is biological, something that cannot be varied experimentally and is outside the scope of this study. However, there is also a social element, driven by the wide variations in the number of social interactions across individuals (Jackson and Rogers, 2007). An important difference between biological and social drivers of being a superspreader is that an individual is aware of the latter, and therefore the tendency to social distance may vary with position in the network. Figure 2B illustrates the three positions in our experiment, drawn with node size proportional to the amount of social distancing in that position for each policy treatment.
On the left side is the complete network, in which all participants are in what we dub a close-knit position. On the right side is the star network, with one participant, the superspreader, connected to all others, who are in what we dub a peripheral position. Superspreaders practice more social distancing than peripheral participants, both in the baseline and in the policy interventions (MW and S1, p < 0.0001 for all). Moreover, superspreaders also practice social distancing more than close-knit participants (MW, p = 0.004; S1, p = 0.03) despite having the same number of interactions. This may be because they are aware of their central role in spreading COVID-19 and want to protect the group, or because they realize peripheral participants are less likely to distance due to their isolated position. High contagiousness leads to significantly more social distancing (MW, p = 0.01; S1, p < 0.0001), and the size of the effect is similar to the difference between being in the close-knit and peripheral positions. Social distancing is 10% and 15% higher in the high contagion setting than in the low contagion setting in the baseline and intervention respectively.

Social preferences play an important role in the decision to social distance. First, an analysis of equilibrium decisions with self-interested individuals predicts levels of social distancing that are significantly lower than participants' decisions in all treatments (MW, p < 0.0001). Second, using the SVO task, we classify subjects into individualistic and prosocial. As Figure 3A illustrates, prosocial participants are 20% more likely to choose distancing (S2, p < 0.0001).

In terms of sociodemographics, females (S2, p = 0.004) and older individuals (S2, p < 0.0001) are more likely to distance. In the data, being female translates to a 12.6% increase in the probability of distancing, and there are no significant interactions between gender and the effect of each policy intervention.
Social distancing is increasing with age: an average 60-year-old is 40.1% more likely to distance than an otherwise identical 20-year-old. Interestingly, older subjects are relatively more responsive to the nudge (S3, p = 0.02) and relatively less responsive to the fine (S3, p < 0.0001). Whites are less likely to distance (S2, p = 0.04), but the effect is not robust. Education, religion, and employment status are not associated with distancing behavior. These associations between sociodemographic characteristics and distancing are broadly consistent with evidence from sociomobility (Baradaran Motie and Biolsi, 2020) and survey-based data (Galasso et al., 2020; Masters et al., 2020; Papageorge et al., 2020; Pedersen and Favero, 2020).

We elicit participants' risk preferences using the BRET. As expected, risk-seeking individuals are less likely to practice social distancing (S2, p = 0.001), and the effect is about half the size of that of being prosocial in the SVO task. The post-experimental survey includes a question asking why subjects choose to stay at home: participants who list "protecting others" as one of the stated reasons are more likely to social distance (S2, p = 0.007), and the effect is similar in size to being risk-averse.

There is no association between distancing and the geographical evolution of the pandemic as captured by new cases, cumulative cases, and total deaths at the state level on the day of participation in the experiment. Similarly, institutional decisions such as the timing of stay-at-home orders in different states are unrelated to social distancing decisions. 8

Ideology. We find that self-reported Democrats are significantly more likely to practice distancing than Republicans (S4, p = 0.02). However, this binary classification is a very coarse measure.
To obtain a more fine-grained picture of participants' political leanings, we construct an ideology index, with a score for each subject based on responses to questions about (1) support for President Trump's handling of the COVID-19 pandemic, (2) support for universal healthcare, and (3) belief that social distancing measures impose unjustified economic costs. Questions are on a 5-point Likert scale, so we obtain a 0-12 index, with 0 and 12 indicating extremely progressive and extremely conservative participants respectively. Figure 3B shows that the probability of practicing distancing decreases the more conservative a participant is: an increase in the ideology index from 1 to 5 (25th to 75th percentile among our subjects) corresponds to a 14.6% decrease in the probability of distancing. The ideology index is a highly significant correlate of social distancing decisions (S5, p < 0.0001), in line with recent sociomobility and survey-based studies in the US (Allcott et al., 2020; Barrios and Hochberg, 2020; Simonov et al., 2020).

Our experimental design exogenously varies the policy intervention, contagiousness, and the network, allowing us to make causal inferences on these dimensions. The same is obviously not possible with ideology. We investigate the causal effect of ideology with an instrumental variable approach (IV hereafter), using a measure of participants' skepticism of global warming as the instrument. As a partisan issue in the US, attitudes toward global warming are strongly correlated with political ideology (Hornsey et al., 2016; Pew Research Center, 2019). The exclusion restriction our IV identification relies on is that global warming attitudes do not affect social distancing decisions separately from their association with ideology. Specification 6 (S6) in Table 1 is identical to our main specification (S5), except that it uses an instrument for ideology.
The instrument is an index of climate change attitudes, constructed from subjects' beliefs that global warming is (1) happening, (2) caused mostly by humans, and (3) affecting weather in the USA. 9 Political ideology is a significant causal determinant of social distancing, with conservatives less likely to practice distancing (S6, p = 0.04). Figure 3C illustrates an interesting further interaction between political ideology and the type of policy intervention. While the nudge intervention has no differential impact on subjects with different ideologies, conservative participants are marginally less responsive to the fine (S6, p = 0.08). Note that using the ideology index itself (as in specification 5) instead of the IV approach (as in specification 6) has very little effect on the results.

Interpreting the results. The nonlinearity of the REL model makes interpreting the point estimates in Table 1 difficult. The partial effect of any given variable depends both on the initial value of that variable and on the values of all other variables. For a given individual, the partial effect of changing a variable from x to x′ is not the same as that of changing it from x′ to x′′. As a consequence, the estimated partial effect is different for every person. To meaningfully compare the effect of different factors, we compute Average Partial Effects (APE) calibrated on the characteristics and/or decisions of the subjects in our experiment. To do this, we calculate the partial effect for each participant in the experiment, and then take the simple mean over these individual-specific partial effects. For the individual-specific partial effect, we compare the estimated probability of distancing for each subject at two different values of the variable of interest while keeping all other variables at their observed values for that individual. For binary variables, we compare 0 and 1. For continuous variables, we compare the 25th and 75th percentiles (based on the experimental data).
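Under a plain logit approximation (ignoring the random effect for simplicity), this procedure can be sketched as follows; the design matrix, coefficients, and function names here are hypothetical, not the paper's estimates:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ape_binary(X, beta, j):
    """APE of binary covariate j in a logit model: for each subject, compare
    the predicted distancing probability with x_j = 1 versus x_j = 0, holding
    all other covariates at their observed values, then average."""
    X1, X0 = X.copy(), X.copy()
    X1[:, j], X0[:, j] = 1.0, 0.0
    effects = sigmoid(X1 @ beta) - sigmoid(X0 @ beta)
    return effects.mean(), np.percentile(effects, [25, 50, 75])

# Hypothetical data: intercept, fine dummy, standardized age, for 400 subjects.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(400),
                     rng.integers(0, 2, 400).astype(float),
                     rng.standard_normal(400)])
beta = np.array([-0.2, 0.5, 0.9])   # illustrative coefficients only

ape, (q25, q50, q75) = ape_binary(X, beta, j=1)
print(f"APE of the fine dummy: {ape:.3f} (25th-75th pct: {q25:.3f}-{q75:.3f})")
```

Because the model is nonlinear, the individual effects differ across subjects with different covariates, which is exactly the heterogeneity that the percentile bars in Figure 4 summarize.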
Figure 4 shows the Average Partial Effects (APEs) for all variables that are statistically significant in our main specification, as well as their 25th, 50th, and 75th percentiles. Blue bars indicate the impact of our treatment variables. A high contagion setting and being in the peripheral network position have the largest impacts, at 18% and −22% respectively. The fine intervention and being in the superspreader position have a moderate impact of about 8%. The smallest impact is the nudge intervention, with only a 4% effect. Yellow bars indicate sociodemographic correlates. Increasing age from the 25th to the 75th percentile in our data has a 23% effect on distancing. Gender and race have a moderate effect of around 10%. Orange bars display the effect of correlates related to preferences. Being prosocial is associated with a large 20% effect on distancing, while risk preferences and the desire to protect others have a moderate impact of about 10%. Finally, the purple bar shows that political leanings have a moderate 12% effect, which we argue is causal in our analysis. Omitted from Figure 4 are a set of demographic, institutional, and geographic factors that are not statistically significant. A subject's level of education, religion, and labor force participation are all unrelated to their social distancing decisions. Also insignificant are their geographic location (as measured by Federal Region), the number of COVID-19 cases and deaths in their state, and population density. Finally, Figure 4 clearly shows that the partial effect of any given variable is highly heterogeneous across subjects. In fact, the range in partial effects from the 25th to 75th percentiles exceeds the mean in many cases.
9 We used the questions from .
The sudden disruption brought about by the COVID-19 pandemic demanded a quick response from policymakers who had to implement novel social distancing policies whose success determined the fate of tens of thousands of lives.
Our work provides some of the first experimental evidence on how to enforce these policies and how their efficacy varies across social and individual characteristics. Additionally, it shows how the novel methodology of interactive web-based experiments can play a crucial role in informing policymakers, and complement data obtained using sociomobility and survey-based studies. Our first main finding is that fines are effective at increasing social distancing, while the impact of nudges is marginal. A limitation of our design is that we examine one specific form of nudge: a video that emphasizes the harms to others of failing to practice social distancing. However, the efficacy of nudges is sensitive to content, media, and cultural context. For instance, Banerjee et al. (2020) show that nudges in the form of 2.5-minute information videos are effective in encouraging social distancing in the Indian context. Despite their low impact in our study, future research should continue exploring the efficacy of nudges in promoting social distancing because their ease of deployment makes them a particularly attractive tool for policymakers. The second main finding is that conservative-leaning individuals are less likely to practice social distancing in the US. While this tendency has been documented in other empirical studies (Allcott et al., 2020; Gollwitzer et al., 2020), we provide evidence of a causal relationship using an IV approach. An open question is whether this relationship is peculiar to the US context or extends to other countries with a less stark partisan divide. In terms of methodology, this study highlights the important role interactive web-based experiments can play in providing timely information to policymakers to test the effectiveness of policies.
Figure 4: Average Partial Effects (APE) for variables that are significant in the REL main specification. Bar boundaries indicate the 25th and 75th percentiles, the vertical line inside each bar indicates the 50th percentile, and black dots show the mean. Blue bars indicate treatment variables. Orange and yellow bars indicate effects from social preferences and demographics respectively. The purple bar shows the ideology variable. Education, religion, labor force participation, case numbers, and geography controls are not shown because they are not significant.
Experiments are the gold standard to test the causal impact of policy interventions, but lab and/or field experiments were severely constrained during the COVID-19 pandemic, depriving researchers of the standard tools to identify causal effects (Haushofer and Metcalf, 2020). Web-based experiments, however, were unaffected. Moreover, they can be deployed quickly, scaled up at minimal cost, and easily reach a diverse sample and/or specific populations of particular interest. We believe they should be part of the standard policymaker toolkit to test the impact of novel policies before implementing them for the general population. Evidence regarding behavioral responses to social distancing policies has so far been based mostly on sociomobility data or on survey studies relying on self-reported claims about hypothetical behavior. Aside from a clean identification of causality, web-based interactive experiments complement these methodologies and present some further distinct advantages. Mobility data provides detailed information on behavior in real-life settings that can verify the external validity of web-based experiments. For instance, recent studies validate our finding that conservative-leaning individuals practice less social distancing (Allcott et al., 2020; Barrios and Hochberg, 2020). However, it is usually stripped of any personal information, including sociodemographics, so that inferences about the effects of these variables can only be made for geographic units rather than individuals.
Moreover, sociomobility data is unable to shed light on the effectiveness of counterfactual policies, while policymakers frequently need information in advance of implementation. Surveys are a standard methodology for fast, large-scale data collection, but our study raises questions about their accuracy in gauging actual behavioral responses. In our post-experimental survey, we ask participants to estimate how effective fines and nudges are in promoting social distancing. Participants significantly overestimate the efficacy of nudges compared to their actual choices in the experiment, independent of whether they are in the fine (MW, p < 0.0001) or nudge (WSR, p = 0.001) treatment. This difference between survey-based self-reported intentions and actual behavior is consistent with recent evidence on the short-lived efficacy of nudges (Brandon et al., 2017). In other words, picking a policy based on self-reported measures may lead policymakers to opt for the wrong solution. These results caution against excessive reliance on survey-based self-reported data to gauge behavioral responses.
IMF (2020). Policy Responses to COVID-19. Retrieved from https://www.imf.org/en/Topics/imf-and-covid19/Policy-Responses-to-COVID-19.
This Appendix contains additional information. Section A.1 presents the theoretical framework and derives the hypotheses that we investigate in the experiment. Section A.2 summarizes our experimental design along with implementation and describes the resulting dataset. Section A.3 contains details of the analysis and is followed by Section A.4, which describes our robustness checks. In this section, we lay down the theoretical model that underpins our experiments and summarize the hypotheses that we test. The model is similar to those found in the theoretical economics literature on self-protection against a contagious process on fixed networks (Acemoglu et al., 2016; Cerdeiro et al., 2017). Consider a set of N = 1, 2, . . .
, n risk-neutral agents on an unweighted and undirected network G. If a link between agents i and j is present, then G_ij = G_ji = 1. Otherwise G_ij = G_ji = 0. Each agent i simultaneously chooses whether to practice social distancing, at a cost c > 0 to herself. After agents make this choice, exactly one agent is chosen uniformly at random to be exposed to COVID-19. Call this agent patient zero. If patient zero is practicing social distancing, then she becomes infected with probability γ < 1, and cannot pass COVID-19 on to anyone else. Otherwise, she becomes infected with probability 1 and passes it on to each of her neighbors who are not practicing social distancing, independently with probability α ∈ [0, 1]. Any other agent j who becomes infected with COVID-19 passes it on to each of her own neighbors who are not practicing social distancing, again independently with probability α. An agent does not pass it on to any neighbor who is practicing social distancing. Therefore, an agent who practices social distancing may only become infected if she is patient zero. At the end of the game, an agent receives a benefit b > c if she is not infected, and 0 otherwise. Additionally, an agent pays a fine f ≥ 0 if she did not practice social distancing - regardless of whether she infected anyone else or became infected herself. Note that setting f = 0 means no fine is present. When a subset of agents S ⊆ N practice social distancing, the probability that agent i becomes infected is p_{i|S}. This depends on the network structure and on who practices social distancing. Notice that if i ∈ S then p_{i|S} = γ/n, since she can only become infected if she is patient zero. We assume that all agents are self-interested - they care only about themselves and so ignore any effects their choices have on others. Then agent i receives an expected payoff, π_i, of:
π_i = (1 − p_{i|S}) b − c · 1[i ∈ S] − f · 1[i ∉ S]. (2)
Parameterization.
We now set out some of the parameter values and the networks we use in the experiment. Doing so allows us to state the hypotheses clearly. We have two different networks: a 5-node complete network, and a 5-node star network. In the complete network, each agent is connected to every other agent. In the star network, one agent is connected to every other agent, and there is no other link in the network. We call agents in the complete network "close-knit", agents at the center of the star "superspreaders", and agents on the arms of the star "peripheral". Denote these agents C, S, and P respectively. We set the cost c of practicing social distancing equal to 35 points, the benefit b of avoiding infection equal to 100 points, and the probability γ of patient zero becoming infected if she is practicing social distancing equal to 0.5. This allows us to investigate how the Nash equilibrium predictions and efficiency vary as the rate of contagion α is varied on [0, 1]. Figure A1 shows the equilibrium and socially efficient sets in the complete network and the star network for all values of α. A set is socially efficient if it maximizes the sum of the expected payoffs. In the experiment, we focus on two different levels of contagion: α = 0.15 (low contagion), and α = 0.65 (high contagion). Below, we formulate a set of hypotheses that are the focus of our analysis. First, notice that fines for failing to practice social distancing make distancing relatively more attractive, and so we expect that they weakly increase the amount of social distancing in society. Note that due to the non-uniqueness of equilibria, it is possible that different agents practice social distancing when fines are present. Hypothesis 1. A fine f > 0 for agents who do not practice social distancing weakly increases the number of agents practicing social distancing. The model assumes that agents are self-interested and fully rational. 
Therefore, while fines can affect agents' behavior, a behavioral 'nudge' should have no impact -it neither provides new information to agents nor does it change their preferences. Hypothesis 2. A behavioral 'nudge', in the form of an informational video, has no impact on the social distancing decisions of any agent. An obvious consequence of these two hypotheses is that a fine has a weakly greater effect on social distancing decisions than a nudge. Hypothesis 3. A fine f > 0 increases the amount of social distancing weakly more than a behavioral 'nudge'. The model predicts that the network position also plays an important role in determining social distancing decisions. Even though close-knit agents and superspreaders have the same number of links -and close-knit agents actually have a higher probability of becoming infected if nobody practices social distancing -superspreaders practice weakly more social distancing. This is because superspreaders are surrounded by peripheral agents who, with only one link, have a low incentive to practice social distancing themselves. This leaves it all up to the superspreader to protect themselves. Both superspreaders and close-knit agents practice more social distancing than peripheral agents. They have more links, and so a higher probability of becoming infected (if nobody else practices distancing). Hypothesis 4. Superspreaders practice more social distancing than close-knit agents, and in turn, close-knit agents practice more social distancing than peripheral agents. We can see this from Figure A1 , which sets out the equilibrium and socially efficient outcomes. In the low contagion setting, no agents practice social distancing in equilibrium (in either the star or the complete network). In the high contagion setting, a superspreader always practices distancing, a (randomly chosen) close-knit agent does so three-fifths of the time, while a peripheral agent never does so. 
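The equilibrium claims for the star network can be checked numerically. The sketch below is our own illustration, not the authors' code: it computes infection probabilities exactly by enumerating which links transmit (each link between two non-distancing agents transmits independently with probability α), and then compares expected payoffs under the paper's parameterization (b = 100, c = 35, γ = 0.5, α = 0.65).

```python
from itertools import product

def infection_prob(n, edges, distancing, i, alpha, gamma=0.5):
    """Exact P(agent i becomes infected). Patient zero is uniform on the n
    agents; a distancing patient zero is infected w.p. gamma and cannot
    transmit; transmission occurs only along links between two
    non-distancing agents, each open independently w.p. alpha."""
    if i in distancing:
        return gamma / n  # infected only as patient zero
    nd = set(range(n)) - set(distancing)
    live = [e for e in edges if e[0] in nd and e[1] in nd]
    p = 0.0
    for states in product((True, False), repeat=len(live)):
        w = 1.0
        opened = []
        for e, s in zip(live, states):
            w *= alpha if s else 1.0 - alpha
            if s:
                opened.append(e)
        comp = {i}  # connected component of i under the open links
        grew = True
        while grew:
            grew = False
            for a, b in opened:
                if (a in comp) != (b in comp):
                    comp.update((a, b))
                    grew = True
        # i is infected iff patient zero is a non-distancer in i's component
        p += w * sum(z in comp for z in nd) / n
    return p

b, c, gamma, alpha = 100, 35, 0.5, 0.65
star = [(0, 1), (0, 2), (0, 3), (0, 4)]  # agent 0 is the superspreader

pay_dist = b * (1 - gamma / 5) - c  # 55 points for any distancing agent
pay_hub_dev = b * (1 - infection_prob(5, star, set(), 0, alpha))
pay_per_free = b * (1 - infection_prob(5, star, {0}, 1, alpha))
```

Here pay_dist = 55 exceeds pay_hub_dev = 28 (the hub's payoff if nobody distances, since her infection probability is then 0.2(1 + 4α) = 0.72), while a peripheral agent earns pay_per_free = 80 by not distancing against 55 from distancing. This confirms that, in the high contagion setting, only the superspreader distances.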
Further, agents are more likely to practice social distancing in the high contagion setting than in the low contagion setting. This is because an agent faces a greater chance of becoming infected by her neighbors. Hypothesis 5. Conditional on her position in the network, an agent practices more social distancing in the high contagion setting than in the low contagion setting. Regardless of their position in the network and the level of contagion, self-interested agents do not account for the benefits they provide to others - in the form of reduced infection risk - when they practice social distancing. Therefore, we expect that agents practice less social distancing than would be optimal. Hypothesis 6 confirms this intuition for the complete network, while Hypothesis 7 shows that there can be too little or the right amount of social distancing (relative to the social optimum) in the star. Moreover, for certain other values of α, too much social distancing is also possible, due to the presence of multiple equilibria in the star network. Hypothesis 6. In the complete network, the Nash equilibrium involves fewer agents practicing social distancing than the social optimum. Proof. First, let us look at the Nash equilibria. Replace the probability of becoming patient zero with its actual value in the 5-node network - i.e. 1/n = 0.2. Notice that if the expected payoff of practicing social distancing is weakly greater than that of not doing so when everybody else practices social distancing - that is, if c ≤ 0.2(1 − γ)b - then everybody should opt for social distancing. Conversely, if c ≥ (p_{i|∅} − 0.2γ)b, where p_{i|∅} is the probability that the close-knit agent becomes infected given that nobody practices social distancing, then the unique equilibrium is such that nobody practices distancing. Between these two boundaries on the cost of social distancing, given a fixed parameterization, the number of agents practicing social distancing in a Nash equilibrium rises monotonically.
For example, when 0.2(1 − γ)b ≤ c ≤ (p_{i|3} − 0.2γ)b, where p_{i|3} is the probability that the close-knit agent becomes infected given that three other close-knit agents practice distancing, four agents practice social distancing in the unique equilibrium. It is easy to show that nobody wants to deviate from the chosen strategy if the above conditions hold. The conditions which ensure that two or one agents practice social distancing in a Nash equilibrium are determined analogously. Let us now turn to efficiency. First, note that if c ≤ 0.2(1 − γ)b then the expected total social payoff is maximized when everybody practices social distancing. Next, there is a range of parameters for which efficiency demands that exactly four agents practice social distancing. This requires that c > 0.2(1 − γ)b together with a corresponding upper bound on c. Other intervals can be established analogously. The top panel of Figure A1 shows how the equilibrium and efficient sets of agents who practice social distancing in a complete network change when the rate of contagion α is varied. All other parameters are fixed according to our parameterization. Observe that in the low contagion environment (α = 0.15) nobody practices social distancing in the unique Nash equilibrium, yet efficiency demands that exactly two agents undertake social distancing. Similarly, in high contagion (α = 0.65), three agents practice social distancing in equilibrium while the efficient outcome is four agents doing so. Next, Hypothesis 7 shows that, depending on the parameterization, the level of social distancing practiced in a Nash equilibrium in the star network can be below, coincide with, or exceed the efficient outcome. Hypothesis 7. In the star network: (1) in the low contagiousness environment, the Nash equilibrium involves fewer agents practicing social distancing than the social optimum; (2) in the high contagiousness environment, the Nash equilibrium and the social optimum coincide (with the hub agent practicing social distancing). Proof. First, let us look at the Nash equilibria.
Analogously to the proof for the complete network, if c ≤ 0.2(1 − γ)b, then there is a unique equilibrium in which all agents practice social distancing. This is because everybody should practice social distancing when the expected payoff of doing it is weakly greater than that of not doing so even when everybody else practices social distancing. On the other hand, if the expected payoff of practicing social distancing for both the superspreader and the peripheral agent is lower than that of not doing so, even if everybody else does not practice distancing, then nobody should opt for social distancing in equilibrium. Using the notation, that means that there is a unique equilibrium where nobody practices social distancing if c ≥ (p_{S|∅} − 0.2γ)b and c ≥ (p_{P|∅} − 0.2γ)b, where p_{S|∅} (p_{P|∅}) is the probability that the superspreader (a peripheral agent) becomes infected if nobody practices social distancing. It follows trivially then that for all the rest of the parameter space, there is an equilibrium such that only the superspreader practices social distancing, while all peripheral agents do not. This is, however, not always the unique equilibrium. In fact, for some values of the parameters, there could also be an equilibrium where the superspreader does not undertake social distancing but (some of the) peripheral agents do. For example, in our parameterization, for α ∈ (0.72, 1] there is an equilibrium in which three peripheral agents practice social distancing. To see this, note that none of the three peripheral agents who practice social distancing wants to deviate because c ≤ (p_{P|2P} − 0.2γ)b, where p_{P|2P} = 0.2 + 0.2α + 0.2α² is the probability that a peripheral agent becomes infected if two other peripheral agents practice social distancing. Moreover, neither the superspreader nor the fourth peripheral agent wants to start practicing social distancing. Now to efficiency. Observe that, depending on the parameterization, efficiency permits only one of three possible outcomes. The first one is the case where everyone undertakes social distancing.
This is efficient if the total expected payoff of everyone practicing social distancing exceeds that of the case where all but one agent practice distancing. Mathematically, this means that c ≤ 0.2(1 − γ)b. Next, note that for some values of the parameters, efficiency demands that nobody undertakes social distancing. For this to be true, it must be that the total expected payoff when exactly one agent practices social distancing is lower than when nobody does distancing. It must therefore be true that c ≥ 0.2(1 − γ + 8α + 12α²)b when the sole distancer is the superspreader, and c ≥ 0.2(1 − γ + 2α + 6α²)b when she is a peripheral agent. Note that, for α ∈ (0, 1), the second condition is automatically satisfied when the first one holds. Finally, for 0.2(1 − γ)b ≤ c < 0.2(1 + 8α + 12α² − γ)b, efficiency requires that the superspreader practices social distancing but not the peripherals. This follows trivially since the benefit created when the superspreader practices social distancing is larger than that created by a peripheral agent, yet the costs are the same. Figure A1 shows how the equilibrium and efficient sets of agents who practice social distancing change when the rate of contagion α is varied. All other parameters are fixed according to our calibration. Now to the specific values chosen in our parameterization. Observe that in the low contagion environment (α = 0.15) nobody practices social distancing in the unique Nash equilibrium, yet efficiency demands that the superspreader opts for distancing. Conversely, in high contagion (α = 0.65), equilibrium and efficiency coincide with only the superspreader practicing social distancing. One can also show that the equilibrium for α = 0.65 is unique. When an agent practices social distancing, she gets a fixed payoff in all cases except when she is patient zero and becomes infected. This latter scenario occurs with probability γ/n. When she does not practice social distancing, her payoff depends on other agents' behavior, and on how COVID-19 randomly spreads through the network.
The payoff is more variable when she does not practice social distancing. Therefore, a risk-averse agent will practice weakly more social distancing than the model predicts. There are of course many ways of modeling risk aversion, but the qualitative impact of higher risk aversion on social distancing decisions does not depend on the particular modeling choices. It will always weakly increase the amount of social distancing an agent practices. Hypothesis 8. The amount of social distancing an agent practices is weakly increasing in her level of risk aversion. An agent might also care about other agents' welfare (as captured by their payoffs). That is, they might have altruistic preferences. Choosing to practice social distancing has a spillover effect -it reduces the probability that some other agent becomes infected with COVID-19, at no cost to that other agent. The spillover can only be positive. If agent i chooses to practice social distancing, this never imposes a cost on anyone else and can provide a benefit to some others. Therefore, altruistic preferences can only (weakly) increase the amount of social distancing that an agent does. This does not depend on how one chooses to model altruistic preferences. Hypothesis 9. An agent with other-regarding (i.e. altruistic) preferences practices weakly more social distancing than in the Nash equilibrium. This section explains the design and implementation of the experiment and describes the data. Section A.2.1 describes the design of the experiment and presents the flow of the experimental session. Next, in Section A.2.2, we summarize the details of our implementation, including technical aspects and sampling procedures. Finally, in Section A.2.3 we summarize our dataset. Calibration. Throughout the experiment, we set the cost of social distancing at c = 35 points, and the benefit of not getting infected at b = 100 points. 
We also set the probability of patient zero becoming infected if she practices social distancing at γ = 0.5. Further, our experiment has three dimensions which are varied in our treatments. These are: • two 5-node networks: (1) a complete network, where everyone is connected to everyone else, and (2) a star network, where one node is connected to all other nodes which are not connected between themselves; • two levels of the rate of contagion: (1) low (α = 15%), and (2) high (α = 65%). • two types of intervention: (1) a fine for not practicing social distancing f = 15 points, and (2) a behavioral nudge highlighting the harm caused to others by not practicing social distancing. Consequently, we have a 2 × 2 × 2 full-factorial design, with a total of 8 treatments. We run each treatment 10 times, resulting in a final sample of 400 participants. Further details on the resulting sample are in Section A.2.3. In the remainder of this section, we set out the flow of the experiment. For expository purposes, we focus on the star network, 65% rate of contagion treatment. We also explain both fine and nudge interventions. Once a participant joins the session, she enters her ID and starts working on instructions for the first part of the experiment, which we refer to as the Baseline. To qualify for the experiment she must read the instructions and pass an understanding quiz. The instructions together with the quiz take an average of 8.8 minutes (s.d. 3.3 minutes) to complete. As part of the instructions, the participant is explicitly primed to think about COVID-19. In particular, we explicitly describe the main symptoms of COVID-19 using the guidelines by Centers for Disease Control and Prevention. Full instructions are in Section B.1. Upon completing the quiz, the subject joins a waiting room where she waits to be matched with four others. Rather than being allocated into groups on a first-come-first-served basis, group allocation is randomized. 
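The full-factorial structure described above can be enumerated directly. The sketch below is our own illustration (with our own labels, not the experiment's code); it recovers the 8 design cells and the final sample size.

```python
from itertools import product

networks = ["complete", "star"]
contagion = [0.15, 0.65]          # low and high rates of contagion alpha
intervention = ["fine", "nudge"]

# one cell per combination of network, contagion level, and intervention
treatments = list(product(networks, contagion, intervention))
groups_per_cell, group_size = 10, 5
n_participants = len(treatments) * groups_per_cell * group_size
```

This recovers the 2 × 2 × 2 = 8 treatments and the final sample of 8 × 10 × 5 = 400 participants.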
Once a group is formed, participants proceed to the Baseline where they play 20 rounds of the same game. Figure A2 presents the flow of a typical round. At the beginning of the round, participants are randomly allocated to the five positions on the network (top-left corner of Figure A2 ) which is fixed throughout the whole experiment. In this example, the network is a star. The participant is asked to privately make her social distancing decision (top-center part of Figure A2 ). Practicing social distancing costs 35 points, while not practicing social distancing is free. The participant has 20 seconds to make her decision, otherwise, she receives a penalty of 200 points and her decision is automatically recorded as a "No". A participant who fails to submit her decisions in three consecutive rounds is disqualified from the experiment and does not receive any compensation. Once everyone in the group has made their decisions, the computer randomly selects exactly one participant to be patient zero (top-right corner of Figure A2 ). If patient zero is not practicing social distancing she becomes infected with COVID-19 with certainty. If patient zero is practicing social distancing, however, she becomes infected with probability 50%. In the example in Figure A2 , patient zero decided not to practice social distancing, and therefore she becomes infected. As explained in Section A.1, since patient zero is infected and does not practice social distancing, she may pass COVID-19 to other participants. In our example, patient zero only interacts with the superspreader -who is not practicing social distancing. Therefore, the superspreader becomes infected with probability α = 0.65. In our example, contagion is successful, and the superspreader becomes infected (bottom-right corner of Figure A2 ). 
Next, the superspreader can potentially infect two out of three other participants in peripheral positions, since one of them is practicing social distancing and so is protected from contagion. In our example, one of the participants in the peripheral position becomes infected, which happens with probability 65%, and the other remains healthy, which occurs with probability 35%, as shown in the bottom-center part of Figure A2 . At the end of the contagion process, payoffs for the round are determined based on participants' social distancing decisions and infection status. Healthy participants receive a bonus of b = 100 points while infected ones get 0 points, minus the cost of social distancing c = 35 points if applicable. In our example, three participants end up with a payoff of zero points, one gets 65 points, and one 100 points (bottom left corner of Figure A2 ). At the end of each round, a participant is informed of her health outcome and the number of points earned and is also reminded about her position within the network and social distancing choice. This information is presented for the last five rounds, and the participant has 15 seconds to review this information. Notice that the participant is not informed about decisions and outcomes of other members of her group at any point during the experiment. Once participants complete the Baseline, they are taken to the instructions for the second part of the experiment, which we refer to as Intervention. The instructions for the Intervention differ depending on the treatment. For the fine treatment, participants are shown textual instructions explaining how the fine works. If instead, the intervention is the nudge, participants are shown a three-minute video highlighting the harm to others caused by not practicing social distancing. Further details and instructions for the two types of intervention are in Section B.1. Upon completing the instructions, participants must pass a one-question understanding quiz. 
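The round payoffs just described can be summarized in a small function (an illustrative sketch, not the experiment's code; the fine from the fine treatment enters as an optional argument that is 0 in the Baseline).

```python
def round_payoff(healthy, distanced, fine=0, b=100, c=35):
    """Points for one round: b if healthy at the end of the contagion
    process, 0 if infected; distancing costs c, while non-distancers
    pay the fine (0 in Baseline, 15 in the fine Intervention)."""
    points = b if healthy else 0
    points -= c if distanced else fine
    return points
```

This reproduces the example in the text: infected non-distancers earn 0 points, a healthy distancer earns 65, and a healthy non-distancer earns 100. With the fine, a non-distancer earns either 85 or −15 depending on her health outcome.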
Once all five participants pass the quiz, they proceed to the Intervention. In the Intervention, participants play 20 rounds of the same game as in the Baseline. In particular, the network and the rate of contagion remain unchanged. For the nudge treatment, payments also remain unchanged. In the fine treatment, participants who decide not to practice social distancing receive a fine of f = 15 points, irrespective of their health outcome. That is, a participant who does not practice social distancing in a round can receive either 85 points or -15 points depending on her health outcome. Once participants have completed the Intervention, the interactive part of the experiment is over, and they proceed to the Post-experimental Questionnaire. Here, we ask the same set of demographics questions as in the recruitment survey. In addition, we ask a set of knowledge and attitudes questions on a range of topics including the COVID-19 pandemic, social distancing, religion, global warming, ideology, and political affiliation. Finally, a participant completes a Bonus Task which is the Bomb Risk Elicitation Task (BRET) (Crosetto and Filippin, 2013; Holzmeister and Pfurtscheller, 2016) to elicit risk preferences. The participant is presented with 100 boxes arranged in a 10 × 10 matrix. Of these, one randomly chosen box contains a bomb, but the location of the bomb is unknown. The participant is asked to choose how many boxes she wants to collect. Boxes are collected from the top-left corner of the matrix, left to right, at a rate of one box per second. The participant must decide when to stop collecting boxes. The value of each box is 2 cents. After she is done with collecting boxes, the subject opens them. If one of her collected boxes contains the bomb, it explodes and reduces the participant's earnings for the Bonus Task to zero. If the bomb is not in the collected boxes, the participant earns 2 cents for each collected box. 
Assuming a power utility function, 10 a risk-neutral participant opens 50 boxes. If a participant opens fewer than 50 boxes, she is considered risk-averse, and if she opens more than 50 boxes she is considered risk-seeking. Instructions for the BRET together with screenshots of the interface are in Section B.3. Once the participant completes the Bonus Task, she is taken to the Payment Page, where she can see her earnings for the experiment. She can also browse her history of play in the Baseline and Intervention. All participants are paid a fixed fee of 1 USD. Additionally, participants earn a performance-based bonus for the interactive part of the experiment, as well as the Bonus Task. In particular, to reduce wealth effects (Charness et al., 2016), subjects are paid for 4 randomly chosen rounds in the Baseline and Intervention. We conducted the experiment on Amazon Mechanical Turk (henceforth, MTurk; www.mturk.com) between May 4th and May 29th, 2020. The experiment involves two separate processes: recruitment and the main experiment. We used Qualtrics (www.qualtrics.com) for recruitment, and the main experiment was programmed in o-Tree (www.otree.org) (Chen et al., 2016) with a server deployed on Heroku (www.heroku.com). For recruitment, we set qualifications on MTurk to make our survey visible to US residents who have completed at least 500 Human Intelligence Tasks (HITs) on the platform and have an approval rate of at least 96%. We also utilize location and age qualifications provided by MTurk to get a sample that is broadly representative of the US population. The recruitment survey takes an average of 5 minutes (s.d. 3 minutes) to complete and pays a fixed reward of 1 USD. As part of the survey, we collect information about participants' age, gender, experience with decision-making experiments, and self-reported attitudes to risk (Weber et al., 2002).
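The BRET benchmarks can be verified directly. The sketch below assumes the power utility u(x) = x^r from footnote 10 and a bomb uniform on the 100 boxes; it is our illustration, not the authors' code.

```python
def bret_expected_utility(k, r=1.0):
    """Expected utility of stopping after k boxes: the bomb is uniform
    over 100 boxes, so the subject keeps her 2k cents with probability
    1 - k/100 and earns nothing otherwise; utility is u(x) = x**r."""
    return (1 - k / 100) * (2 * k) ** r

best_neutral = max(range(101), key=bret_expected_utility)  # r = 1
best_averse = max(range(101), key=lambda k: bret_expected_utility(k, r=0.5))
```

A risk-neutral subject (r = 1) collects 50 boxes, while a risk-averse one (here r = 0.5, an arbitrary illustrative value) stops earlier, matching the classification rule in the text.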
Additionally, we inform participants about the upcoming interactive experiment and collect their consent for participation. Further, the survey contains several questions that test understanding of basic concepts of probability. Only those participants who correctly answer the qualifying questions and give their consent for participation are subsequently invited to the main experiment. Participants who correctly answer the qualifying questions take part in a bonus task for a chance to win an amount in the 0.6-4.0 USD range. The bonus task is the 6-item Social Value Orientation (SVO) task (Murphy et al., 2011). The underlying idea of the SVO framework is that people vary in terms of their motivations when evaluating different allocations of resources between themselves and others. Consequently, the SVO scale identifies four different types of preferences: individualistic, competitive, prosocial, and altruistic. On a practical level, for each of the 6 items of the SVO scale, participants are asked to choose between nine different allocations of money between themselves and another person. These preferences are then used to determine subjects' types. Further details on the distribution of types in our sample are in Section A.2.3. The instructions for the SVO scale along with screenshots of the interface from the recruitment survey are in Section B.2. We use the U.S. Census Bureau 2018 estimates of the demographic composition of US states to get the age-gender-state distribution of the adult American population (U.S. Census Bureau, 2019b). With respect to age, we use 6 categories that match those provided by MTurk qualifications: 18-25, 25-30, 30-35, 35-45, 45-55, and 55+. We also group US states into 10 Standard Federal Regions as defined by the Office of Management and Budget. Figure A4 shows the map of the US divided into these 10 regions.
The resulting distribution is presented in Table A3. Note that we do not consider US territories. During recruitment, we use the distribution in Table A3 to generate a diverse standing panel of qualified participants. We then email a random sample from the panel, inviting them to the experiment. In the invitation email, we ask participants to search for the experiment on MTurk at the specified time and click on the link provided in the description of the HIT. Doing this transfers them to our application on Heroku. For a typical session of the experiment, we invite a representative sample of 150-200 participants to fill 30-35 places on a first-come-first-served basis. We rely on the distribution in Table A3 to determine the demographics of the invitees and then draw a random sample with the required characteristics from our standing panel. Throughout the experiment, we keep track of the demographics of those participants who have completed the experiment to minimize over- and under-sampling. Finally, to ensure random assignment into treatments, we randomize the order of experimental sessions. It is important to note, however, that it is impossible to control precisely the demographics of the participants who actually (a) show up, and (b) complete all parts of the experiment. Nevertheless, the final sample is very diverse and near-representative in terms of age, gender, and geographic location. Further details on the sample, its characteristics, and its representativeness of the adult US population are in Section A.2.3. On average, the experiment takes 30 minutes (s.d. 6 minutes) to complete and pays 5.81 USD (s.d. 1.1 USD) including a 1 USD fixed fee. Upon completing the experiment, participants receive a code to submit on MTurk. Throughout the experiment, we keep track of the IP addresses of those participants who already took the experiment (or have seen the instructions), and exclude participants with duplicate IP addresses from our standing panel.
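The quota-tracking logic described above can be sketched as follows (an illustrative toy with hypothetical cell labels; the actual panel management was more involved):

```python
# Sketch: tracking age-gender-region quotas while recruiting, so
# completed participants can be compared against the target counts
# derived from Census data (Table A3). Cell labels are hypothetical.
from collections import Counter

def remaining_quota(targets: dict, completed: list) -> Counter:
    """targets maps (age, gender, region) cells to desired counts;
    completed lists the cells of finished participants."""
    remaining = Counter(targets)
    remaining.subtract(Counter(completed))
    return +remaining  # unary + drops cells already at or over target

targets = {("35-45", "F", "Region I"): 2, ("55+", "M", "Region IV"): 1}
done = [("35-45", "F", "Region I")]
# One more 35-45 female from Region I and one 55+ male from Region IV
# would still be needed in this toy example.
```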
The experimental dataset contains decisions of 400 participants. Each participant took part in one session, and therefore in one of the eight treatments. For all treatments, we collected data for 10 groups of 5 participants each. In each treatment, participants interacted for a total of 40 rounds, so we have 16,000 decisions in total. We also match the experimental data with data from the recruitment survey. To check there are no biases in assigning subjects to treatments, we run a chi-squared test on participants' assignment to treatment (a categorical variable with 8 options) and gender, age category, and geographical location variables. Results indicate that subjects in the eight experimental treatments are not significantly different from each other (two-sided χ², p > 0.05). This suggests that our randomization procedures outlined in Section A.2.2 were effective. Apart from data on subjects' decisions in the experiment, we collect data on a set of variables which can be broadly categorized as follows: demographic controls, preference controls, and location-based controls. Table A2 presents summary statistics for some of these controls. Demographic controls. All participants are residents of the US, 47% are female, and the mean age is just under 44 years. 84% of our subjects are white, and 70% are employed (either full- or part-time). Further, we estimate the number of years of education received based on educational attainment. Based on this, the average subject in our sample has 15.3 years of education. Finally, 51% of our participants consider themselves religious, with 39% reporting Christianity as their religion. Preference controls. A total of 92.5% of subjects reported that they try to stay at home as much as possible because of the COVID-19 pandemic. Of these, 61.6% (or 57% of the full sample) report that the desire to 'protect others' is one of the main reasons behind this decision.
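The balance check can be sketched with a standard chi-squared test of independence (toy counts, not our data):

```python
# Sketch of the randomization balance check: a chi-squared test of
# independence between treatment assignment and a demographic variable.
# The contingency table below is illustrative, not the paper's data.
import numpy as np
from scipy.stats import chi2_contingency

# rows: 8 treatments, columns: gender counts (toy numbers)
table = np.array([[24, 26], [25, 25], [23, 27], [26, 24],
                  [25, 25], [24, 26], [27, 23], [26, 24]])
chi2, p, dof, expected = chi2_contingency(table)
balanced = p > 0.05  # fail to reject independence -> assignment looks random
```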
To capture subjects' political ideology, we construct an index from three questions in the Post-experimental Questionnaire. These questions ask about: 1) subjects' support for President Donald Trump's handling of the COVID-19 pandemic, 2) their support for universal healthcare, and 3) their belief that social distancing measures impose unjustified economic costs. All are on a 5-point Likert scale (Likert, 1932), and so yield a 13-point index (0-12). Responses that indicate support for President Trump, opposition to universal healthcare, and belief in the unjustified economic costs of social distancing measures are scored positively. Higher scores, therefore, indicate a more Conservative ideology. In our sample, the mean ideology score is 3.1 (s.d. 3.1), and over 75% of subjects have a score of 5 or less. We also collect information on participants' attitudes to climate change. Specifically, our post-experimental survey includes three 5-point Likert scale questions asking whether subjects believe that global warming is (1) happening, (2) caused mostly by human activity, and (3) affecting weather in the United States. The resulting climate change attitudes index is on a 0-12 scale, with higher scores indicating greater skepticism towards climate change. The mean score in our sample is 1.99. Further, as explained in the previous sections, we collect information on subjects' social value (SVO) and risk (BRET) preferences. Figure A3 presents the distributions of these preferences for our sample (in each subplot, a vertical line marks each subject's BRET/SVO score; more intense line color indicates that more subjects are concentrated at that value). The average subject in the sample is moderately risk-averse, with a BRET score of 34.73 boxes. When it comes to social values, the majority of our subjects (58%) are classified as prosocial (with SVO angle ≥ 22.45° and < 57.15°).
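The index construction can be sketched as follows (a minimal illustration assuming each Likert item is coded 0-4 and the universal-healthcare item is reverse-coded; the function name is ours):

```python
# Sketch of the ideology index: three 5-point Likert items, each coded
# 0-4, summed to a 0-12 index. Agreement with Trump's handling and with
# the unjustified-costs statement scores positively; support for
# universal healthcare is reverse-coded, so higher = more conservative.

def ideology_index(trump_support: int, healthcare_support: int,
                   costs_unjustified: int) -> int:
    for v in (trump_support, healthcare_support, costs_unjustified):
        assert 0 <= v <= 4, "5-point Likert items coded 0-4"
    return trump_support + (4 - healthcare_support) + costs_unjustified
```

The index runs 0-12, with higher scores indicating a more conservative ideology, as in the text.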
The second-largest category in the sample are individualists (angle ≥ −12.04° and < 22.45°). Notice that we only have 1 subject who is classified as competitive (angle < −12.04°), and no subjects classified as altruistic (angle ≥ 57.15°). Location-based controls. As part of both recruitment and the experiment, we collect subjects' IP addresses. We use these to infer their geographic location, down to the county level. Figure A4 shows the region-level distribution of our sample, with X/Y labels indicating the actual count in our sample (X) and the target representative count (Y) from Table A3. Further, using location information, we create a set of location-based control variables. First, for each subject, we record population density at the county level. The average density in the sample is 2.4k per square mile (U.S. Census Bureau, 2011, 2019a). Second, we use location data together with data on the development of the COVID-19 pandemic in the US. In particular, for each subject, we record the number of cumulative COVID-19 cases (average 63.4k) and new COVID-19 related deaths (average 59) in the state on the day of the experiment. We also record whether stay-at-home orders were in place in the participant's state on the day of the experiment. In our sample, 62% of the subjects had stay-at-home orders in place on the day they participated in our experiment. Sample representativeness. We check the representativeness of our sample along the three dimensions as defined in Section A.2.1 -i.e. age, gender, location. Table A3 presents the distribution of the US adult population. The gender distribution of our sample is not statistically different from that of the adult US population (two-sided t-test, p = 0.11). Further, Figure A4 presents a map of the US divided into the 10 Standard Federal Regions. For each region, we indicate the target number of people and the realized count in our sample.
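The SVO classification by angle can be sketched as follows (cutoffs as reported above; the function name is ours):

```python
# Sketch of the SVO type classification from the SVO angle, using the
# standard cutoffs of Murphy et al. (2011) reported in the text.

def svo_type(angle_deg: float) -> str:
    if angle_deg < -12.04:
        return "competitive"
    if angle_deg < 22.45:
        return "individualistic"
    if angle_deg < 57.15:
        return "prosocial"
    return "altruistic"
```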
The difference between the observed and target location distributions is not significant (two-sided χ², p = 0.08). When it comes to age, our distribution is unfortunately not quite representative of the adult US distribution (two-sided χ², p < 0.0001). In particular, the 35-45 and 45-55 age categories are over-represented, while the 55+ category is under-represented. This section presents a detailed statistical analysis. Section A.3.1 tests for a structural break in the data between Baseline and Intervention. Section A.3.2 then presents analysis at the aggregate (group) level, and Section A.3.3 focuses on the individual level. We use a Wald test to look for a single structural break in social distancing decisions over time. This involves performing a simple linear regression of social distancing decisions (left-hand side) on the round number (right-hand side), and then testing whether the coefficient on the time variable exhibits a structural break. The test assumes there is at most one structural break but is agnostic as to at what point it may occur. We perform this test on the data at three levels of aggregation. First, we consider all observations together. Second, we split by the rate of contagion and the policy intervention. This gives four equal-sized groups: (1) 15% contagion, fine; (2) 15% contagion, nudge; (3) 65% contagion, fine; and (4) 65% contagion, nudge. Finally, we subdivide each of the four groups above by network type (i.e. complete vs star). If the policy intervention is effective, then we expect the test to find a structural break at round 21. With all observations together, the Wald test finds the structural break at round 21 (p < 0.001). When considering the disaggregated data, the Wald test finds a structural break at round 21 for each of the subsets, except for complete networks with the nudge treatment. In this section, we focus on the aggregate analysis of the decision data and conduct a set of nonparametric tests.
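A simplified version of the structural-break search can be sketched as follows (a mean-shift sup-Wald illustration on simulated distancing shares, not the exact regression-based test used in the paper):

```python
# Sketch: sup-Wald search for a single break, here as a mean-shift on a
# simulated 40-round series (Baseline around 0.5, Intervention 0.8).
import numpy as np

def sup_wald_break(y, trim=5):
    """Return (break_index, max_stat) for a single mean-shift break."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    best_b, best_stat = None, -np.inf
    for b in range(trim, n - trim):        # trim the sample edges
        m1, m2 = y[:b].mean(), y[b:].mean()
        s2 = (((y[:b] - m1) ** 2).sum() + ((y[b:] - m2) ** 2).sum()) / (n - 2)
        if s2 == 0:
            continue
        stat = (m1 - m2) ** 2 / (s2 * (1 / b + 1 / (n - b)))
        if stat > best_stat:
            best_b, best_stat = b, stat
    return best_b, best_stat

rng = np.random.default_rng(0)
series = np.concatenate([np.full(20, 0.5), np.full(20, 0.8)])
series = series + rng.normal(0, 0.02, 40)
break_idx, stat = sup_wald_break(series)   # break expected near index 20
```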
Since individual observations are correlated at the group level, our unit of observation is a group of 5 participants. Recall that we have 80 groups in total, equally distributed across 8 treatments. For each group, we calculate the average proportion of participants that practice social distancing in the last 10 rounds of both parts of the experiment. Below, we refer to this average as 'distancing levels'. We discard the first 10 rounds of both parts of the experiment because our participants display clear convergence behavior. In particular, by round 11, in both Baseline and Intervention, at least 80% of participants converge to a particular strategy in all treatments. Further details on convergence behavior are in Section A.4.1. Section A.4.2 shows that the key results below are robust to including all 20 rounds of both parts of the experiment. Results. Table A5 presents the results of the non-parametric tests -the Mann-Whitney U-test (MW) (Mann and Whitney, 1947) for unmatched samples and the Wilcoxon Signed-Rank test (WSR) for matched samples. Samples are matched when they are generated by the same set of groups (e.g. comparing distancing levels in Baseline with those in Intervention after the introduction of a fine). On the other hand, when we make comparisons across groups from different treatments (e.g. those that were exposed to the low rate of contagion versus the high rate), the samples are independent or unmatched. Section A.4.2 shows that all of the findings of this section are robust to using parametric analogs of these non-parametric tests. In our aggregate analysis, we test the set of hypotheses in Section A.1. Note that we cannot test Hypotheses 8 and 9 here, which are instead covered in Section A.3.3. Part 1 of Table A5 shows that the Baseline data from the fine and nudge treatments can be pooled together, as these samples are statistically identical at all conventional significance levels. This result is robust to various specifications.
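The two tests can be illustrated with scipy on toy group-level data (the numbers below are made up, not our distancing levels):

```python
# Sketch: Wilcoxon Signed-Rank for matched Baseline/Intervention group
# averages, Mann-Whitney U for unmatched cross-treatment comparisons.
from scipy.stats import wilcoxon, mannwhitneyu

baseline     = [0.55, 0.60, 0.52, 0.70, 0.58, 0.61, 0.66, 0.59, 0.63, 0.57]
intervention = [0.62, 0.66, 0.60, 0.75, 0.65, 0.66, 0.71, 0.64, 0.70, 0.63]
low_contagion  = [0.50, 0.55, 0.48, 0.60, 0.52]
high_contagion = [0.70, 0.75, 0.68, 0.80, 0.72]

w_stat, w_p = wilcoxon(baseline, intervention)             # matched samples
u_stat, u_p = mannwhitneyu(low_contagion, high_contagion)  # unmatched samples
```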
In Part 2 of Table A5, we examine the effect of introducing a fine in Intervention. From the table, it is evident that a fine has a positive and statistically significant effect on distancing levels compared to the Baseline with no fine in all specifications considered. Most results in this part are statistically significant at least at the 5% level. The effect of the fine is also sizable -it results in an increase in mean distancing levels of 5% to 9% depending on the specification. The following result summarizes the above observations. Notice that the numbering of all results in this section corresponds to the numbering of Hypotheses in Section A.1. Result A-1. The introduction of the fine increases the level of social distancing. The effect is observed for all three positions as well as for the two rates of contagion. The effect is both statistically significant (WSR, p < 0.05 in all but one specification where p = 0.09) and sizable in magnitude. In Part 3 of Table A5 we repeat the analysis from Part 2, but now looking at the nudge Intervention rather than the fine. The tests find a statistically significant effect of the nudge in only 4 of the 6 specifications. Generally, we can see that the nudge has a smaller effect which is not as robust as that of the fine. In particular, the effect of the nudge is very significant and large in magnitude (7%) under high contagion, but disappears completely both statistically and in terms of economic magnitude under low contagion. Also, the nudge seems to have no effect on the superspreader. Note also that the effects of the nudge are not robust to using data from all rounds of Baseline and Intervention (see Section A.4.2 for more details). Result A-2. The introduction of the nudge generally increases the levels of social distancing. The effect, however, is not robust to alternative specifications and is smaller in magnitude than that of the fine.
In particular, the effect is statistically significant (WSR, p = 0.0003) and sizable in magnitude under high contagion, but not low contagion (WSR, p = 0.5). Further, a statistically significant effect is observed for the close-knit (p = 0.05) and peripheral (WSR, p = 0.06) participants but not for superspreaders. Part 4 of Table A5 shows that the data from the Intervention in the second part of the experiment is not statistically different between the fine and nudge treatments. In particular, even though the mean level of distancing in Intervention in fine treatments is higher than that in nudge treatments, our non-parametric testing fails to find any difference between them in 4 out of 6 specifications. We do, however, find evidence that the fine is more effective than the nudge (1) when data is pooled across treatments, and (2) for the superspreader. The effect is substantial in magnitude (7-10%) and significant at the 10% level. Result A-3. There is limited evidence that the fine is more effective than the nudge, and the effect is not robust. In particular, the fine increases distancing levels by more than the nudge (1) when all data is pooled together (MW, p = 0.08), and (2) separately for the superspreader (MW, p = 0.06). The effect, however, is not statistically significant for other specifications. In Part 5 of Table A5, we compare distancing levels in the three positions -close-knit, superspreader, and peripheral -in Baseline. From the table, we can see that (1) distancing levels are higher in the superspreader position relative to both the close-knit and the peripheral, and (2) higher in the close-knit relative to the peripheral. We run hypothesis tests both (a) aggregating over the rate of contagion, and (b) separately by the two levels of the rate of contagion. All results in this part of the table are significant at the 5% level, with most also being significant at 1%.
The differences between mean distancing levels in different positions are also large in magnitude -between 10% and 27% depending on the specification. Part 6 of Table A5 repeats the analysis in Part 5, but now focusing on Intervention, rather than Baseline. The conclusions here are the same as above, with all results being significant at the 5% and most also at the 1% level. Result A-4. Superspreaders practice more social distancing than close-knit participants, who, in turn, practice more distancing than peripheral participants. The differences are both statistically significant (MW, p < 0.05) and large in magnitude in all specifications. In Parts 7 and 8 of Table A5, we investigate the effects of the rate of contagion on distancing behavior, separately for Baseline and Intervention. We find that there is generally significantly more social distancing under the 65% rate of contagion relative to 15%. The test is not statistically significant only for the superspreader in Baseline. The effect is also large in magnitude -an average of 8-22% depending on the specification. Note that the effect on superspreaders is smallest, but this is probably explained, at least in part, by ceiling effects. In particular, in all treatments, the mean distancing levels in the last 10 rounds in the superspreader position are at least 70%, so there is limited room for a further increase. Result A-5. Subjects in the experiment generally practice more social distancing in high contagion environments relative to low contagion environments. The effect is statistically significant for close-knit (MW, p = 0.09 in Baseline and p = 0.009 in Intervention) and peripheral (MW, p = 0.03 and p = 0.002 in Baseline and Intervention respectively) subjects. For superspreaders, the effect is relatively smaller, and less robust to specifications (MW, p = 0.16 in Baseline and p = 0.05 in Intervention). Finally, in Parts 9 and 10 of the table, we focus on testing Hypotheses 6 and 7.
Specifically, we compare the outcomes in the complete and the star networks to (1) predictions of the Nash equilibrium, and (2) efficient outcomes. We find that the levels of social distancing are not in line with theoretical predictions, with all tests being statistically significant at least at the 5% level. In particular, distancing levels in the complete network are well above those predicted by Nash equilibrium. The difference is particularly large in the low contagion environment, where the equilibrium prediction is no social distancing, but the actual average level of social distancing in the last 10 rounds of Baseline stands at 62%. Distancing levels are also higher than those predicted by efficiency under low contagion. On the other hand, in the high contagion environment, the actual distancing levels are below efficiency requirements. Result A-6. Prior to intervention, the observed levels of social distancing in the complete network differ from equilibrium and efficiency predictions. In particular, more social distancing is observed than predicted by Nash equilibrium. When it comes to efficiency, actual levels of distancing are above the efficient level under low contagion, but below it under high contagion. All results are statistically significant (MW, p ≤ 0.01). In the star network, the amount of social distancing done by the peripheral participants is well above both equilibrium predictions and efficiency requirements. On the other hand, the levels of social distancing observed for the superspreaders are above the equilibrium but below the efficient level for low contagion, and below both the equilibrium and efficient levels for high contagion. Result A-7. Prior to intervention, the observed levels of social distancing in the star network differ from equilibrium and efficiency predictions. The result is true for both positions and both rates of contagion.
In low contagion, participants in both positions practice more distancing than predicted by equilibrium analysis, while in high contagion superspreaders practice less and peripheral participants practice more than predicted. When it comes to efficiency, peripheral agents practice more social distancing and superspreaders practice less social distancing in both high and low contagion. All results are statistically significant (MW, p < 0.0001). Perception and Behavior. As part of the Post-experimental Questionnaire, subjects are asked whether they agree that fines and nudges are effective in promoting social distancing. These two questions are both on a 5-point Likert scale, with higher values indicating greater disagreement with the statements. Table A6 summarizes individual self-reported perceptions of fines and nudges in our sample. The majority of subjects believe that nudges are effective in promoting social distancing, while attitudes to fines appear to be polarized. We construct a group-level index for the perception of fines, equal to one if at least three subjects in the group believe that fines are effective, i.e. they (strongly) agree with the corresponding statement, and an identical index for the perception of nudges. We then construct another index that captures changes in the observed behavior. This index is equal to one if the average level of social distancing in the last 10 rounds of Intervention is higher than in the last 10 rounds of Baseline. We then compare the perception indices to the behavior index. Figure A5 summarizes how group-level perception of fines and nudges compares to behavior. Two patterns emerge. First, subjects seem to underestimate the effectiveness of fines: only 42% of groups in the fine treatment (55% in the nudge treatment) believe in the effectiveness of fines, whereas 70% of groups subjected to fines show an increase in average distancing levels. 
Second, participants seem to overestimate the effectiveness of nudges -95% of the groups in the nudge treatment (98% in the fine treatment) believe that nudges are effective, while only 65% of the groups actually increase their average distancing levels when subjected to nudges. To test this formally, we use non-parametric analysis with the three group-level indices. We find that irrespective of the treatment they were placed in, subjects' perception of fines matches the fines' measured effectiveness (WSR and MW, p = 0.99 and p > 0.92 respectively). On the other hand, participants overestimate the effectiveness of nudges (WSR and MW, p = 0.001 and p < 0.0001 respectively). We now turn to individual behavior, and examine the determinants of individual social distancing decisions. To do this, we use a Random Effects Logit model. This implicitly models the following Random Utility framework for binary choice: subject i chooses to practice social distancing in round t if and only if x_it'β + v_i + ε_it > 0, where x_it is the vector of covariates, v_i is a subject-specific random effect, and ε_it is an idiosyncratic error. According to this model, subjects choose to practice social distancing if and only if they receive greater utility from doing so than from not doing so. Note that (implicitly) we have normalized the utility from not distancing to zero -this is without loss of generality. This is a flexible and tractable way to model binary decisions. The model assumes that the subject-specific random effect, v_i, is normally distributed, and that the idiosyncratic error, ε_it, follows the logistic distribution. In addition, we only use data from the final 10 rounds of each part of the experiment, because it takes time for subjects' behavior to converge as shown in the convergence analysis in Section A.4.1. Section A.4.3 shows that the results are not sensitive to either using the Logit modeling framework or restricting the data to the final 10 rounds of each part. Having set out the econometric framework, we now need a set of control variables that might plausibly explain social distancing decisions.
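The Random Utility framework can be illustrated by simulation (all coefficients below are made up; this is a sketch of the data-generating process, not our estimation code):

```python
# Sketch of the latent-utility model behind the Random Effects Logit:
# a subject distances in round t iff x_it'b + v_i + e_it > 0, with v_i
# normal across subjects and e_it logistic across rounds.
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_rounds = 400, 20
beta_fine, beta_high_contagion, intercept = 0.8, 1.0, -0.3  # illustrative

fine = rng.integers(0, 2, n_subjects)   # treatment dummies (toy assignment)
high = rng.integers(0, 2, n_subjects)
v = rng.normal(0, 1, n_subjects)        # subject-specific random effect

xb = intercept + beta_fine * fine + beta_high_contagion * high
latent = xb[:, None] + v[:, None] + rng.logistic(0, 1, (n_subjects, n_rounds))
distance = (latent > 0).astype(int)     # 1 = practices social distancing

# Positive coefficients raise the distancing rate for treated subjects.
rate_fine = distance[fine == 1].mean()
rate_no_fine = distance[fine == 0].mean()
```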
Clearly, it is neither possible nor desirable to control for every conceivable covariate, but we have five categories of controls that collectively cover a wide variety of factors. First, and most obviously, we control for the experimental treatments: the fine, nudge, rate of contagion, and network position (model 1 of Table A7). As we randomly assign subjects to these treatments, we can be confident that their effects are causal. Next, we add controls for social demographics (in model 2), geographic and institutional factors (in model 3), social and risk preferences (in model 4), and finally, ideology (in model 5). To complete our model, we also control for interactions between ideology and the policy interventions (fine/nudge). This is because we find that subjects' responsiveness to the fine depends on their ideology -more conservative subjects are less responsive to the fine. Results. Ex ante, it seems plausible that each of the controls could be related to social distancing decisions. However, as we can see in model 6, only some of them are. All experimental treatments have a significant effect -both statistically and in terms of economic relevance -and their direction is intuitive. Statistical tests are t-tests on coefficients in the Random Effects Logit regression (REL hereafter), or t-tests on coefficients in the instrumental variables Random Effects Logit (IV hereafter). First, the fine significantly increases social distancing (Hypothesis 1), while the nudge marginally increases social distancing (Hypothesis 2). Qualitatively, the fine is more effective (the statistical significance of the nudge is also not robust), but in our preferred model, the difference between the two effects is not statistically significant (Hypothesis 3). Note that the numbering of the results corresponds exactly to the number of the hypotheses in Section A.1, and the "I" prefix denotes individual-level results. Result I-1.
The fine increases the probability that an individual practices social distancing (REL, p = 0.001). Result I-2. The nudge marginally increases the probability that an individual practices social distancing (REL, p = 0.09). The effect is approximately half that of the fine and is not robust to changes in the regression specification. Result I-3. The effect of the fine is larger than the effect of the nudge, but the difference is not statistically significant (REL, p = 0.11). Second, superspreaders distance more than 'close-knit' agents, even though they have the same number of links in the network (Hypothesis 4). This suggests that when subjects know they are relatively highly connected, they take some action to compensate. Note that our experimental design cannot identify why they do this -whether to protect themselves or to protect others. Peripheral agents distance significantly less -being less exposed to community contagion in the first place reduces the likelihood that agents take protective action. Further, subjects do more social distancing in a high contagion environment than in a low contagion environment, irrespective of their network position (Hypothesis 5). Result I-4. Superspreaders practice social distancing with a greater probability than close-knit subjects (REL, p = 0.05), who, in turn, practice social distancing with a greater probability than peripheral subjects (REL, p < 0.0001). Result I-5. Subjects practice social distancing with a greater probability in the high contagion setting than in the low contagion setting (REL, p < 0.0001). Beyond the experimental treatments, we find significant effects for age, gender, race, social and risk preferences, and ideology. Older, female, and non-white subjects all practice more social distancing. More risk-averse agents and prosocial agents (as classified by the SVO task) also practice more social distancing (Hypothesis 8 and Hypothesis 9). Result I-8.
More risk-averse subjects practice social distancing with a higher probability (REL, p = 0.001). Result I-9. Subjects who indicate greater concern for others' well-being practice social distancing with a higher probability (REL, p < 0.0001). Perhaps the most interesting non-treatment effect is the subjects' political ideology. Subjects with stronger Conservative ideology practice less social distancing, and are marginally less responsive to fines. Recall that our ideology index is constructed from subjects' responses to three questions from the post-experiment questionnaire. They ask about subjects' support for President Donald Trump's handling of the COVID-19 pandemic, their support for universal healthcare, and their belief that social distancing measures impose unjustified economic costs. Result I-10. More conservative subjects both practice distancing with a lower probability (IV, p = 0.04) and are marginally less responsive to the fine (IV, p = 0.08). It is important to bear in mind that subjects were not randomly allocated to their ideology (as they were to the experimental treatments) -clearly, that would not be feasible. As such, the association we have found between ideology and social distancing decisions may not be causal -it is possible that ideology is endogenous. That is, it could be correlated with some unobserved factor that affects social distancing decisions. Therefore, we use an instrumental variables approach to deal with possible endogeneity. We use a measure of subjects' skepticism of global warming as the instrument. (Notes to Table A7: the variable "Quiz attempts" measures the number of attempts subjects required to pass the quiz prior to the main experiment; it is a proxy for a subject's sophistication. Population density = 1000's of people per square mile. Cumulative cases = 1000's of confirmed cases in the state. Significance levels: * 10%, ** 5%, *** 1%.)
As a partisan issue in the United States, this is strongly correlated with political ideology (McCright and Dunlap, 2011). However, as it is unrelated to COVID-19 and the types of decisions that we ask subjects to make in this experiment, it should not have a separate effect on social distancing decisions. Mathematically, this means that after we have controlled for ideology, skepticism of global warming is uncorrelated with the error term, ε, in the regression. This property is required for the instrument to be valid, and so to give consistent estimates. Note that it is not possible to test this -it is an assumption. As we are using a Logit specification, it is not possible to do standard 2-Stage Least Squares. Instead, we use a Control Function method (Train, 2009). The Control Function method takes predicted residuals from the first stage regression, and adds them into the Logit regression. This is in contrast to 2-Stage Least Squares, which takes predicted values from the first stage, and uses them instead of the (potentially) endogenous variable in the second stage. Model 7 of Table A7 reports the preferred regression specification with the instrumental variables method. Using the instrument does not have a large impact on the point estimate for the ideology variable. This suggests that there may in fact be a causal relationship between ideology and social distancing decisions -conservatives practice less social distancing. Partial Effects. Using our preferred Logit specification -model 6 of Table A7 -we can predict the probability that an agent chooses to practice social distancing. With estimated regression coefficients β̂ and a set of subject characteristics x, the predicted probability is Pr(social distancing | x) = exp(x'β̂) / (1 + exp(x'β̂)). However, calculating partial effects with a Logit model is far from straightforward. In a non-linear model, the partial effect of one variable depends on the full set of an individual's characteristics, and so is highly heterogeneous.
For example, the estimated partial effect of age depends on an agent's gender, race, religion, degree of risk aversion - and on all the other variables in the model. Note that this even includes variables that are not statistically significant. Given this, we calculate a variant on Average Partial Effects (APE). This is more useful than a Partial Effect at the Average, especially given our extensive use of binary variables - a Partial Effect at the Average would give us the partial effect for a subject who is 47% female, 85% white, and at a node position that simply does not exist (50% 'close-knit', 40% peripheral, and 10% superspreader).

To calculate our variant on an APE for a variable - for example, gender - we predict the probability of social distancing for each of our 400 subjects, first assuming that they are all male, and then assuming that they are all female. This gives us an individual partial effect of gender for each subject; taking an average across all subjects yields the variant on an APE. We assume that subjects are not exposed to a policy intervention (fine/nudge) when calculating APEs for all (other) variables. When the variable is continuous, we first assume that all subjects are at the 25th percentile of that variable (based on the actual distribution in the experiment), and then that they are at the 75th percentile.

We use this variant - only looking at the 400 subjects in our experiment - due to data constraints. While data is readily available on the population-wide distributions of most of the individual variables, they are only available as marginal distributions, not as joint distributions. That is, one can easily find an age distribution, a gender distribution, and an education distribution for the US population, but only separately. A joint distribution - especially over all of the variables in our preferred specification - is not available.
Therefore, our variant allows us to calculate partial effects for individuals who actually exist, and so makes our APE meaningful.

Figure A6 shows a box plot of the partial effects for variables that are statistically significant. The colored bars cover the 25th to 75th percentiles of the estimated partial effects. The vertical line in each bar shows the 50th percentile, and the black dot shows the mean (the APE).

Figure A6: Average partial effects. Note that all variables from model 6 of Table A7 are used in the calculations, but we only report APEs for statistically significant variables.

This section presents additional statistical analysis. Section A.4.1 presents convergence analysis for Baseline and Intervention. Section A.4.2 summarizes robustness checks performed at the aggregate (group) level, and Section A.4.3 focuses on robustness checks at the individual level.

The analysis in Sections A.3.2 and A.3.3 focuses on the last 10 rounds of Baseline and Intervention. In this section, we validate this choice by showing that the majority of participants exhibit clear convergence behavior after the first 10 rounds in both parts of the experiment. We define individual convergence as follows:

Definition 1. A participant converges to a strategy s by round n if (i) she used this strategy for the last k rounds (including n), and (ii) in all subsequent rounds [n + 1, 20] the number of consecutive deviations from the chosen strategy does not exceed a.

We consider three types of convergence strategies:
1. The subject always chooses the same action. We define this strategy for both the complete and the star networks.
2. The participant always chooses the same action when she is the superspreader, and the complement action when she is peripheral. We define this strategy for the star network only.
3. The subject always chooses the same action when she is the superspreader and alternates between the two actions when she is peripheral.
We define this strategy for the star network only.

We set k = 4 and a = 2. This means that the earliest a participant can be considered to converge to a particular strategy is round 4: she must not have deviated from the strategy in the last four rounds, and in all subsequent rounds she must never make more than two consecutive deviations.

Table A8 summarizes the convergence analysis using the above definition. From the table, we can see that by round 11 at least 80% of our participants have converged to some strategy in all parameterizations. In fact, the lowest and highest convergence rates are 82.5% and 92% respectively, while the weighted mean convergence rate across the two parts of the experiment is 85.9%. Convergence is higher (a) in the complete network and (b) with the fine intervention. Note that we did not disaggregate by the rate of contagion, although the findings are robust to this. However, with this disaggregation, the lowest observed convergence rate falls to 79% (Baseline with the star network under a 15% rate of contagion).

As a robustness check, we also consider a = 1 and a = 3, allowing for one and three consecutive deviations respectively. Figure A7 plots the share of converged participants in each round, separately for each part, for a ∈ {1, 2, 3}. We can see that the share of converged subjects does not change much when we allow for a more or less conservative definition. In particular, with a = 1 the share of subjects who converge by round 11 in Baseline drops to 79.8%, while in Intervention it reaches 83.5%. With a = 3 the share of subjects who converge by round 11 in Baseline and Intervention stands at 86.3% and 90% respectively. The above analysis suggests that it is reasonable to claim that a large majority of subjects converge to a particular strategy by round 11 in both parts of the experiment.
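Definition 1 with k = 4 and a = 2 can be implemented directly. The function below is an illustrative sketch (the action labels and the example choice path are invented):

```python
def converged_by(choices, strategy, n, k=4, a=2):
    """Definition 1: the participant converges to `strategy` by round `n`.

    Rounds are 1-indexed; `choices` and `strategy` hold one action per round.
    (i) she followed the strategy in the last k rounds up to and including n;
    (ii) in rounds n+1 onward she never deviates more than a consecutive times.
    """
    if n < k:
        return False
    # (i) the last k rounds (including round n) match the strategy
    if any(choices[t - 1] != strategy[t - 1] for t in range(n - k + 1, n + 1)):
        return False
    # (ii) no run of more than `a` consecutive deviations afterwards
    run = 0
    for t in range(n + 1, len(choices) + 1):
        if choices[t - 1] != strategy[t - 1]:
            run += 1
            if run > a:
                return False
        else:
            run = 0
    return True

# Example: the "always distance" strategy over a 20-round part,
# with a path that deviates in rounds 1, 6 and 7
always = ["D"] * 20
path = ["N", "D", "D", "D", "D", "N", "N", "D"] + ["D"] * 12
```

Here the participant converges by round 5 (rounds 2-5 all follow the strategy, and the two deviations in rounds 6-7 do not exceed a = 2), but not by round 4, and not under the stricter a = 1 definition.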
We test the robustness of the findings in Section A.3.2 in two ways: (1) using data from all 20 rounds of both parts of the experiment, and (2) employing parametric rather than non-parametric tests on the data from the last 10 rounds of both parts. Table A9 summarizes the results of these robustness checks.

In the columns under 'Robustness #1', we present the results of our hypothesis tests when using decision data from all rounds. From the table, we can see that most of the results from Section A.3.2 are robust to using data from all 20 rather than the last 10 rounds of both parts of the experiment. Further, recall that we observed previously that the nudge effects appear to be small and sensitive to specification. Part 3 of Table A9 provides further evidence strengthening this observation - the effect of the nudge on distancing levels is now insignificant in most of the specifications.

In our second robustness check, we use the standard unpaired t-test (UT) on unmatched samples instead of the MW test, and the paired version of the t-test (PT) on matched samples in place of the WSR test. In the columns under 'Robustness #2' of Table A9, we repeat the analysis from Section A.3.2 using these parametric tests. We can see that all significant effects are still present if we use parametric rather than non-parametric tests. Overall, the findings of this section suggest that the observations made in Section A.3.2 are robust to changes in the distributional assumptions of the tests and to the inclusion of all observations.

We now test the robustness of the parametric analysis in Section A.3.3. Model 1 of Table A10 restates our preferred specification to aid comparison. Models 2 and 3 vary the type of model: they use a Linear Probability Model (LPM) and a Probit model, respectively. Both include random effects to allow for individual heterogeneity. Changing the type of model used does not materially affect the results.
Note that the coefficients are not directly comparable across Logit, LPM, and Probit. Figure A8 shows the average partial effects under each type of model - as we can see, the differences are small.

Model 4 shows that our results are not sensitive to excluding the period before subjects converge - they are little affected by including all rounds. The results could also be driven by subjects who might not be paying attention or who might have a poor understanding of the game. To test for this, in model 5 we exclude any subjects who chose to practice social distancing in all 40 rounds (122 subjects do this). Then in model 6, we exclude subjects who failed the quiz between the two parts of the experiment more than once, or who failed the post-experiment attention check. 15 Finally, model 7 tests an alternative specification of the instrument - only using subjects' responses to the question of whether global warming is caused by humans. 16 While clearly not definitive, these checks suggest that our findings are robust, and in particular are not driven by our choices about the econometric model or about which, if any, observations to exclude.

In our main analysis, we used an ideology index constructed from subjects' responses to questions asked immediately after the experiment. Table A11 shows that these results hold even if we use political party affiliation - a much coarser, but much more widely available, measure of ideology. Apart from the ideology control, all variables are the same as in the main model, so we suppress them in the table for convenience. Model 1 shows the main Random Effects Logit model for reference. Model 2 uses dummies for self-identified Republicans and 'Other' (leaving self-identified Democrats as the base category). Model 3 omits the 'Other' category - leaving a direct comparison of Republicans and Democrats. Model 4 uses the instrumental variables method.
Note that we only have one instrument, so it is not possible to instrument for both Republicans and Independents in the same regression.

Our preferred specification contains only a single type of heterogeneous treatment effect - the effect of the fine varies with subjects' ideology. There also appear to be some other interaction effects present. Table A12 summarizes heterogeneous treatment effects. Older subjects are relatively less responsive to the fine (REL, p < 0.0001), although the fine does still increase their probability of social distancing (REL, p < 0.0001). Prosocial subjects are also marginally less responsive to the fine (REL, p = 0.07). Subjects in a high contagion setting, and older subjects, are relatively more responsive to the nudge (REL, p = 0.006 and p = 0.02 respectively). This suggests that the impact of policy interventions may vary across groups, and perhaps more importantly may vary with how contagious the disease is.

15 A question in the post-experiment survey asked subjects to select a particular option to demonstrate that they were paying attention.

16 Using the final question on its own also gives similar results in terms of magnitude and statistical significance. The question regarding whether global warming is happening gives non-significant estimates - likely driven by the fact that there is very little variation in the responses to this question: 353 (out of 394) subjects respond that global warming is happening.

[Table notes: a - all other controls from the main model. Significance levels: * - 10%, ** - 5%, *** - 1%.]

Section B.1 presents the instructions and the interface for the experiment. Analogously, Sections B.2 and B.3 present the instructions and interfaces for the social preferences (SVO) and risk preferences (BRET) elicitation tasks respectively.

This section contains the instructions for the Baseline and Intervention parts of the experiment. For the Baseline (Part 1), the rate of contagion is set at 65%.
Note that the type of network and intervention do not feature in this part of the instructions. For the Intervention (Part 2), we show the instructions for the fine. The instructions for the nudge are similar, except that instead of explaining how the fine is implemented, participants are asked to watch a 3-minute video. The video can be accessed online at https://youtu.be/tyf6EpSMeGs. The section concludes with Figure B1, which shows the decision and results interfaces from the main experiment.

Part 1: Instructions (page 1/6)

This experiment consists of two parts. You will be paid a fixed reward of $1 for completing all parts of the experiment. Additionally, you can earn points for your choices in Parts 1 and 2, which will be converted into $ at the end of this experiment. There may also be a Bonus Task at the end of the experiment.

In Parts 1 and 2 of the experiment, you will interact with a group of other real people recruited through MTurk. Recall that you will never learn the identities of other people and no one will learn about your identity. The expected duration of the experiment is 30-40 minutes and your average expected total earnings will be $3-6 excluding the Bonus Task. Note that you can earn less or more than this amount depending on your choices and the choices of others in your group.

At the beginning of Part 1 of the experiment, you will be randomly allocated to a group of 5 and you will remain in this group for the duration of Parts 1 and 2 of the experiment. Note that you might have to wait while we are matching you with 4 other people, but we will compensate you for the time you wait. Since this experiment is interactive, it is important that you remain continuously attentive, otherwise you may slow down others and may even be disqualified from the experiment.

In Part 1 of the experiment, you will be asked to play a game with the other members of your group. The instructions on the next 5 pages explain the rules of the game.
You will receive more information about Part 2 after you complete Part 1. All participants are given the same instructions. It is important that you read these instructions carefully. Note that there is no deception in this experiment. Once you read the instructions, you will be required to pass a short understanding Quiz. If you fail the Quiz, you will not be allowed to take part in the experiment and will not receive the fixed reward. To continue to the instructions, press the Next button below.

Part 1: Instructions (page 2/6)

In Part 1 of the experiment you are asked to play a game with the other members of your group. In what follows, you and the other members of your group are referred to as participants. At the start of the game, you are presented with a diagram with 5 circles labeled by capital letters (P, E, C, M, Q) and lines between them. Each circle represents a position - one for each participant. At the start of the game, each participant is randomly allocated to one of these positions in the diagram. Your position is the one colored in blue.

The lines between positions indicate the structure of interactions between participants in these positions. These lines indicate which participants interact with one another in the game. An example is the diagram below. Here, you are in position M and directly interact with participants in positions P and E, but you do not interact directly with C and Q.

The next page of the instructions explains the choice you need to make in the game. To continue to the next page, press the Next button below. To go to the previous page, press the Back button.

Part 1: Instructions (page 3/6)

In the game, you and the other participants face a risk of getting infected with COVID-19 (commonly known as coronavirus). The main symptoms of COVID-19 are shortness of breath, a high fever and a new, continuous cough.
Most patients experience mild symptoms and recover in 1-2 weeks, but cases can progress to pneumonia and organ failure in the most vulnerable individuals.

After you learn your position in the diagram, you need to choose whether to practice social distancing. Below you can see what the buttons to make your choice look like. You have 20 seconds to make your choice and the timer is displayed at the top of the interface throughout the experiment. If you do not make a choice within the allowed time, your choice is automatically recorded as a 'No'.

After everyone in your group has made their social distancing choice, the computer randomly chooses one and only one participant to contract COVID-19 directly. If a participant does not practice social distancing and is randomly picked by the computer then s/he gets infected for sure. In other words, if you do not practice social distancing there is a 20% chance you get infected with COVID-19 directly.

Social distancing gives you some level of protection against COVID-19. If you practice social distancing and you are the participant randomly chosen by the computer to be infected, then the computer flips a fair coin. If the coin flip is Heads you are infected with COVID-19. If the coin flip is Tails then you are not infected with COVID-19. In other words, if you practice social distancing there is a 10% chance you get infected with COVID-19 directly.

A participant who practices social distancing cannot pass COVID-19 to other participants and cannot be infected with COVID-19 by another participant. On the other hand, an infected participant who does not practice social distancing may infect other participants through contagion. In particular, other participants who do not practice social distancing face a risk of getting infected because COVID-19 may spread through interactions between the participants. The next page of the instructions explains how COVID-19 spreads through interactions between participants.
To continue to the next page, press the Next button below. To go to the previous page, press the Back button.

A healthy participant who does not practice social distancing may get infected with COVID-19 through contagion by interacting with infected participants who do not practice social distancing. The probability a participant contracts COVID-19 through interaction with another participant is referred to as the rate of contagiousness of COVID-19. Throughout Part 1 of the experiment the rate of contagiousness of COVID-19 is fixed at 65%.

Consider again the example diagram of interactions and, as an example, suppose that:
• You (participant M) do not practice social distancing.
• Participants in positions E and Q practice social distancing, while participants in positions P and C do not.

In this example, suppose C is randomly picked by the computer to contract COVID-19 directly. First, C has chosen not to practice social distancing, so s/he gets infected for sure. Moreover, C can pass COVID-19 to other participants because s/he does not practice social distancing. Second, E and Q cannot get infected with COVID-19 through contagion because they practice social distancing. Next, P may get infected through contagion because s/he does not practice social distancing and interacts with C. This can happen with probability 65% - the rate of contagiousness. Finally, you may also become infected by contagion through your interaction with P. Specifically, there is a 65% probability you might get infected through your interaction with P if s/he becomes infected. However, if P remains healthy you will also remain healthy for the duration of the game.

It follows that in this example if 1) you do not practice social distancing, 2) E and Q practice social distancing whereas P and C do not, and 3) C contracts COVID-19 directly, then you may become infected with probability 42.25%. The diagram below shows how this percentage is computed.
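For the reader, the 42.25% in the example is simply the product of the two independent 65% transmission links (C to P, then P to M):

```python
# M (who does not distance) gets infected only if COVID-19 passes
# C -> P and then P -> M, two independent events at the 65% rate
rate = 0.65
p_M_infected = rate * rate  # 0.65 * 0.65 = 0.4225
print(f"{p_M_infected:.2%}")  # prints 42.25%
```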
Note that you will not be informed of the choices of other participants at any point during the experiment. The next page of the instructions explains how you earn points in Part 1 of the experiment. To continue to the next page, press the Next button below. To go to the previous page, press the Back button.

• -35 points: if you practiced social distancing and got infected (0 points for being infected minus 35 points cost of social distancing).

Note that if you fail to submit your social distancing choice then you will receive a penalty of 200 points. Your earnings for Part 2 are computed in the same way as in Part 1. At the end of this experiment, the computer randomly picks 4 out of 20 games to determine your earnings for Part 2 of the experiment. As in Part 1, the points are converted to $ at a rate of 115 points per $1. Suppose that you earn 300 points in the 4 randomly drawn games. Then, your total earnings for Part 2 are $2.61.

Before you can start Part 2 of the experiment, you must answer a Quiz question on the instructions above. If you fail to answer the question correctly, you will not be able to continue to Part 2 of the experiment. To continue to the short Quiz, press the Next button below.

The experiment: Interface

Figure B1 shows the decision and results interface from the main experiment. We focus on game 5 of the Intervention part of the experiment, where the rate of contagion is set at 65%, the network is the star, and the fine is the intervention. In this game, our example participant is assigned to position P. She decides to practice social distancing, and does not get infected.

This section contains the instructions for, and the interface of, the Social Value Orientation (SVO) task, which participants complete as part of the recruitment survey.

You have answered the qualifying questions correctly and are now in the Bonus Task.
In this Bonus Task you will be making a series of decisions about allocating money between you and another anonymous Turker (hereafter, Other). All of your decisions will be completely confidential. There are a total of 6 decisions to make, which are independent of each other. For each decision, you are asked to pick the distribution of money between yourself and Other that you prefer; all values are stated in cents. After you have made your decision, select the resulting distribution of money by clicking on the button below your choice. As you will see, your choices will influence both the amount of money you receive and the amount of money the Other receives. There are no right or wrong answers; this is all about personal preferences.

Every time 50 Turkers complete the task, we will randomly pick two of them and pay them for the Bonus Task as follows. We will pick one of the 6 decisions, and randomly implement the decision of one of the two chosen Turkers. For example, suppose we randomly chose Turkers X and Y out of those 50 Turkers, and that we further randomly chose to implement decision 3 of Turker Y. Suppose that in decision 3 Turker Y allocated 150 cents to themselves, and 140 cents to Other. Then Turkers X and Y will be paid 140 and 150 cents respectively.

Figure B2 presents part of the interface of the SVO task. We focus on one of the six decisions. In this example, a participant decided to allocate 288 cents to self and 188 cents to other.

This section contains the instructions for, and the interface of, the Bomb Risk Elicitation Task (BRET), which participants complete after the main experiment.

Thank you very much for taking part in the experiment! You are now in the Bonus Task in which you have an opportunity to earn an extra payment. On the next page, you will see 100 boxes. As soon as you start the task by pressing the 'Start' button, one box is collected per second starting from the top left corner.
Once collected, the box is marked by a tick symbol. For each box collected, you earn 2 cents. One of the 100 boxes contains a bomb that destroys all of your earnings. You do not know where the bomb is located. You only know that the bomb can be in any box with equal probability. Your task is to choose when to stop collecting the boxes and open those you have collected. You stop collecting boxes by pressing 'Stop' at any time. After that you open the boxes you have collected by pressing the 'Open' button. Note that once you press 'Stop' you cannot restart collecting boxes. A dollar or a bomb symbol will be shown on each of the boxes you have collected. If the bomb symbol does not appear, that means that you have not collected the box with the bomb. In this case, you earn the amount accumulated by the boxes you have collected. If the bomb symbol appears, that means that you have collected the box with the bomb. In this case, you earn zero for the Bonus Task. To proceed to the Bonus Task, press the button below.

BRET: Interface

Figure B3 presents the interface of the BRET. Figure B3a shows an example of the interface after a participant has stopped collecting boxes. We can see that she opened 56 boxes.

References
- Clustering and superspreading potential of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections in Hong Kong
- Polarization and public health: Partisan differences in social distancing during the coronavirus pandemic
- Messages on COVID-19 prevention in India increased symptoms reporting and adherence to preventive behaviors among 25 million recipients with similar effects on non-recipient members of their communities
- County-level determinants of social distancing (or lack thereof) during the COVID-19 pandemic
- Risk perception through the lens of politics in the time of the COVID-19 pandemic
- Do the effects of social nudges persist? Theory and evidence from 38 natural field experiments
- Inferring the effectiveness of government interventions against COVID-19
- Modelling transmission and control of the COVID-19 pandemic in Australia
- Experimental games on networks: Underpinnings of behavior and equilibrium selection
- Nursing home staff networks and COVID-19
- A counterfactual economic analysis of COVID-19 using a threshold augmented multi-country model
- The "bomb" risk elicitation task
- Field experiments and the practice of policy
- The macroeconomics of epidemics
- Cooperation and punishment in public goods experiments
- Adaptive human behavior in epidemiological models
- Slow to anger and fast to forgive: Cooperation in an uncertain world
- Gender differences in COVID-19 related attitudes and behavior: Evidence from a panel survey in eight OECD countries
- The effects of reputational and social knowledge on cooperation
- Efficiency and equilibrium in network games: An experiment
- Partisan differences in physical distancing are linked to health outcomes during the COVID-19 pandemic
- Oxford COVID-19 government response tracker
- Which interventions work best in a pandemic?
- Meta-analyses of the determinants and outcomes of belief in climate change
- Geographic variation in opinions on climate change at state and local scales in the USA
- COVID-19 and the stiff upper lip - the pandemic response in the United Kingdom
- Locally noisy autonomous agents improve global human coordination in network experiments
- The UK COVID-19 response: A behavioural irony?
- The persuasive effect of Fox News: Non-compliance with social distancing during the COVID-19 pandemic
- Infectious diseases (COVID-19 - stay orders) regulations 2020
- Cooperation and contagion in web-based, networked public goods experiments
- Single Year of Age and Sex Population Estimates
- Indications for healthcare surge capacity in European countries facing an exponential increase in coronavirus disease (COVID-19) cases
- Individual comparisons by ranking methods
- Network security and contagion
- Individual security, contagion, and network design
- Experimental methods: Pay one or pay all
- oTree - an open-source platform for laboratory, online, and field experiments
- oTree: the "bomb" risk elicitation task
- A technique for the measurement of attitudes
- On a test of whether one of two random variables is stochastically larger than the other
- The politicization of climate change and polarization in the American public's views of global warming
- Discrete choice methods with simulation
- Land Area
- County Population by Characteristics
- A domain-specific risk-attitude scale: Measuring risk perceptions and risk behaviors
- Econometric analysis of cross section and panel data

Part 1: Instructions (page 5/6)

At the end of the game you earn the following points depending on your social distancing choice and infection status:
• 100 points: if you did not practice social distancing and did not get infected;
• 65 points: if you practiced social distancing and did not get infected (100 points for being healthy minus 35 points cost of social distancing);
• 0 points: if you did not practice social distancing and got infected;
• -35 points: if you practiced social distancing and got infected (0 points for being infected minus 35 points cost of social distancing).

Note that if you fail to submit your social distancing choice then you
will receive a penalty of 200 points.

The information about the points you can earn, the rate of contagiousness of COVID-19, the structure of interactions between participants, and your position are always displayed on the screen when you make your social distancing choice. Below you can see an example of what this part of the interface looks like. The diagram of interactions is on the right, while the textual information on the left reminds you of the rate of contagiousness and the possible outcomes.

At the end of the game, you are reminded of the structure of interactions between participants, your position within that structure, and your social distancing choice. You are also informed of your infection status and the number of points earned. Below you can see an example of what this part of the interface looks like. You have 15 seconds to review this information and the timer is always displayed at the top of the interface.

The next page of the instructions explains how points are converted into your earnings in $. To continue to the next page, press the Next button below. To go to the previous page, press the Back button.

Part 1: Instructions (page 6/6)

Part 1 of the experiment has 20 separate games as described in these instructions. The choice you make in one game has no effect on other games. The only variation between games is the random reassignment of the positions of all the participants (including you) in the diagram of interactions. The participants assigned to your group, the structure of interactions between positions, the probability of contracting COVID-19 directly, the rate of contagiousness of COVID-19 and the number of points earned depending on social distancing choice and infection status remain unchanged.

At the end of each game, you can review the history of your choices and outcomes for the last 5 games. The table below shows an example of what your history after 9 games might look like. It is important that you make a choice in every game.
If you fail to make a choice for 3 consecutive games, you will be disqualified from the experiment. In this case, you will not receive any payment for this experiment.

At the end of this experiment, the computer randomly picks 4 out of 20 games to determine your earnings for Part 1 of the experiment. The points are converted to $ at a rate of 115 points per $1. Suppose that you earn 260 points in the 4 randomly drawn games. Then, your total earnings for Part 1 are $2.26. To continue to the short Quiz, press the Next button below. To go to the previous page of the instructions, press the Back button.

You have completed Part 1 of the experiment and will now proceed to Part 2. Below are the instructions for Part 2 of the experiment. It is important that you read these instructions carefully.

This part of the experiment also has 20 games, and you are assigned to the same group of 5 people as in Part 1. The structure of interactions between participants, the probability of contracting COVID-19 directly and the rate of contagiousness of COVID-19 are the same as in Part 1. The single difference from Part 1 is that in Part 2 of the experiment you will receive a fine of 15 points in any game in which you do not practice social distancing.

Hence, in Part 2 of the experiment the points you earn at the end of the game are:
• 85 points: if you did not practice social distancing and did not get infected (100 points for being healthy minus 15 points fine);
• 65 points: if you practiced social distancing and did not get infected (100 points for being healthy minus 35 points cost of social distancing);
• -15 points: if you did not practice social distancing and got infected (0 points for being infected minus 15 points fine);