key: cord-0729417-s74jgknw authors: Gagnon, L.; Lloyd, J.; Gagnon, S. title: Social Distancing Causally Impacts the Spread of SARS-CoV-2: A U.S. Nationwide Event Study date: 2020-07-01 journal: nan DOI: 10.1101/2020.06.29.20143131 sha: cf8fcca36c0c7d4a0be39bbc33b84fa972ccf750 doc_id: 729417 cord_uid: s74jgknw Background: To date, no study has examined the effectiveness of social distancing, while controlling for social mobility and social distancing restrictions in the United States. We utilize the quasi-experimental setting created by the nationwide protests precipitated by George Floyd's tragic death on May 25, 2020, to assess the causal impact of social distancing on the spread of SARS-CoV-2. Methods: Our sample period spans from January 22, 2020, to June 20, 2020, and consists of 474,422 county-days representing 3,142 counties from all 50 states and the District of Columbia. To assess the change in COVID-19 case counts following the protests, we employ a differences in differences estimation strategy in a multivariate setting, in which we control for social distancing restrictions and social mobility across counties. We also control for covariates that may influence COVID-19 transmission, and implement placebo tests using a Monte Carlo simulation. Findings: We document a country wide increase of over 3.06 cases per day, per 100,000 population, following the onset of the protests (95%CI: 2.47-3.65), and a further increase of 1.73 cases per day, per 100,000 population, in the counties in which the protests took place (95%CI: 0.59- 2.87). Relative to the week preceding the onset of the protests, this represents a 61.2% country wide increase in COVID-19 cases, and a further 34.6% increase in the protest counties. Interpretation: Our study documents a significant increase in COVID-19 case counts in counties that experienced a protest, and we conclude that social distancing practices causally impact the spread of SARS-CoV-2. 
The observed effect cannot be explained by changes in social distancing restrictions and social mobility, and placebo tests rule out the possibility that this finding is attributable to chance.

The highly contagious novel coronavirus, severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), responsible for coronavirus disease 2019 (COVID-19), emerged in December 2019 in Wuhan city, Hubei province, China. 1 The initial outbreak quickly evolved into a public health emergency of international concern, and by March 2020, the World Health Organization (WHO) characterized COVID-19 as a pandemic. 2 As of June 2020, COVID-19 has reached over 180 countries and regions, and the total number of confirmed cases has surpassed 10 million globally. 3 Compared with other countries, the United States (U.S.) has seen COVID-19 spread at an unparalleled rate, with over 2·5 million individuals infected and over 125,000 lives claimed. 4

Transmission of SARS-CoV-2 can occur through both indirect and direct modes, including person-to-person contact and the spread of respiratory droplets from infected individuals via coughing and sneezing. 5 Recent evidence estimates the average and median basic reproduction number (R0) of SARS-CoV-2 as 3·28 and 2·79, respectively. The R0 indicates the contagiousness and transmissibility of a virus, with an R0 greater than one implying that each infected individual spreads the virus to more than one other individual, on average. Public health measures, designed in consideration of the virus's specific transmission properties, have been implemented with the aim of reducing the R0 to a value less than one. As research has demonstrated that SARS-CoV-2 can travel across a minimum distance of 6 feet (2 meters), 6 social distancing, the maintenance of at least a 6-foot physical distance from others, has been introduced as an important public health measure. A variety of social distancing restrictions have been instituted across the U.S.
ranging from statewide stay-at-home orders, to more focused policies including: non-essential business closures, large gathering bans, school closure mandates, and restaurant and bar limits. 7 Moreover, the U.S. federal government has granted individual states the authority to design their own COVID-mitigation strategy, therefore, the extent and type of social distancing policies adopted differs across states. 8 The widespread adoption of social distancing restrictions in various jurisdictions has created an opportunity to examine the effectiveness of social distancing measures in reducing the spread of SARS-CoV-2. In the U.S., research examining government-imposed restrictions found that social distancing measures were effective in reducing the doubling rate of COVID-19 among U.S. states, 9 as well as the daily growth rate of COVID-19 cases across counties, 7,10 with a lag period consistent with the 14-day incubation time of SARS-CoV-2. 11 This is consistent with early predictive models which suggest that the absence of social distancing measures would result in a greater spread of SARS-CoV-2. [12] [13] [14] However, recent evidence suggests that rather than reducing the number of daily confirmed cases, social distancing merely stabilizes the spread of COVID-19. 9 The lack of consensus in the literature regarding the effectiveness of social distancing measures stresses the necessity for a study to explore the causal impact of these measures on the SARS-CoV-2 infection rate. Research has demonstrated that greater population mobility influences the R 0 of SARS-CoV-2, and facilitates the spread of COVID-19 across different geographic areas. 15 Given the relationship between mobility and R 0 , several studies have used mobility data as a measure of social distancing when examining the effectiveness of social distancing in reducing the spread of SARS-CoV-2. 
7, [16] [17] [18] However, social mobility measures represent an imperfect proxy for social distancing, because individuals can be mobile while still maintaining the recommended 6 foot interpersonal separation to prevent viral transmission. Therefore, future studies should control for mobility in order to identify the direct relationship between social distancing and the SARS-CoV-2 infection rate. Social distancing practices were abruptly relaxed during the mass protests precipitated by the tragic death of George Floyd in Minneapolis, MN, on May 25, 2020. During these protests, thousands of people across the U.S. congregated, potentially increasing their exposure to SARS-CoV-2. The unpredictable nature of the protests creates a natural experimental setting to investigate the causal impact of social distancing on the SARS-CoV-2 infection rate. Two key requirements for the identification of the causal link between social distancing and the spread of SARS-CoV-2 are satisfied in this setting, namely: 1) the existence of a strong theoretical basis supporting the relationship in question and, 2) exogenous variation in the variable of interest, i.e. social distancing. 19 The latter is key to establish causality, because it mitigates concerns that omitted variables correlated with both the protests and the spread of SARS-CoV-2 might be driving our findings. This experimental setting also enables us to circumvent common concerns about endogeneity and self-selection which besets most non-randomized-trial experiments. 20 To assess the causal impact of social distancing on the SARS-CoV-2 infection rate, we implement our empirical analysis in a differences-in-differences (DID) setting, in which the onset of the protests represents the treatment effect and the counties in which protests take place represent the treatment group. 
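The logic of the DID comparison can be illustrated with a minimal sketch. It assumes the simplest two-group, two-period case with no covariates or fixed effects, and the function name `did_estimate` and the toy numbers are hypothetical, not taken from the study. In this simple case the estimate equals the change in the treated group minus the change in the control group.

```python
from statistics import mean

def did_estimate(rows):
    """Two-by-two differences-in-differences estimate.

    rows: iterable of (treated, post, outcome) tuples, where `treated`
    marks protest counties and `post` marks days after the event.
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[(True, True)] - m[(True, False)]
    change_control = m[(False, True)] - m[(False, False)]
    # The DID estimate is the extra change observed in the treated group.
    return change_treated - change_control

# Made-up daily case rates per 100,000 (illustration only).
rows = [
    (False, False, 4.0), (False, False, 6.0),   # control, before
    (False, True, 8.0), (False, True, 9.0),     # control, after
    (True, False, 5.0), (True, False, 7.0),     # treated, before
    (True, True, 11.0), (True, True, 12.0),     # treated, after
]
print(did_estimate(rows))
```

In the toy data the treated group rises by 5·5 and the control group by 3·5, so the sketch attributes the remaining 2·0 to treatment. The regression used in the paper additionally includes controls and fixed effects, which this sketch omits.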
This paper differs from its predecessors in that rather than investigating the effectiveness of social distancing following the imposition of social distancing restrictions, it examines that effectiveness as social distancing practices are abruptly relaxed. Furthermore, this study controls explicitly for social distancing restrictions imposed by states in the period surrounding the protests, as well as for the concurrent increase in social mobility. Establishing the effectiveness of social distancing practices in a statistically reliable way has important public health implications, as states are in the midst of relaxing the social distancing restrictions initially imposed in March 2020.

We source our U.S. COVID-19 data from the Johns Hopkins GitHub repository. This data consists of confirmed cases in each county at the end of every day since the start of the outbreak in late January 2020. We calculate the number of new cases for each county and each day by subtracting the cumulative number of confirmed cases at the end of the previous day from the cumulative number of confirmed cases at the end of the current day. This sampling procedure yields a panel dataset consisting of a total of 474,422 county-days representing 3,142 counties from all fifty states, as well as the District of Columbia (DC), for the period starting on January 22, 2020, and ending on June 20, 2020. We describe our sample in Table I, and in Figure 1 we show the counties in which protests took place according to media reports, along with the size of the first protest taking place within each county.

We obtain our county-level population data and our county-level demographic data from the U.S. Census Bureau. We extract our county-level Gross Domestic Product (GDP) data from the U.S. Bureau of Economic Analysis' (BEA) Regional Economic Accounts database (Table CAGDP1).
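The daily new-case calculation and the per-100,000 scaling used for the outcome variable can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the first day's prior cumulative count is assumed to be zero, and no correction is applied to downward revisions in the cumulative series.

```python
def daily_new_cases(cumulative):
    """Convert end-of-day cumulative counts into daily new cases.

    Each day's new cases equal that day's cumulative count minus the
    previous day's. The count before the first day is assumed to be 0.
    """
    new_cases = []
    previous = 0
    for total in cumulative:
        new_cases.append(total - previous)
        previous = total
    return new_cases

def per_100k(cases, population):
    """Express a case count as a rate per 100,000 population."""
    return cases / population * 100_000

# Toy county with 50,000 residents (made-up numbers).
cumulative = [0, 2, 5, 5, 9]
new = daily_new_cases(cumulative)
rates = [per_100k(n, 50_000) for n in new]
print(new)
print(rates)
```

Expressing the outcome as a rate rather than a level makes counties of very different sizes comparable in the same regression.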
We retrieve county-level data on the prevalence of obesity, diabetes, smoking, and hypertension from the University of Washington's Institute for Health Metrics and Evaluation (IHME). The hypertension and obesity data are for the years 2009 and 2011, respectively, and the diabetes and smoking prevalence data are for 2012. The IHME reports hypertension and obesity data for females and males separately, so we construct a population-weighted average measure for these two covariates fitness centres, gyms, casinos, etc., 4) closure of non-essential businesses, 5) stay-at-home orders for non-essential activities, 6) state curfews on non-essential activities, 7) mandated quarantines for people entering the state, 8) travel restrictions prohibiting residents from leaving the state, nonresidents from entering the state, or residents from travelling across counties within the state, 9) self-isolation requirement for individuals with confirmed COVID-19 infection, and 10) mandatory wearing of masks or other mouth and nose coverings in public places. We construct our social distancing restrictions index by adding the number of restrictions that are in place in a state on any given day, based on the date at which each restriction is enacted, relaxed, or expired. Figure 2 shows the evolution of our index for randomly selected states. We obtain our mobility data from the Descartes Labs. This data consists of mobility indexes calculated at the end of every day and aggregated at the county level. The indexes, which we will refer to as the social mobility index, are based on geolocation reports from smartphones and other mobile devices, and track the movements of individual mobile phone subscribers. The methodology employed to construct these indexes is described in Warren et al., 2020. 21 The mobility index data is available at a daily frequency from March 1, 2020, until the end of our sample period. 
Thus, we lose a total of 122,538 county-day observations from the start of our sample period up until February 29, 2020, in all our regression analyses featuring this data. Figure 3 shows the mobility index for a randomly selected small and large county in the states of New York and Texas.

Finally, we construct a comprehensive list of protests that took place across the U.S. Our starting point is the List of George Floyd protests in the United States assembled by Wikipedia. At the time of writing, the main Wikipedia page cited 134 news articles from national, regional, and local media outlets, and the secondary pages cited hundreds more. From these media citations, we extracted the location and the date at which the protests reportedly took place, as well as the estimated number of individuals involved in each protest. We complement this process with a search on the Dow Jones Factiva database.

We examine the impact of the abrupt relaxation of social distancing practices, which occurred during the U.S. nationwide protests, on the SARS-CoV-2 infection rate with an Ordinary Least Squares (OLS) differences-in-differences (DID) panel regression equation, which is specified as follows:

$$CI_{i,j,t} = \beta_1\,\text{Protest}_i + \beta_2\,\text{PostGF}_{i,j,t} + \beta_3\,(\text{Protest}_i \times \text{PostGF}_{i,j,t}) + \delta^{\top} X_{i,j,t} + \theta^{\top} Y_{j,t} + \gamma_j + \varepsilon_{i,j,t} \qquad (1)$$

where $CI_{i,j,t}$ corresponds to new confirmed SARS-CoV-2 infections in county i from state j on day t, per 100,000 population. $\text{Protest}_i$ is an indicator variable which is set equal to one if a protest took place in county i, and to zero otherwise. $\text{PostGF}_{i,j,t}$ is an indicator variable set equal to zero from the first day of our sample period up until May 25, 2020, the day of George Floyd's tragic death, and to one on every subsequent date. $\text{Protest} \times \text{PostGF}$ is an indicator variable which captures the interaction between $\text{Protest}_i$ and $\text{PostGF}_{i,j,t}$. $X_{i,j,t}$ and $Y_{j,t}$ are vectors of county and state characteristics which we use as control variables, and $\gamma_j$ represents state-level fixed effects to control for time-invariant differences across states in our regressions.
In equation (1), $\beta_1$ captures any differences that may exist between the SARS-CoV-2 infection rate in protest and non-protest counties, and that are unrelated to the protests. We expect this coefficient to be statistically indistinguishable from zero. In our regressions, we cluster the standard errors at the county level to account for any potential cross-sectional dependence in the error terms, $\varepsilon_{i,j,t}$. 22 We perform our statistical analysis with Stata 16 and use Sergio Correia's reghdfe command to estimate equation (1). 23

In our differences-in-differences regressions, we include control variables which may influence the transmission rate of SARS-CoV-2. These control variables account for demographic, health, geographic, and income level variations across counties. For demographic indicators, we include male sex and age (60 years+) since these factors are associated with both an increased risk of testing positive for SARS-CoV-2 and greater illness severity. 24 We also include ethnicity as a demographic variable to account for the increased risk of a positive SARS-CoV-2 test observed among Black and Hispanic individuals. Obesity, diabetes, and hypertension are clinical risk factors included as health covariates in the regressions, as they are associated with an increased risk of severe illness, and a greater risk of mortality from COVID-19. 25 We also include smoking as a clinical risk factor, as some evidence suggests that smoking may be associated with an increased severity of COVID-19. 26 We include population density among our control variables, as higher rates of SARS-CoV-2 infections are observed in more densely populated, urban areas. 15, 25 Consistent with previous research showing that residents from more economically deprived areas are more likely to test positive for SARS-CoV-2, we use real GDP per capita to control for income in our regressions. 25

4 Results

We report results from regression equation (1) in Table III.
In Model (1), the coefficient associated with Protest is equal to 1·22 (95%CI: 0·79-1·65) and is highly significant. This implies that, over the entire sample period, the SARS-CoV-2 infection rate is 1·22 cases per day, per 100,000 population higher in the counties where protests took place, relative to the counties where no protests took place. The coefficient associated with PostGF is positive and highly significant, implying that the SARS-CoV-2 infection rate increases by 3·39 cases per day, per 100,000 population across the U.S. following the onset of the protests. Finally, the coefficient associated with the Protest × PostGF interaction indicates that the infection rate is even greater in the counties in which protests actually took place, following the onset of the protests (4·01; 95%CI: 3·24-4·78). To put this number into perspective, recall that the average number of new infections across all counties is equal to 5 per day, per 100,000 population, in the week preceding the onset of the protests (see Column (2) of Table I). Using this number as a reference point, COVID-19 cases increase by a further 80·2%, on average, in protest counties, relative to non-protest counties, and by 3·39 + 4·01 = 7·40 cases per day, per 100,000 population, or 148% overall.

Models (2)-(6) of Table III provide evidence that is consistent with Model (1). The coefficient associated with Protest loses its statistical significance in Models (2) and (6), suggesting that the higher overall infection rate of protest counties is attributable to cross-county differences in demography. We note that the coefficient associated with PostGF is very stable across the six models, ranging between 3·33 and 3·39. Likewise, the coefficient associated with our Protest × PostGF interaction is quite stable across the six models, ranging between 2·80 in Model (6) and 4·01 in Model (1).
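The percentage figures quoted for Model (1) follow from dividing each coefficient by the pre-protest baseline of 5 cases per day, per 100,000 population. A short sketch of that arithmetic, with a hypothetical helper name, is given below.

```python
# Average new cases per day, per 100,000 population, in the week
# preceding the onset of the protests (reference point from Table I).
BASELINE = 5.0

def pct_of_baseline(coefficient, baseline=BASELINE):
    """Express a regression coefficient as a percentage of the baseline rate."""
    return 100.0 * coefficient / baseline

post_gf = 3.39       # Model (1) coefficient on PostGF
interaction = 4.01   # Model (1) coefficient on Protest x PostGF

print(round(pct_of_baseline(interaction), 1))            # further increase in protest counties
print(round(pct_of_baseline(post_gf + interaction), 1))  # overall increase in protest counties
```

The same division applied to the coefficients of other models yields the corresponding percentages reported for them.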
Although there is no telling which one of these six models provides the best description of the causal impact of relaxing social distancing practices on the spread of SARS-CoV-2, out of conservatism, we will employ our omnibus regression Model (6) as our baseline.

In the period preceding the onset of the protests, the number of new COVID-19 cases began to drop steadily across the country. 3 Accordingly, several states began to unwind their social distancing restrictions in a carefully staged manner. Figure 2 illustrates this trend in Alabama, California, Florida, and New York, for instance. Starting in mid-March, we observe a steady rise in our social distancing restrictions index in these four states, followed by the start of a slow unwind by mid-April. Notably, while social distancing restrictions were being relaxed across the nation, social mobility was on the rise (see Figure 3). Consequently, it may very well be that the concurrent relaxation of social distancing restrictions and the increase in social mobility during the event period has prompted individuals to relax their social distancing practices, and that the effect that we document in Table III is partly contaminated by these contemporaneous changes.

We address this issue in Table IV, where we include our social distancing restrictions and social mobility indexes in our baseline DID regression equation (1) as additional control variables. Model (2) includes the additional control for social distancing restrictions, Model (3) includes the additional control for social mobility, and Model (4) includes both controls. In Model (2), we see a drop from 3·33 to 3·01 in the coefficient associated with PostGF, relative to Model (1), but the coefficient remains highly significant. We observe a similar drop in the coefficient associated with the Protest × PostGF interaction, from 2·80 to 2·29, with no drop in its statistical significance.
In Model (3), controlling for social mobility has a slightly larger impact on the coefficients associated with PostGF and Protest × PostGF. The first coefficient drops from 3·33 to 2·97, while the Protest × PostGF coefficient drops from 2·80 to 1·81. Evidently, social distancing restrictions and social mobility are correlated with one another. For instance, we should expect social mobility to rise when travel restrictions are lifted. When we control for both factors in Model (4), the coefficient associated with PostGF is equal to 3·06 (95%CI: 2·47-3·65), which is highly significant, and the coefficient associated with Protest × PostGF is equal to 1·73 (95%CI: 0·59-2·87), also highly significant.

In summary, after controlling for the reduction in social distancing restrictions and the increase in social mobility that occurred following the onset of the protests, we still observe a significant increase in the number of daily COVID-19 cases across all counties (61·2% relative to the week preceding the event), and a further increase of 1·73 cases (34·6%) in the counties where protests took place. We attribute the latter to the relaxation of social distancing practices during the protests. This interpretation is supported by the abundance of video footage demonstrating that the mass protests brought people into close physical proximity to one another, in contravention of social distancing restrictions that were in place at the time.

In Table V, we report the results of a placebo test assessing whether the causal impact of the protests on the spread of SARS-CoV-2 that we document in Tables III and IV can be attributed to chance. For this purpose, we implement a Monte Carlo simulation exercise centered on our baseline DID panel regression specification (1), i.e. Model (4) of Table IV.
In this simulation, we pick a random date between February 6, 2020, and June 1, 2020, to represent the onset of the protests and we assign counties to the protest group randomly, in proportion to the actual fraction of counties that took part in the protests (18%). We carry out this exercise 10,000 times, and each time, we estimate our model with the simulated protest onset date and protest county pair, and collect the key parameter estimates from the regression, i.e. Protest, PostGF, and Protest × PostGF, along with their county-cluster robust t-statistics. In this simulation, the estimated effect of the random onset date on the SARS-CoV-2 infection rate of the randomly assigned protest counties is negligible, on average. Only 25% of the coefficients from the simulation are positive, and at most 1% of them are statistically significant. Furthermore, the coefficients associated with PostGF and Protest × PostGF from the actual regressions (Panel B), i.e. 3·06 and 1·73, are well above the 95% confidence thresholds inferred from the simulated distribution, i.e. 1·35 and 1·37, respectively. Indeed, our baseline regression coefficients fall at the very top end of the simulated distribution. This implies that we can safely reject the null hypothesis that the causal impact of protests on the SARS-CoV-2 infection rate that we document is due to pure chance, with at least 99% confidence.

In this paper, we employ the natural experimental setting created by the U.S. protests precipitated by George Floyd's tragic death to document the causal impact of social distancing measures on the spread of SARS-CoV-2.
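The placebo procedure described in the Results can be sketched on synthetic data as follows. This is an illustration under simplifying assumptions, not the specification estimated in the paper: it uses a plain two-by-two DID without controls, fixed effects, or clustered standard errors, 500 draws instead of 10,000, and a made-up panel of 50 counties over 40 days. All function and variable names are hypothetical.

```python
import random

def did(panel, treated, onset):
    """Two-by-two DID on a panel {county: [daily outcomes]}."""
    n_days = len(next(iter(panel.values())))
    pre, post = range(0, onset), range(onset, n_days)
    control = [c for c in panel if c not in treated]

    def cell_mean(counties, days):
        values = [panel[c][d] for c in counties for d in days]
        return sum(values) / len(values)

    change_treated = cell_mean(treated, post) - cell_mean(treated, pre)
    change_control = cell_mean(control, post) - cell_mean(control, pre)
    return change_treated - change_control

def placebo_distribution(panel, share_treated, n_draws, rng):
    """Re-estimate the DID under random onset dates and random treated groups."""
    counties = list(panel)
    n_days = len(panel[counties[0]])
    k = max(1, round(share_treated * len(counties)))
    draws = []
    for _ in range(n_draws):
        onset = rng.randrange(1, n_days)       # keeps at least one pre and one post day
        fake_treated = set(rng.sample(counties, k))
        draws.append(did(panel, fake_treated, onset))
    return sorted(draws)

def percentile(sorted_values, q):
    """Simple empirical percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

# Synthetic panel: 50 counties, 40 days, with a true effect of 3.0 in
# 9 "protest" counties (18%) from day 30 onward. All numbers are made up.
rng = random.Random(0)
true_treated = set(range(9))
true_onset = 30
panel = {
    c: [rng.gauss(5.0, 1.0) + (3.0 if c in true_treated and d >= true_onset else 0.0)
        for d in range(40)]
    for c in range(50)
}

actual = did(panel, true_treated, true_onset)
draws = placebo_distribution(panel, share_treated=0.18, n_draws=500, rng=rng)
threshold = percentile(draws, 0.95)
print(actual, threshold, actual > threshold)
```

If the actual estimate lies above the upper tail of the placebo distribution, as it does by construction in this synthetic example, chance alone is an unlikely explanation for it.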
Using a DID analysis, in which the treatment effect corresponds to the onset of the protests and the treatment group corresponds to the counties in which protests reportedly took place, we document a country-wide increase of more than 3·06 cases per day, per 100,000 population, following the onset of the protests, and a further increase of 1·73 cases per day, per 100,000 population, in the counties in which the protests took place. Relative to the average number of new cases per day during the week preceding the onset of the protests, this represents a 61·2% country-wide increase in COVID-19 cases, and a further 34·6% increase in the protest counties.

The increase in the SARS-CoV-2 infection rate that we document in this study cannot be explained by the relaxation of state-imposed social distancing restrictions in the period surrounding the protests, nor by the concurrent increase in social mobility during the protest period, as we control explicitly for these two factors in our regressions. Therefore, it stands that the increase in SARS-CoV-2 infections that we observe following the onset of the protests can be attributed to the relaxation of social distancing practices. The causal impact is also robust both to the inclusion of a host of covariates that are known to influence the SARS-CoV-2 infection rate, and to placebo tests that enable us to rule out the possibility that our findings are attributable to chance.

Our study is not without limitations. In particular, over 70 testing centers across the U.S. were closed following the onset of the protests. Therefore, the increase in SARS-CoV-2 infections that we document likely underestimates the true increase. We are also unable to assess protest participants' vulnerability (e.g. age, underlying health conditions, personal protective wear, etc.), and variability along these dimensions may influence the risk of SARS-CoV-2 infection.
Additionally, we cannot control for the actual degree of physical proximity between participants, which would impact the transmission rate of SARS-CoV-2 during the protests. Moreover, we rely on the accuracy of media reports to identify the counties in which protests took place. Finally, we do not account for the magnitude of the protests in each county; however, expressing the case counts in rates rather than in levels should minimize any potential scale-related effects.

We wish to express our sincere thanks to Descartes Labs for making their mobility data available to us. LG acknowledges the financial support from the Smith School of Business Distinguished Faculty Fellowship at Queen's University.

LG conceived the study, and all authors contributed to the final study design. LG performed the data analysis, created the tables and figures, and wrote the methods and results sections in the initial draft of the manuscript. SG and JL conducted the literature search, and assisted LG with the data collection. All authors contributed substantially to the interpretation of the data, and equally to the write up. All authors wrote and approved the final manuscript submission.

The authors declare no competing interests.

This study uses publicly accessible data exclusively.

Table V. This analysis is centered on our baseline differences-in-differences panel regression specification, i.e. Model (4), presented in Table IV. In this simulation, we estimate our baseline regression model 10,000 times. In each regression, we assign a random date for the start of the protests, ranging between February 6, 2020, and June 1, 2020, and we assign the counties to the group of protest participants randomly, in proportion to the actual fraction of counties that took part in the protests (18%). In Panel A, we report the simulated distribution of the regression model's key parameter estimates, i.e. Protest, PostGF, and Protest × PostGF, along with the distribution of the t-statistics for these coefficients. In Panel B, we report the actual value of the parameter estimates from Model (6) of Table III

1. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet
2. WHO declares COVID-19 a pandemic
3. COVID-19 Map - Johns Hopkins Coronavirus Resource Center
4. An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases
5. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2)
6. Airborne transmission route of covid-19: Why 2 meters/6 feet of inter-personal distance could not be enough
7. The Immediate Effect of COVID-19 Policies on Social Distancing Behavior in the United States
8. Pandemic Politics: Timing State-Level Social Distancing Responses to COVID-19. medRxiv
9. Social Distancing Has Merely Stabilized COVID-19 in the US. medRxiv
10. Strong Social Distancing Measures In The United States Reduced The COVID-19 Growth Rate
11. Social Distancing is Effective at Mitigating COVID-19 Transmission in the United States
12. Evaluating the effectiveness of social distancing interventions against COVID-19. medRxiv
13. Second waves, social distancing, and the spread of COVID-19 across America
14. How will country-based mitigation measures influence the course of the COVID-19 epidemic? The Lancet
15. Population density and basic reproductive number of COVID-19 across United States counties
16. Determinants of Social Distancing and Economic Activity during COVID-19: A Global View
17. No Place Like Home: A Cross-National Assessment of the Efficacy of Social Distancing during the COVID-19 Pandemic (Preprint). JMIR Public Health and Surveillance
18. Time Dynamics of COVID-19
19. Identification is not causality, and vice versa
20. Shock-Based Causal Inference in Corporate Finance and Accounting Research
21. Mobility Changes in Response to COVID-19
22. Estimating standard errors in finance panel data sets: Comparing approaches
23. Stata module to perform linear or instrumental-variable regression absorbing any number of high-dimensional fixed effects. Statistical Software Components
24. Risk factors for mortality in patients with Coronavirus disease 2019 (COVID-19) infection: a systematic review and meta-analysis of observational studies. The Aging Male: the official journal of the International Society for the Study of the Aging Male
25. Risk factors for SARS-CoV-2 among patients in the Oxford Royal College of General Practitioners Research and Surveillance Centre primary care network: a cross-sectional study. The Lancet Infectious Diseases
26. COVID-19 and smoking: A systematic review of the evidence