key: cord-0889336-4hr0prhw
authors: Brinkman, Jeffrey; Mangum, Kyle
title: The Geography of Travel Behavior in the Early Phase of the COVID-19 Pandemic
date: 2021-07-27
journal: J Urban Econ
DOI: 10.1016/j.jue.2021.103384
sha: 156fb27d27535c5ff5c0236044eadc319b5f16cb
doc_id: 889336
cord_uid: 4hr0prhw

We use U.S. county-level location data derived from smartphones to examine travel behavior and its relationship with COVID-19 cases in the early stages of the outbreak. People traveled less overall and notably avoided areas with relatively larger outbreaks. A doubling of new cases in a county led to a 3 to 4 percent decrease in trips to and from that county. Without this change in travel activity, exposure to out-of-county virus cases could have been twice as high at the end of April 2020. Limiting travel-induced exposure was important because such exposure generated new cases locally. We find a one percent increase in case exposure from travel led to a 0.21 percent increase in new cases added within a county. This suggests the outbreak would have spread faster and to a greater degree had travel activity not dropped accordingly. Our findings imply that the scale and geographic network of travel activity and the travel response of individuals are important for understanding the spread of COVID-19 and for policies that seek to control it.

In the early stages of the COVID-19 outbreak, people drastically reduced their travel.

Governments enacted numerous policies including stay-at-home orders, business closures, and limits on mass gatherings to reduce exposure and slow the spread of the virus. The change in travel behavior may reflect the implementation of these policies but also may be attributed to people responding to information about the number of virus cases in their proximity. How did people reduce their travel behavior during the onset of the outbreak? Did they avoid places with larger outbreaks? And how did this response affect exposure and slow the spread of the disease?

In this paper, we use data on the movement of smartphones between U.S. counties to study the change in travel behavior and virus exposure in the early stages of the outbreak. The data provide daily measures of the network of bilateral travel flows between counties. 1 Aggregate patterns in the data confirm that travel between counties declined as COVID-19 cases rose.

People not only traveled less, they avoided locations that had higher numbers of cases. Using gravity regressions of bilateral travel flows on case counts, we show that flows between locations declined in response to increased cases in both the origin and destination. During a period of explosive growth in cases, the results suggest that a doubling of cases in either end of a trip led to roughly a 3.5 percent decline in travel flows. This result holds even when controlling for government orders, suggesting that people adjusted travel behavior based on available information about the geography of the outbreak. A policy implication of this result is the importance of providing timely, accurate information about the geography of an outbreak.

Changes in mobility had large effects on overall virus exposure. We construct a measure of nonlocal (out-of-county) exposure as a sum of flows between counties weighted by the number of confirmed cases in the counties visited. In counterfactual experiments, we find exposure would have been twice as high at the end of April 2020 had people not changed their travel behavior. Furthermore, a decomposition shows that roughly one third of the difference in exposure came from changes in the travel network, as opposed to overall declines in travel.

The reduction in out-of-county exposure matters because such exposure led to increases in new COVID-19 cases. Under our preferred instrumental variable method, we find that a 1 percent increase in the exposure measure led to a 0.21 percent increase in new cases. Therefore, changes in travel patterns likely had significant benefits in reducing the spread of the disease by decreasing exposure.

Finally, we provide a simple model of the spatial dynamics of an outbreak. The model is used to illustrate the importance of the connectedness of locations and the mobility response of individuals to the geographic spread of new cases. The important takeaway from the model is that travel can both speed the spread in the short run and amplify the outbreak over the longer run, while a mobility response mitigates both of these effects. The model does not include important features of an epidemiological model such as recovery rates, deaths, or immunity. However, it demonstrates the concept of how reductions in mobility reduce aggregate infections.

Our findings on travel complement other recent research on declines in local activity during the outbreak. Gupta et al. (2020) find that government policies led to significant declines in mobility, while Engle et al. (2020) find that policy as well as local case levels reduced mobility. There is also evidence that reductions in mobility and government policies mitigated the outbreak, including work by Chinazzi et al. (2020), Courtemanche et al. (2020), Fang et al. (2020), Glaeser et al. (2020), Kraemer et al. (2020), and Wilson (2020). In addition show that migration out of urban areas drove the spread of the outbreak. In contrast to these studies, our research explicitly considers changes to the travel network in addition to declines in mobility levels. We also construct a measure of case exposure in addition to generic trip rates, which we find is an important determinant of case growth.

Other researchers have looked at the role of networks during the pandemic following work by Christakis and Fowler (2010) and Bailey et al. (2018). Kuchler et al. (2020) show that social networks in New York and Lodi, Italy predict the spread of COVID-19, while perform a similar analysis for New York, but also consider differences in mobility among demographic groups. In contrast to these papers, we consider how the observed travel network changed in response to the outbreak, and how this affected the spread of the disease. Monte (2020) also shows how the connectedness of counties shrank during the pandemic, but does not explicitly study the effects on exposure or case growth.

Several papers have used quantitative urban and trade models to study spatial health and economic outcomes during a pandemic. Fajgelbaum et al. (2020) examine optimal commuting restrictions in an epidemiology and trade model calibrated to several cities. Giannone et al. (2020) use cross-state flow data to understand optimal mobility restrictions. Relative to these papers, we focus on empirical identification of mobility responses, exposure, and disease spread using a geographically richer data set.

Lastly, our research connects to other work that seeks to inform policies that restrict mobility. For example, Atalay et al. (2020) and Dingel and Neiman (2020) study the ability of workers to work from home in different occupations and industries. By providing insights into spatial dynamics, our work can also help inform current theoretical research that seeks to understand the tradeoff between health and economic welfare, including work by Farbodi et al. (2020), Guerrieri et al. (2020), and Kaplan et al. (2020).

We briefly introduce the data and describe the key features. More detailed discussion and summary of the data are provided in Appendix A.

There are two main data sets used in our analysis. The first is the record of COVID-19 daily case diagnoses by county as reported by Johns Hopkins University. 2 We combine this with a listing of state-level activity restrictions including stay-at-home orders and closure of "nonessential" businesses. 3

The second data set is an anonymized summary of movement between counties derived from a microdata record of smartphone locations. The measure was constructed and generously made publicly available by Couture et al. (2021) (hereafter, CDGHW) using data provided by vendor PlaceIQ. The individual device locations are collected when an application requests GPS location data. These "pings" are aggregated at the county level. The data set consists of a time-consistent list of 2,018 counties in the U.S. covering 97 percent of the U.S. population. In our analysis, we use data from January 20, 2020 to May 25, 2020. 4 Specifically, the data report: (i) the number of devices registering in a county each day, and (ii) the fraction of those devices that registered in each county (of the 2,018) sometime in the preceding 14 days. The product of (i) and (ii) is a measure of the number of trips between two counties. 5

2 Johns Hopkins University Coronavirus Resource Center. Data were retrieved from https://coronavirus.jhu.edu/.

3 These data were collected by the Institute for Health Metrics and Evaluation at the University of Washington. They were downloaded from https://covid19.healthdata.org/

Trips are best viewed as indicators of connectedness. There is no definitive notion of origin or destination, and the reported statistic is the probability of a binary event, not a transition from a starting place to an ending place. In correspondence with the timing in the data construction procedure, we refer to the current location as the "focal" county and the previous location as the "visit" county. The data construction also induces a moving average quality that we will account for in the analyses that follow. 6

These data depict travel between counties as opposed to within counties. 7 By studying this form of mobility, our focus is on trips that are more likely to create contact between regions, rather than those that create contacts between neighbors (such as visits to a store or restaurant). We will refer to this out-of-county travel from here on as "travel activity" or "mobility."

4 CDGHW continue to update the data, but we chose to focus our analysis on the initial phase of the U.S. outbreak.

5 According to CDGHW, a spatially inactive device is less likely to register its location and appear in the data (and active devices fell substantially as the pandemic took hold), but they caution against strict quantitative interpretation of active device counts. Clearly the variation is relevant for our study of travel across counties.

6 CDGHW chose the lag window of two weeks based on public health guidance of COVID-19 incubation time.

7 We focus on CDGHW's location exposure index ("LEX"), but CDGHW also publish a within-county measure of activity (the device exposure index, "DEX") that we use as an important control in our case growth analysis in Section 5.

To set the stage for analysis, we first show the dynamics of travel activity in the early phase of the pandemic. To construct a consolidated measure of mobility ($m$) for a county $j$ on date $t$, we summarize the out-of-county trips as the product of active devices in the county, $d_{jt}$, and trip probabilities between $j$ and other counties $i$ (within the lag window of two weeks) as reported on day $t$, $\sigma_{ijt}$:

$$m_{jt} = d_{jt} \sum_{i \neq j} \sigma_{ijt}. \tag{1}$$

We then index the series as $\tilde{m}_{jt} = m_{jt} / \bar{m}_{j0}$, where $\bar{m}_{j0}$ is the mean of the county's index in the pre-pandemic period (January 20 to February 23). Additional details about the components of the index can be found in Appendix B. One notable feature is that the first drop in mobility occurred immediately following the initial run-up in cases, and before mobility restrictions were enacted, as it became clear that the U.S. was experiencing community spread and not just isolated cases due to foreign travel. From March 1 to March 14, though no county was yet under a stay-at-home order, mobility dropped by 20 percent as cases rose 500 percent.
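The index construction can be sketched in a few lines of Python. This is a minimal illustration on toy numbers, not the authors' code: the function names, the dictionary layout keyed by (focal, visited) county, and all values are assumptions made for the example.

```python
from statistics import mean

def mobility_series(devices, visit_prob, focal):
    """Daily out-of-county mobility m_jt = d_jt * sum_{i != j} sigma_ijt for one focal county.

    devices:    list over days of {county: active devices}
    visit_prob: list over days of {(focal, visited): visit probability}
    """
    series = []
    for d_t, sigma_t in zip(devices, visit_prob):
        # Sum visit probabilities over all other counties (own-county entry excluded).
        out_prob = sum(p for (j, i), p in sigma_t.items() if j == focal and i != focal)
        series.append(d_t[focal] * out_prob)
    return series

def index_to_baseline(series, n_baseline_days):
    """Divide the series by its mean over the first n_baseline_days (the baseline window)."""
    base = mean(series[:n_baseline_days])
    return [m / base for m in series]

# Toy data (hypothetical): two baseline days, then fewer devices and lower visit rates.
devices = [{"A": 100, "B": 200}, {"A": 100, "B": 200}, {"A": 80, "B": 150}]
visit_prob = [
    {("A", "A"): 1.0, ("A", "B"): 0.10, ("B", "A"): 0.05, ("B", "B"): 1.0},
    {("A", "A"): 1.0, ("A", "B"): 0.10, ("B", "A"): 0.05, ("B", "B"): 1.0},
    {("A", "A"): 1.0, ("A", "B"): 0.05, ("B", "A"): 0.02, ("B", "B"): 1.0},
]
index_A = index_to_baseline(mobility_series(devices, visit_prob, "A"), n_baseline_days=2)
```

In the toy data, county A's index falls on the third day because both the device count and the out-of-county visit probability decline, which are the two components the index combines.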

Travel activity continued its downward trend from that point into April as case counts continued an exponential rise and stay-at-home orders and other mobility restrictions were more widely enacted. Mobility reached a bottom in mid-April at 56 percentage points below its pre-virus average but recovered as the level of new cases tapered in May. These patterns suggest that households may have been responding to information about virus prevalence as well as formal emergency declarations and restrictions. 8

Did travel activity drop in a uniform way, or were some locations affected more than others? We next exploit the full geographic structure of the data to see which sets of visits changed to produce the decline in mobility.

[Figure caption, partially recovered: ... and the number of new cases reported nationally in the preceding two weeks (C). In A, "Close NE Business" means a mandated site closure of businesses deemed "nonessential." Sources: Couture et al. (2021), all panels; healthdata.org (2020), panel A; Johns Hopkins University (2020), panels B and C.]

To study the geography of the change in activity, we use a gravity regression of travel flows on local case counts. Specifically, we regress recorded visits between county pairs on the case counts on each side of a trip. The model is

where $d_{jt}\sigma_{ijt}$ is the number of visits (number of active devices observed in focal county $j$ times the probability of a visit to county $i$ in the lag window before $t$), and $n_{j,t-14:t-1}$, $n_{i,t-14:t-1}$ are new cases reported 9 in the focal and visited counties, respectively, in the preceding two weeks (the travel window). 10 $R_{jt}$ and $R_{it}$ represent mobility restrictions (stay-at-home orders) in the focal and visit counties, respectively. This model recovers, via parameters $\omega_j$ and $\omega_i$, the observed relationship between visits and cases in the locations on each side of a trip (the focal and visited place). This tests whether the pullback in overall mobility shown in the last section is associated with the severity of the outbreak in each location.
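Collecting the terms just defined, one way to write the gravity specification is sketched below. This is a reconstruction under assumptions, not a quotation of equation (2): the logs on both sides are inferred from the elasticity interpretation of $\omega_j$ and $\omega_i$, the treatment of zero case counts is left unspecified, and $\beta_j$, $\beta_i$, and $\varepsilon_{ijt}$ are hypothetical labels for the restriction coefficients and the error term. The terms $\rho_t$ and $\rho_{ij}$ are the time and directed county-pair fixed effects that the specifications include.

```latex
\ln\!\left(d_{jt}\,\sigma_{ijt}\right)
  = \omega_j \ln n_{j,t-14:t-1}
  + \omega_i \ln n_{i,t-14:t-1}
  + \beta_j R_{jt} + \beta_i R_{it}
  + \rho_t + \rho_{ij} + \varepsilon_{ijt}
```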

The specifications include fixed effects for each dimension of the panel: time ($\rho_t$) and directed county pair ($\rho_{ij}$). 11 Therefore, the identifying variation is within a given trip route over time, relative to the national average change in trips. The effect measured is how the visits on a route change with case counts compared with the baseline period, pre-pandemic. Because of the moving-average nature of the visit rate definition, the daily data have a mechanical degree of serial correlation. To reduce this but still account for the fast-moving dynamics of the outbreaks, our main specifications use one observation per week (Wednesdays). Standard errors are clustered by county pair and time.

Table 1 reports the results of the gravity regressions from equation (2). Column 1 shows the coefficients on the two-week new case count in the focal and visit counties. Cases in the focal county reduce trips outside the county ($\omega_j < 0$). A doubling of new cases in the focal county (an increase of about 69 log points) reduces recorded trips by 3.7 percent (≈ 0.69 × 0.054). New cases in the visit county also limit the visit probability ($\omega_i < 0$). That is, conditional on making a trip, devices are less likely to visit counties with relatively higher infection rates. A doubling of new cases in the visit county reduces trips by 3.5 percent (≈ 0.69 × 0.0498).

9 Throughout the paper, we focus on new case diagnoses reported within the travel window, although we have found similar results when measuring total cases and deaths.

10 The timing is such that the cases are being publicly reported within the travel window, so they would be salient to travelers and would produce exposure as defined in Section 4.

11 Pairs are "directed" in that they are potentially asymmetric ($\rho_{ij} \neq \rho_{ji}$).
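The doubling arithmetic can be checked with a short calculation. The sketch below assumes a log-log specification, so that a coefficient is an elasticity; it compares the log-point approximation used in the text with the exact proportional change. The function name is hypothetical, and the coefficients are the magnitudes quoted above, entered with the negative sign the text indicates.

```python
import math

def pct_change_from_doubling(elasticity):
    """Percent change in trips when cases double, given a log-log elasticity.

    Returns (approx, exact): the log-point approximation and the exact change.
    """
    approx = 100 * elasticity * math.log(2)   # elasticity times about 0.69 log points
    exact = 100 * (2 ** elasticity - 1)       # exact proportional change
    return approx, exact

focal_approx, focal_exact = pct_change_from_doubling(-0.054)
visit_approx, visit_exact = pct_change_from_doubling(-0.0498)
```

For elasticities this small the two calculations differ by less than a tenth of a percentage point, so the approximation in the text is adequate.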

Column 2 adds controls for shut-down orders on either side of a trip. The estimates show that stay-at-home orders reduced travel, but conditioning on case counts, the magnitude of these effects was relatively small. Stay-at-home orders in the focal county reduced trips by 1.5 percent, and orders in the visit county reduced trips by 3 percent. Notably, the inclusion of the shut-down orders does not change the marginal effects of new cases.

The distribution of visit frequency is highly skewed and distance-dependent, and perhaps not all trips were affected by cases in the same way. In column 3 we add an interaction of cases in the visit county with pre-pandemic visit probability to allow the case elasticity to depend on the base rate. The negative coefficient indicates that visits declined more (in proportional terms) to places that were visited regularly (as opposed to episodically) prior to cases arising. We add in column 4 an indicator for whether the counties are neighbors, allowing the nearest places to be affected at different rates. The coefficient on neighbors is positive, and the coefficient on the baseline visit rate interaction increases. Together, these results show that the trips declining the most were those to regularly visited counties, but with some persistence in the most proximate places.

Column 5 includes all controls together and the results are consistent with previous specifications. Finally, in column 6, we test the robustness of the model to using biweekly observations instead of weekly to correspond with the two-week lookback in the visit construction. We find the results very much consistent with the weekly data, even in the standard errors.

A natural question is whether these estimated effects can be interpreted causally. Two potential threats to causal interpretation are omitted variables and reverse causality. On omitted variables, a two-way fixed effects design accounts for a host of potential problems. In this design, we are comparing trip frequency within a focal-visit county pair relative to the national average change for all pairs in a given week. Results are driven not by cross-sectional differences in mobility rates but by visits decreasing in proportionally greater amounts along routes with relatively more cases on either side of the trip. Any remaining threat would have to be a local, time-varying omitted factor driving cases and mobility in opposite ways. Likely more relevant, given the evidence in Section 2, is the potential for reverse causality. The extant evidence is that more mobility leads to more cases, while here we find that more cases lead to less mobility, suggesting our coefficients are if anything biased downwards. 12

In summary, the results indicate households were not only traveling less, they were avoiding places with more severe outbreaks. This suggests that households were less exposed to virus cases than if they had continued travel activity as in the days before the pandemic, a topic we treat in more detail in the next section.

Travel between counties likely results in people coming in contact with outbreaks outside their local area. Are these encounters consequential for case growth? To examine this question, we begin by defining nonlocal case exposure and then consider how the pattern of case avoidance shown above affected exposure and altered the trajectory of virus spread.

To summarize the case contacts a county is incurring via out-of-county travel, we construct a measure of nonlocal case exposure as

$$x_{jt} = \sum_{i \neq j} m_{ijt}\, n_{it}, \tag{3}$$

where $n_{it}$ represents new cases in the visit county at time $t$, and $m_{ijt} = \sigma_{ijt} d_{jt}$ is a pairwise mobility measure as in equation (1). The index is a summary of contacts with cases encountered outside the focal county: a case-weighted sum of the travel flows. We refer to this index simply as "exposure."
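As a concrete illustration of the case-weighted sum, the following sketch computes exposure for three toy counties. The function name, the dictionary layout keyed by (focal, visited) pairs, and all numbers are assumptions for the example rather than values from the data.

```python
def nonlocal_exposure(devices, visit_prob, new_cases, focal):
    """Exposure for one focal county and date: sum over i != j of (d_j * sigma_ij) * n_i."""
    total = 0.0
    for (j, i), prob in visit_prob.items():
        if j == focal and i != focal:
            flow = devices[focal] * prob      # pairwise mobility m_ij
            total += flow * new_cases[i]      # weight the flow by cases in the visited county
    return total

# Toy inputs (hypothetical): active devices, visit probabilities, and new cases.
devices = {"A": 100, "B": 200, "C": 50}
visit_prob = {
    ("A", "A"): 1.0, ("A", "B"): 0.10, ("A", "C"): 0.0,
    ("B", "A"): 0.20, ("B", "C"): 0.05,
    ("C", "B"): 0.10,
}
new_cases = {"A": 10, "B": 5, "C": 40}
exposure = {j: nonlocal_exposure(devices, visit_prob, new_cases, j) for j in devices}
```

County B has the highest exposure in this example, partly because it sends many trips and partly because some of those trips go to county C, where cases are high.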

The exposure index could be high for a given county because of some combination of (i) high frequency of travel and (ii) travel to high caseload areas. In Appendix E.1, we decompose the sources of exposure. The general pattern is that more exposed counties have greater contact with high caseload areas and not necessarily higher levels of overall mobility. That is, the severity of the outbreak within the geography of a county's network is far more consequential for case exposure than the level of trips. For example, early in the U.S. outbreak, places connected to the New York metro area exhibited high levels of exposure, irrespective of their overall mobility.

Following from the results in Section 3, we examine the importance of case avoidance for exposure, comparing realized exposure to counterfactual exposure measures that assume travel activity did not change despite the increase in cases. Specifically, we calculate the exposure measure in equation (3), letting the number of cases $n_{it}$ evolve as in the data, but holding mobility constant at pre-pandemic averages (as if $m_{ijt} = \bar{m}_{ij0}$). Table 2 shows the ratio of counterfactual exposures to actual exposures at month-end checkpoints. Column 1 shows the total effect declining mobility had on exposure. Had travel activity continued as usual, the median county would have had exposure to 54 percent more cases at the end of March, 109 percent more cases at the end of April, and 40 percent more cases at the end of May. Thus, at the springtime height of the pandemic, the median county would have been exposed to twice as many cases had mobility not adjusted.

Table 2. NOTES: The table reports the median ratio of counterfactual exposure, projected using pre-pandemic period mobility rates, relative to actual exposure for each listed point in time. Nonlocal case exposure is defined in equation (3). Column 1 is the combined exposure index, and columns 2 through 4 are its components. Column 2 holds fixed total active devices, column 3 holds fixed out-of-county pings per device, and column 4 holds fixed the visit county share in the focal county's travel network. Source: Authors' calculations using data retrieved as described in section 2.

Columns 2 through 4 show decompositions of the effects of the components of mobility on exposure. 13 From equations (1) and (3), there are three components to mobility and therefore three ways the contact intensity could change. First, the number of devices registering as active could change. 14 Second, the total frequency of out-of-county visits could change. Third, for a fixed amount of mobility, the network of visited places could change. 15 We find that each of the three components of the exposure measure contributed to the decrease in exposure. For example, in April, had active device counts continued as usual (column 2), case exposure would have been 24 percent higher. Had total visit frequency continued as usual (column 3), case exposure would have been 34 percent higher. Had the network of visited counties remained as usual (column 4), case exposure would have been 22 percent higher.

13 The decompositions do not add to the total because each is a median of a univariate calculation.

14 Devices could become active with or without registering out-of-county trips.

15 To see this, consider a decomposition of the contact from county $j$ to county $i$ as $m_{ijt} = d_{jt}\sigma_{ijt} = d_{jt} M_{jt} \pi_{ijt}$: the number of devices in $j$ ($d_{jt}$), the total number of trips from $j$ ($M_{jt}$), and the share of those trips from $j$ to $i$ ($\pi_{ijt}$).

The last column is especially interesting because it shows a substantial amount of the change in exposure resulted not just from staying home, but from avoiding places with higher levels of cases when traveling. Notably, even as the level of total mobility edged higher in May, a reduction in exposure resulted from people avoiding counties with high caseloads. 16
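The counterfactual ratios and their three-way decomposition can be illustrated for a single focal county. The sketch below is an assumption-laden toy, not the authors' calculation: the function names and numbers are hypothetical, and it uses the decomposition of mobility into devices, visit frequency, and network shares, holding one component at its baseline value at a time.

```python
def exposure(d, M, pi, cases):
    """Exposure for one focal county: d * M * sum_i pi_i * n_i (pi: shares over visited counties)."""
    return d * M * sum(share * cases[i] for i, share in pi.items())

def counterfactual_ratios(base, actual, cases):
    """Ratio of counterfactual to actual exposure, in total and per component held at baseline."""
    x_actual = exposure(actual["d"], actual["M"], actual["pi"], cases)
    ratios = {"total": exposure(base["d"], base["M"], base["pi"], cases) / x_actual}
    for key in ("d", "M", "pi"):
        mixed = {**actual, key: base[key]}   # actual mobility, one component reset to baseline
        ratios[key] = exposure(mixed["d"], mixed["M"], mixed["pi"], cases) / x_actual
    return ratios

# Toy inputs (hypothetical): mobility falls and the network shifts away from high-case county C.
base = {"d": 100, "M": 0.2, "pi": {"B": 0.5, "C": 0.5}}
actual = {"d": 80, "M": 0.1, "pi": {"B": 0.8, "C": 0.2}}
cases = {"B": 10, "C": 90}
ratios = counterfactual_ratios(base, actual, cases)
```

For a single county the three component ratios multiply to the total ratio. Medians taken across counties need not, which is consistent with the remark that the reported decompositions do not add to the total.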

The remaining question is whether out-of-county exposure causes increases in new cases. To test this, we regress new cases in a county on our index of exposure to out-of-county cases, controlling for lagged cases and other county attributes. The baseline model is

where county $j$ at time $t$ is the unit of analysis, $n$ denotes new cases, $x$ is the out-of-county exposure from equation (3), and the $Z$'s are county-level controls. In this specification, time is measured in weeks.

The $\theta$s are parameters of interest, and principally, the exposure parameter $\theta_2$. $Z_{jt}$ is a set of controls for time-varying county characteristics: mainly, a within-county device exposure index, and in some specifications, the mobility index from equation (1). The within-county device exposure index (also provided by CDGHW) is a measure of the number of other devices a typical device encounters at points of interest (e.g., stores) within the focal county. 17 This is distinct from the out-of-county travel activity in focus in our study, but it is similar to other measures of device activity in the literature. 18 $Z_j$ is a set of controls for fixed county characteristics, such as population size and density, or fixed effects to capture attributes nonparametrically. Specifications include time fixed effects, $\rho_t$, and $\varepsilon$ is the error term.

16 In Appendix E.2 we provide more detail on the evolution of exposure over time.

The outcome variable is the natural logarithm of one plus the number of new cases reported in the last week. The observation level is county by week beginning the first week of March, when community-spread cases began to emerge in the U.S. The exposure index is lagged one week (representing activity one to three weeks prior to the observation date) so as not to overlap with the new case period in the outcome variable, and lagged new cases are measured over the same window as exposure.

Because the model includes time fixed effects, estimates are identified off of spatial-temporal variation relative to the national average. Standard errors are clustered by week and state. 19
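Putting the pieces described above together, one form the baseline specification could take is sketched below. This is a reconstruction under assumptions rather than a quotation of equation (4): the text fixes the outcome as the log of one plus new cases, the one-week lag on exposure and lagged cases, and $\theta_2$ as the exposure parameter, while the log transformations of the regressors and the labels $\theta_1$, $\theta_3$, and $\theta_4$ are assumed here for illustration.

```latex
\ln\!\left(1 + n_{jt}\right)
  = \theta_1 \ln\!\left(1 + n_{j,t-1}\right)
  + \theta_2 \ln x_{j,t-1}
  + \theta_3' Z_{jt} + \theta_4' Z_j
  + \rho_t + \varepsilon_{jt}
```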

Column 1 of Table 3 presents results using ordinary least squares (OLS) regression. The regression shows two features of viral spread. First, and unsurprisingly, lagged cases in the county create new cases. A one-percent rise in past cases is associated with a 0.74 percent rise in new case growth. Second, and more novel, exposure to out-of-county cases increases local new cases. A one percent rise in outside exposure is associated with a 0.11 percent increase in new case growth. Moving from the median to 90th percentile county in terms of network exposure (roughly, from Ohio to New Jersey) would mean a 24 percent increase in new cases added in a given week.

The control variables indicate that larger and denser counties, and places with more within-county device exposure (i.e., fewer people staying at home), have higher case growth.

Next, we consider some alternative explanations to the causal effect of exposure. One hypothesis is that the exposure measure is picking up something about overall mobility that is predictive of new cases. 20 Column 2 adds the county's mobility index directly, and its coefficient is marginally negative. 21 In light of the results of Sections 2 and 3, we attribute this to reverse causality, that is, the pullback in mobility during the periods of higher case growth. The results suggest that any effect of mobility on new cases operates through exposure to outside cases.

There was regional heterogeneity in the severity of the outbreak and a predictable geographic component to the observed travel network, and thus another alternative explanation to exposure is spatial correlation in travel and case outcomes. As one way to address the possibility, 22 in column 3 we split exposure by nearby (neighboring county) and farther-away (non-neighboring county) exposure. (Together, these sum to the county's total exposure.) If all the exposure effect were coming from nearby counties, the exposure result may actually be spurious and due to spatial correlation. Instead, we find significant effects for each source of exposure independently.

Another potential concern is that unobserved local attributes were driving both exposure and local virus spread. In column 4 we add county fixed effects in order to sweep out time-invariant features and focus on exposure variance within a county over time. The coefficient estimate rises relative to column 1, showing that even within a county, periods of greater exposure are followed by periods of greater increase in cases.

These results indicate case exposure through travel creates new cases within a focal county. However, the preceding sections showed that mobility dropped, and especially to and from counties with higher levels of new cases, which reduced the amount of exposure a county would experience. Hence, there is potential for reverse causality that may downward bias the estimated effects. With this concern, we seek an instrument correlated with exposure but not itself generating new cases.

Our strategy is to build a predicted exposure measure based on pre-determined features of a county. Using a gravity regression of trips on a flexible county-pair distance function (detailed in Appendix A), we recover a predicted county-pair visit rate, $\hat{\sigma}_{ij}$, based on proximity of counties. The predicted mobility then enters an expected exposure index, $\hat{x}_{jt} = \sum_i \hat{\sigma}_{ij} n_{it}$, which is used as an instrument for actual exposure. The exclusion restriction is that the distance to other counties' cases affects the focal county's case rate only through potential travel-related exposure. 23

Column 5 reports the results of the IV regression. The coefficient on exposure rises to 0.21, indicating attenuation from reverse causality is indeed present in the OLS specification. Column 6 uses the IV with county fixed effects. Because the predicted visit weight used in constructing the instrument is distance-based and hence invariant for each county pair, the instrument loses power when adding fixed effects. The point estimate with fixed effects rises to 0.41 but is less precise. 24 While these are broadly consistent, the IV model without fixed effects is our preferred specification because it mostly relies on pre-determined variation coming from the way a county's point in space would affect its travel network.
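The direction of the OLS bias and the role of the instrument can be illustrated with synthetic data. The sketch below is a deliberately simplified stand-in, not the paper's estimator: it uses a just-identified IV with no controls or fixed effects, and every number in the data-generating process is an assumption except the 0.21 used as the "true" effect. A local shock raises cases while reducing exposure, mimicking the reverse-causality channel, and a predicted-exposure variable unrelated to the shock serves as the instrument.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def ols_slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)

def iv_slope(y, x, z):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)

random.seed(0)
TRUE_EFFECT = 0.21
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]   # predicted (distance-based) exposure
u = [random.gauss(0, 1) for _ in range(n)]   # local shock: raises cases, reduces travel
x = [0.8 * zi - 0.5 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]    # actual exposure
y = [TRUE_EFFECT * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # new cases (logs)

ols_est = ols_slope(y, x)
iv_est = iv_slope(y, x, z)
```

Because the shock pushes exposure and cases in opposite directions, the OLS slope understates the effect, while the IV slope recovers a value near the one built into the simulation.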

In summary, we find consistent evidence that out-of-county exposure via the travel network affects new case diagnoses. Appendix F provides a number of additional robustness checks.

We have marshaled evidence for three important facts: (i) Travel activity dropped significantly as case counts rose, with a particular avoidance of areas with relatively larger outbreaks; (ii) Such a drop in activity limited exposure to out-of-county virus cases; (iii) Out-of-county exposure affects the rate of new cases added. Together, these facts suggest cases would have been higher had travel activity not dropped in response to cases. Our last exercise is to combine these insights into a single model in order to evaluate conjectures about spread of the virus in alternative travel scenarios.

[Table 3 notes, partially recovered: ... (4); "Expo" is shorthand for out-of-county case exposure. The outcome variable is the natural log of one plus the number of new cases in the county. The observation level is county by week. Standard errors are double clustered by county and week. Source: Authors' calculations using data retrieved as described in section 2.]

We construct the following spatial vector autoregressive model of mobility, case exposure, and case growth. The primary outcome of interest is new cases added, in equation (5a), which is affected by own-county and out-of-county exposure. For the rate of transmission from local and nonlocal case exposure, we take point estimates from our preferred model in Table 3, column 5. Nonlocal case exposure, in (5b), is a function of outside cases and mobility, which is itself affected by the path of cases locally and nonlocally (equation 5c). To calibrate the responsiveness of mobility to cases, we take point estimates of equation (2)

$$\sigma_{ij,t} = \delta_1 n_{j,t} + \delta_2 n_{i,t} + \delta_3 \bar{\sigma}_{ij,0}\, n_{i,t} \tag{5c}$$

We emphasize that this is an autoregressive process and not an epidemiological model. There are no notions of recovery, death, or immunity among the population. (Indeed, our unit of analysis is a spatial area, not a person.) We will note the values the model produces for the sake of exposition, but we intend this exercise to be more illustrative than empirical. 26

Accordingly, to keep the model simple, we illustrate a three-location system. Two locations are calibrated with symmetric mobility rates to represent two closely connected counties and another more distant county. We set the baseline visit rate to 7.5 percent for the closely connected locations and 0.55 percent for the distant one. 27

The model is used for the following thought experiment: if an outbreak of new cases exogenously appears in one of the two connected locations, what happens to the spread of the disease locally and throughout the system? To illustrate the importance of endogenous travel for the rate of disease spread, we simulate the model in three scenarios: (i) a default without mobility (i.e., a purely autoregressive process, (5a) alone), (ii) with mobility but without the feedback effect of cases on travel ((5a) and (5b)), and (iii) with mobility and endogenous feedback ((5a), (5b), and (5c)). Figure 2 plots the impulse responses for an experiment of 10 new cases dropped into the "treated" location.
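To make the three scenarios concrete, the following is a minimal sketch of such a simulation, not the authors' code. Because equations (5a) to (5c) are not reproduced here, the functional forms are assumptions: new cases evolve in log(1 + n) with the local and nonlocal rates of 0.73 and 0.21 quoted in the appendix figure notes, exposure is the visit-rate-weighted sum of outside cases, and visit rates fall with log cases at both ends of a trip with a hypothetical elasticity `DELTA`. The sketch omits intercepts and fixed effects, so it is not expected to reproduce the steady states of Figure 2; it only illustrates how the scenarios are ordered.

```python
import numpy as np

RHO, BETA = 0.73, 0.21   # local and nonlocal rates quoted in the appendix figure notes
DELTA = 0.05             # ASSUMED elasticity of visits to cases (illustrative only)

# Baseline visit rates: locations 0 and 1 are closely connected, location 2 is distant.
SIGMA0 = np.array([
    [0.0,    0.075,  0.0055],
    [0.075,  0.0,    0.0055],
    [0.0055, 0.0055, 0.0],
])


def simulate(scenario, periods=30, shock=10.0):
    """Return an array of shape (periods + 1, 3) with new cases per location.

    scenario is "isolated" (no mobility), "fixed" (mobility without feedback),
    or "responsive" (mobility falls as cases rise). All forms are assumptions.
    """
    x = np.zeros(3)              # state is log(1 + new cases)
    x[0] = np.log1p(shock)       # outbreak dropped into the treated location
    path = [np.expm1(x)]
    for _ in range(periods):
        n = np.expm1(x)
        if scenario == "isolated":
            sigma = np.zeros_like(SIGMA0)
        elif scenario == "fixed":
            sigma = SIGMA0
        elif scenario == "responsive":
            # visits shrink with log cases at origin and destination (assumed form)
            sigma = SIGMA0 * np.exp(-DELTA * (x[:, None] + x[None, :]))
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        exposure = sigma @ n                       # out-of-location case exposure
        x = RHO * x + BETA * np.log1p(exposure)    # autoregressive update
        path.append(np.expm1(x))
    return np.array(path)
```

Calling `simulate("isolated")`, `simulate("fixed")`, and `simulate("responsive")` gives paths that can be compared location by location. Under these assumptions, cumulative cases in the treated location are lowest with no mobility and highest when mobility does not respond to cases.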

The path of new cases added is depicted in the first row of figures. The rate of own-location spread is below one, so that if there were no mobility (and consequently no exposure), the virus would asymptotically die out in the treated location, as illustrated by the "isolated/no mobility" lines.

26 In particular, we suspect that the parameters measuring the mobility response to cases could be downward biased, but the central point of the model can be made by contrasting responsive and unresponsive mobility.

27 We have in mind connected but distinct economic regions. A visit rate of 7.5 percent represents the integrated counties of the Philadelphia and New York CBSAs, for example. See also Table A2.

In scenarios with mobility and exposure, the outbreak jumps locations, which themselves grow through local spread. Exposure then leads to the subsequent reinfection of other places in the system, keeping the disease alive. The impact of exposure then depends on the degree of mobility.

In the treated location, the path of cases shows an initially oscillating pattern, as own-location case contribution slows but exposure to outside cases rises. New case rates then rise to a steady state. In the initially virus-free connected location, nonlocal exposure seeds the local outbreak, and it eventually reaches the same steady-state level as the treated location. The distant location experiences its own outbreak, although its lower connectivity translates into a lower long-run average rate of exposure, so its steady state is lower than that of the two closely integrated counties. When mobility does not decline in response to the outbreak, the rate of new cases added is faster and the steady-state level is higher.

In summary, the model shows why spatial connectedness matters for both the spread and the perpetuation of the virus. Most directly, nonlocal exposure allows the virus to jump from one area to another. Perhaps less obvious, however, is how travel also affects the rate of growth of cases and the steady state level. Connectedness generates higher caseloads as travel compounds local transmission through reinfection across areas.

This paper has used county-level location data from smartphones to document the change in travel activity during the early phase of the COVID-19 pandemic in the U.S. We find that mobility across counties dropped substantially as case counts rose. Relatively larger case counts decreased spatial activity on both sides of a trip: Mobility decreased more in counties with more cases, and the activity that did occur tended to avoid areas with higher caseloads.

Understanding the nature of the change in activity is important because mobility across county lines produces contact with nonlocal cases. Such case exposure contributes to local case growth, which in turn has a feedback effect on nonlocal case growth, creating exposure for other localities in a continuing loop.

Our findings have several implications for policy and practice. First, public information about the spread of the virus is important. We find people responding to such information by restricting their activity in rational ways, both in level and in direction. In a sense, a "healthy fear" of the virus appears to provide motivation for social distancing and similar behavioral interventions, perhaps even more so than government mandates.

Second, because spatial activity never entirely disappears, localities could benefit from coordinated responses and shared information. Connectedness means there are spatial externalities. A policy that suits one area may inadvertently produce a threat to a connected area. Fragmented policy across regions could inhibit society's ability to control the spread of COVID-19.

Wilson, Daniel John. "Weather, social distancing, and the spread of COVID-19." medRxiv (2020).

This appendix reports additional details regarding the underlying case and mobility data sets.

First we present some basic statistics from the daily COVID-19 case data. 28 In the spring of 2020, the early phase of the pandemic, COVID-19 cases were relatively concentrated in the Northeast U.S., and especially the New York City metro area, although there was some presence of cases throughout the country. Table A1 reports summary statistics on case prevalence in terms of per capita cumulative diagnoses and rates of new diagnoses in each month of our period of interest. The distribution of cases is skewed with a long right tail, with many counties having low rates but some having major outbreaks. The ratio of the 99th percentile to the median per capita infection rate is at or above 19 for each month in our sample. The mass of the distribution shifted to the right as the virus percolated throughout the country. The peak of new infections was in early to mid April (although these rates were then surpassed by surges in the summer of 2020 and winter of 2020-2021).

28 We have taken care to adjust the data for changes in reporting format or geography (e.g., counties that report together in some periods and separately in others) and to exclude outliers and values outside the domain of possible outcomes (e.g., negative new cases). Nevertheless, the data are reported subject to some discretion by health care providers and state and local health departments that introduces unavoidable measurement error.

Table A2 reports summary statistics for one of the main objects of interest, the fraction of devices in the focal county present in the visit counties in the previous 14 days, which includes the own-county rate as a "visit." The typical county has a same-county ping rate of 90 percent, meaning 10 percent of devices present today are "new" and were not present in the preceding two weeks. When limiting to the non-reflexive counties, the average ping rate is dramatically lower by nature. There is a clear geographic pattern. The average county pair has a ping rate of 0.3 percent. This rises to 0.5 percent within region, 3 percent within state, 19 percent within commuting zone, and 23 percent among neighboring counties. The visit rates are also highly skewed, with some county pairs showing frequent interaction (50 percent visit rate and above) and mean visit rates larger than the medians. As such, the typical county has about half of its total trips between its top 10 most frequent connection partners and the remaining half through its (thousands of) other less frequent connections.

With the geographic pattern as guidance, in equation 6 we write down a basic gravity model of log visit frequency as a function of distance between county pairs.

Various distance measures, denoted $D_k$, including log miles between county centroids and indicator variables for being in the same discrete geographic areas, form a flexible distance function. The model is run using daily data on the pre-pandemic period (January 20 to February 23, 2020), and we also include day-of-week ("dow") effects to adjust for correlation between county pairs and commuting patterns. Table A3 reports the coefficients $\delta_k$ and $\tau_{dow}$. These reflect the spatial patterns suggested by Table A2: visit probability is strongly declining in distance, with discrete jumps (on average) for counties within the same delineated geographic boundaries of metro areas, states, and census regions.
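The displayed form of equation 6 is not reproduced in this text. As a hedged sketch only, a specification consistent with the description above (log visit frequency, a flexible distance function built from the measures $D_k$, and day-of-week effects) could be written as follows. The pair-level indexing of the distance measures, the error term $\varepsilon_{ij,t}$, and the use of $\sigma_{ij,t}$ for the visit rate are assumptions, not the authors' stated equation.

```latex
\log \sigma_{ij,t} \;=\; \sum_{k} \delta_k \, D_{k,ij} \;+\; \tau_{dow(t)} \;+\; \varepsilon_{ij,t}
```

Under this reading, each $\delta_k$ measures how the log visit rate changes with one distance measure (for example, log miles or a same-state indicator), and $\tau_{dow(t)}$ shifts the level by day of week.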

The projection of the pairwise visit rate from this regression forms our instrument for expected case exposure detailed in Section 5.

In addition to the out-of-county mobility index measuring travel behavior, CDHGW use the same underlying smartphone location data to construct a measure of within-county device activity. The device exposure index, or the "DEX," measures, for a smartphone residing in a given county, how many distinct devices also visited any of the commercial venues that the device visited on a given day. We use their county-level average DEX with adjustment for active devices (see Couture et al. (2021) for details). Table A4 presents summary statistics in the pre-pandemic period for counties by size of commuting zone (CZ). Larger counties tend to have higher device exposures.

This appendix reports on the dynamics of mobility indices derived from the smartphone data. over time, each of these is indexed within county so that the period of January 20 to 

The national series mask a fair amount of regional heterogeneity. States varied in the timing and intensity of their travel restrictions, and case diagnoses varied substantially across the U.S. Did the mobility of households in more affected areas respond more strongly?

To address this question, we leverage the spatial variation in the mobility index and case counts by county in addition to state-level restrictions on mobility in the following model:

The left-hand side is the indexed mobility rate, $c_{j,t-13:t}$ denotes new case diagnoses in the home county in the preceding two weeks, 29 and the $R_q$ terms denote type $q$ government restrictions on activity. These take the form of indicator variables, $I()$, for whether the restriction is in place at time $t$.
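The regression equation itself does not appear in this text. Purely as an illustration of the variable definitions above, one form it could take is sketched below. The symbol $m_{j,t}$ for the indexed mobility rate, the coefficients $\beta$ and $\gamma_q$, and the error term are assumed names, and whether cases enter in levels or logs, or whether fixed effects are included, is not stated here.

```latex
m_{j,t} \;=\; \beta \, c_{j,t-13:t} \;+\; \sum_{q} \gamma_q \, I\!\left(R_{q,j,t}\right) \;+\; \varepsilon_{j,t}
```

In such a form, $\beta$ would capture the covariance of mobility with recent local cases, and each $\gamma_q$ the shift in mobility while restriction type $q$ is in place.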

The gravity regression in Section 3 measures more precisely the effect of two-sided case prevalence on trip rates between county pairs, but this simple regression provides a more descriptive, atheoretical approach to compare the travel activity in hard-hit (or heavily regulated) counties to others. Our main objective for this analysis is simply to measure covariances in the data to explore whether cases and shutdown orders independently correspond to changes in mobility. We do not intend to make causal claims here, although most forms of endogeneity threats seem to work against detecting an effect: Most studies find mobility of various forms to cause cases, and we find cases to reduce mobility, the opposite of what reverse causality or simultaneity would suggest.

Table C1 reports coefficients from this regression of county-level mobility rates on local cases and restrictions. We use a biweekly frequency to correspond to the lookback period in the data and avoid overlapping observations, although this limits somewhat the power of the regression to detect the effect of cases (which can vary week to week). We have found similar results using higher frequency data.

Around the same time cases were growing, state and local governments enacted restrictions designed to limit mobility of residents and suppress the spread of the virus. In column 3, we include indicator variables for the presence of the two most common of these measures, closure of nonessential businesses and stay-at-home orders, in order to measure their effects on cross-county activity. 31 Each independently depressed activity to a significant degree, although their inclusion scarcely affects the estimate of local cases, suggesting people were still reacting to public information about the virus. In column 4, we remove the local new case variable to make the comparison. Failure to include public information about local cases causes a larger estimate of the marginal effect of government activity restrictions.

Many of the worst outbreaks were in large counties in major metro areas, so the regression may be picking up the drop in mobility from these hard-hit large areas. Column 5 scales the variable of interest, using new cases per capita, and the results hold. Scaling by population also sets up a way to compare across geographic areas.

In column 6, we simultaneously include new cases in the local county, the state, and the larger region (census division). The results show significant effects at each spatial scale, but attenuating with distance, suggesting people are attuned to general conditions but most responsive to outbreaks in their local areas.

30 Recall that each county has been indexed to a pre-period average of 100 so that average level differences in mobility between counties will not affect the covariances we estimate.

31 Many of these restrictions were designed to limit within-county activity as much as between-county activity, which we do not study in this paper. These effects may show up in our measures to the extent that they depressed the number of active devices in a county, $d_{jt}$.

NOTES: The outcome variable is the county-level index of mobility as defined in equation 1, and indexed by the pre-pandemic average for each county. Units are percentage points. Standard errors are clustered by county and time of observation. Each regression contains 2,018 counties and 9 weeks for a total of 18,153 observations. Source: Authors' calculations using data retrieved as described in section 2.

D Appendix: Cases predict the decline in mobility

The next exercise is meant to show whether the effect of cases on trip rates that we estimate in the gravity regression of equation (2) can predict the aggregate declines described in Sections 2, B, and C. Figure D1 plots the median mobility index from predicted values from the regression of equation (2) (column 3 from Table 1) alongside the actual mobility index from Figures 1, B1, and B2. There are two versions of the projection: one with the time dummies factored in and one with them excluded so that the projection relies only on case counts. 32 In either version, the projection does a remarkably good job of predicting the fall in mobility, indicating that case avoidance was critically important in explaining the drop in spatial activity. The version without time dummies fails to predict two blips in activity (just before the emergency declaration and the depths of the trough in mid-April), but more notably, it fails to predict the rise in mobility in mid to late May, as cases were still fairly prevalent. These turning points in the pandemic seem to deviate somewhat from the average pattern during the escalation phase of case growth in late March and April 2020. A version of the projection using the stay-at-home orders produces similar results and only slightly better predicts the recovery in activity in May, as some restrictions were relaxed.

This appendix provides greater detail about nonlocal (out-of-county) case exposure as defined in equation (3). Table E1 presents summary statistics of this exposure measure for the counties in our mobility sample at each checkpoint used in the case summary in Table A1. This measure averages values in the hundreds and, like the cases themselves, is highly skewed to the right tail. Exposure rose over time even as mobility fell because cases became more widespread.

NOTES: ... Table 1, column 3 estimate of (2). One projection uses the time dummies in its forecasted values, and the other uses the coefficient estimates from the same model but omits the time dummies. Source: Authors' calculations using data retrieved as described in section 2.

Exposure measured in this way could be high for a given county because of some combination of (i) high frequency of travel and (ii) travel to high caseload areas. In Table E2 , we present some decompositions to illustrate the source of higher exposures. The general pattern is that more exposed counties tend to have greater contact with extremely high caseload areas and not necessarily higher levels of overall mobility.

The upper panel of Table E2 shows statistics for the whole sample and for a split between the 50 highest exposure areas and the lesser exposed areas. The highest exposure areas actually had relatively less total mobility on average, making outside county trips at a rate of 287 percent compared to 317 percent for lesser exposed counties. (Recall that the mobility measure is a sum of binary event probabilities and can therefore sum to more than 100.)

The differences appear when splitting by destination. Columns 4, 5, 6 and 7 show statistics for contact with destinations among the highest two percent of cases per capita in the U.S. The highest exposure areas visited these drastically more often than the typical county, encountering a high case area at a trip rate of 160 percent (or 58 percent share of all out-of-county visits) versus 14 percent (4 percent share) for lesser exposed counties. These high case areas comprise nearly all of a highly exposed county's exposures (and still a significant fraction of the lesser exposed places too).

As a benchmark, columns 8 to 11 show the travel to the counties' top 50 most visited destinations. The more exposed counties did not travel to their usual partners at any higher frequency than the lesser exposed, but they accumulated much more exposure because these partners had larger outbreaks. Thus, exposure is largely a function of outbreak size in a county's usual network.

To illustrate these patterns, the bottom panel of the 

Next we examine the importance of changes in travel behavior on exposure. To do so, we compare actual exposure to counterfactual exposure measures that assume travel behavior did not change despite the increase in cases. Figure E1 plots the median actual exposure, measured in counts of cases, and the median exposure that would have obtained had each county continued with business as usual. (To obtain these series, each county's exposure was scaled by its pre-pandemic mobility in order to allow cross-sectional comparisons.) Clearly, the pullback in mobility significantly altered the degree of outside exposure to virus cases: The median county would have been exposed to twice as many cases if travel behavior had not adjusted.

Figure E1: Actual Exposure Compared With "Business As Usual" Exposure. NOTES: The figure plots the median exposure index across the sample of counties in the smartphone data. The unit of measure is number of cases per unit of mobility in the pre-pandemic period; its scale is comparable across counties.
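The comparison between actual and "business as usual" exposure can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: exposure is taken to be the visit-rate-weighted sum of cases in other counties, the counterfactual simply holds visit rates at their pre-pandemic values, and the three-county matrices and case counts are made-up numbers. The function names are hypothetical.

```python
import numpy as np


def out_of_county_exposure(visit_rates, cases):
    """Visit-rate-weighted sum of cases in other counties (assumed form)."""
    sigma = np.array(visit_rates, dtype=float)   # copy; caller's matrix is untouched
    np.fill_diagonal(sigma, 0.0)                 # exclude own-county "visits"
    return sigma @ np.asarray(cases, dtype=float)


def business_as_usual_ratio(pre_rates, current_rates, cases):
    """Counterfactual exposure (pre-pandemic travel) relative to actual exposure."""
    actual = out_of_county_exposure(current_rates, cases)
    counterfactual = out_of_county_exposure(pre_rates, cases)
    return counterfactual / actual


# Hypothetical example: cross-county visit rates fall by half during the outbreak.
pre = np.array([
    [0.90, 0.08, 0.01],
    [0.08, 0.90, 0.01],
    [0.01, 0.01, 0.90],
])
now = pre.copy()
off_diagonal = ~np.eye(3, dtype=bool)
now[off_diagonal] *= 0.5
cases = np.array([1000.0, 50.0, 5.0])

ratio = business_as_usual_ratio(pre, now, cases)
```

In this toy setup every county halves its out-of-county visits uniformly, so the ratio is the same everywhere. With real data the ratio would also reflect which destinations were avoided, which is the distinction the decomposition in Table 2 draws.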

In Table 2 in the main text, we decompose the differences in exposure to understand the importance of declines in overall travel activity versus the avoidance of highly affected locations.

This appendix section provides additional robustness checks for the new cases model reported in Section 5, Table 3.

First, Table F1 presents robustness checks that focus on the spatial nature of the outbreaks. Table 3 found that higher out-of-county case exposure led to higher levels of new cases added locally. However, there was regional heterogeneity in the severity of the outbreak and a predictable geographic component to the observed travel network, and thus an alternative explanation for the effect of exposure is spatial correlation in travel and case outcomes. In other words, the exposure measure could be simply picking up overlap in the regional components of each variable. Table F1 provides another version of this check, splitting exposure into in-state and out-of-state in order to correspond to the level of governance at which most emergency policy was handed down. We find results consistent with those in Table 3.

As another way to address the issue of spatial correlation, in columns 2 to 5, we add to the OLS specification increasingly fine nonparametric controls, interacting the week of observation with geographic areas from region to commuting zone. To some extent, this reduces variation we actually want to capture: that Pennsylvania, for example, is more exposed to the New York metro area than, say, North Carolina. Hence, unsurprisingly, the coefficients shrink as the spatial-temporal nonparametric controls become narrower. Even so, we find statistically and economically significant effects of exposure. Column 5 uses the finest controls of commuting zone by week, 33 and the effect is mechanically smaller, but still significant. While the controls soak up too much relevant variation to be our preferred model, these specifications indicate that variation in exposure even at the most local level is meaningfully predictive of future new cases added.

Our baseline exposure measure followed the timing of the mobility data construction, weighting trips out from the focal county by the cases encountered in the visited county. In columns 6 and 7 we experiment with a reverse method of measuring exposure, the "exposure in" index. This flips from our baseline, weighting visits by cases in the visited county and applying to the focal county (A → B), to instead weighting visits by cases from the focal counties into the visited counties (B → A).

Our preferred measure follows the time structure of the data (devices today observed in visited counties 1 to 13 days prior), but as we described in Section 2, the mobility data have no definitive notion of the direction of a trip, so these measures turn out to be highly correlated, and the results look similar to the exposure-out metric we use in the baseline. They are not perfectly correlated, however, due to the timing difference, and when entered simultaneously, both show an effect on new cases added.

... Table 3 that focus on the instrumental variable specifications and the mobility-alone index.

Column 1 presents an alternative instrument for exposure, a projected exposure using realized pre-pandemic mobility weights. The coefficient arrives between the OLS and our preferred IV estimates (columns 1 and 2 in Table 3), suggesting it reduces but does not eliminate the omitted variable bias present in the OLS specifications.

Columns 2 to 5 examine the mobility index separately from the exposure measure. Columns 2, 3 and 4 use an instrument for mobility derived from weather conditions, a predicted mobility from a regression model of total travel activity based on focal county weather and a travel-weighted measure of visit county weather. 34 Column 2 enters the mobility index alone to compare with column 2 of Table 3. The coefficient is again negative, suggesting that the issue is reverse causality rather than a sign that is negative only conditional on exposure being included, and the weather-based instruments do not affect this result. Neither does the instrument affect the result when exposure is included again in column 3. Column 4 instruments for exposure with predicted exposure and for mobility with weather, and the results are much the same. Finally, column 5 uses shutdown orders as an instrument for mobility. These cause the mobility coefficient to turn positive, but it is small and not statistically different from zero. We note that shutdown orders (enacted when cases were rising and deemed a threat) probably fail the exclusion restriction.

Table F3 reports robustness checks on the functional form of the exposure-to-new cases model in Table 3. We alter the baseline exposure measure of (3), using

Varying the exponents $\alpha_1$ and $\alpha_2$ changes the degree of variance attributed to differences in mobility vis-a-vis contact with out-of-county cases. Our baseline sets $\alpha_1 = \alpha_2 = 1$. In Table F3, we use permutations of the exponents in increments of 1/2 up to 2. The baseline regression, appearing in Table 3, is highlighted by a box in column 2.
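As a rough illustration of how such a grid of exposure measures could be built, the sketch below assumes the generalized index raises the visit rate to the power $\alpha_1$ and the destination case count to the power $\alpha_2$ before summing over other counties. That functional form is an assumption, since the altered version of equation (3) is not shown in this text; the function names and the two-county numbers are hypothetical.

```python
import numpy as np

EXPONENTS = [0.5, 1.0, 1.5, 2.0]   # increments of 1/2 up to 2, as described in the text


def exposure_index(visit_rates, cases, alpha1=1.0, alpha2=1.0):
    """Sum over other counties of visit_rate**alpha1 * cases**alpha2 (assumed form)."""
    sigma = np.array(visit_rates, dtype=float)   # copy before zeroing the diagonal
    np.fill_diagonal(sigma, 0.0)                 # own county is not nonlocal exposure
    n = np.asarray(cases, dtype=float)
    return (sigma ** alpha1) @ (n ** alpha2)


def exposure_grid(visit_rates, cases):
    """Exposure index for every (alpha1, alpha2) pair in the grid."""
    return {
        (a1, a2): exposure_index(visit_rates, cases, a1, a2)
        for a1 in EXPONENTS
        for a2 in EXPONENTS
    }


# Hypothetical two-county example.
rates = [[0.0, 0.25], [0.04, 0.0]]
cases = [100.0, 4.0]
grid = exposure_grid(rates, cases)
baseline = grid[(1.0, 1.0)]   # alpha1 = alpha2 = 1
```

Exponents above one on the visit rate put more weight on heavily traveled links, while exponents above one on cases put more weight on contact with large outbreaks, which is the trade-off the robustness table explores.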

Coefficients decline mechanically as the mean of the index increases, but otherwise results are similar across specifications. The baseline model with unitary weights on the mobility components provides a slightly better fit. Over-weighting the mobility ... what matters for variance in exposure is contact with high caseload areas (see Table E2).

... counties lack weather observations.

NOTES: ... (4). The outcome variable is the log number of new cases in the county. The observation level is county by week. The table runs through different calibrations for the exponents in the exposure measure, equation 3, as indicated by column and row of the table. The means of the exposure metric are reported in each specification block in brackets. The boxed specification is the preferred model in Table 3. Source: Authors' calculations using data retrieved as described in section 2.

Panel A illustrates the impact of the exposure transmission rate, the rate at which nonlocal exposure converts to local new cases. There are low and high scenarios: low transmission of 0.11 (from column 1 of Table 3), and high transmission of 0.21 (from column 5 of Table 3). We use the higher transmission rate in our baseline because it derives from our preferred instrumental variable specification. In panel A1, the model assumes mobility is unresponsive to cases (i.e., equation (5c) is set to 0), while A2 allows mobility to be reduced by the observation of cases. The magnitude of the transmission rate is consequential in determining the size of the steady-state level of new cases, especially when mobility is not responding to cases (and hence exposure is higher). An increase from 0.11 to 0.21 transmission rate leads new cases to be 70 percent higher when mobility does not react to cases and 23 percent higher when mobility does respond.

Panel B illustrates the impact of the baseline mobility rate, the degree of connectedness between locations. There are low and high scenarios: low mobility of a 7.5 percent visit rate, the calibration featured in Section 6, representative of two highly connected but spatially distinct economic areas (e.g., New York and Philadelphia metro areas), and high mobility of a 10 percent visit rate, typical of same-commuting zone counties. In the high mobility calibrations, the distant county visit rate is also increased by one third from 0.55 percent to 0.72 percent. In panel B1, the model assumes mobility is unresponsive to cases, while B2 allows mobility to be reduced by the observation of cases. The magnitude of the mobility rate is greatly important in determining the size of the steady-state level of new cases, especially when mobility is not responding to cases. An increase in the baseline mobility of just one third leads new cases to double when mobility does not react to cases. When mobility does respond, the increase in baseline mobility causes cases to rise 32 percent.

In Panel A, each figure has a line from each of two scenarios quantifying transmission rate from nonlocal exposure as measured by Table 3:
- A1. Low transmission, column 1, nonlocal transmission rate of 0.11, local transmission of 0.74;
- A2. High transmission, column 2, nonlocal transmission rate of 0.21, local transmission of 0.73 (used in Figure 2).

In Panel B, each figure has a line from each of two scenarios specifying the initial mobility rate:
- B1. Low connectedness of 7.5% visit rate between connected and 0.55% visit rate between distant locations (used in Figure 2);
- B2. High connectedness of 10% visit rate between connected and 0.72% visit rate between distant locations.

Source: Authors' calculations using estimates from Tables 1 and 3.

Reopening the Economy: What Are the Risks, and What Have States Done?

Social connectedness: Measurement, determinants, and effects

The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak

Social network sensors for early detection of contagious outbreaks

Strong Social Distancing Measures In The United States Reduced The COVID-19 Growth Rate: Study evaluates the impact of social distancing measures on the growth rate of confirmed COVID-19 cases across the United States

JUE Insight: Measuring movement and social contact with smartphone data: a real-time application to COVID-19

Disparities in mobility responses to covid-19

Urban flight seeded the covid-19 pandemic across the united states

How many jobs can be done at home? No. w26948

Staying at home: mobility effects of covid-19

Optimal lockdown in a commuting network

Human mobility restrictions and the spread of the novel coronavirus (2019-ncov) in china

Internal and external effects of social distancing in a pandemic

A cell phone data driven time use analysis of the COVID-19 epidemic

Pandemic in an inter-regional model: staggered restart

JUE Insight: How Much does COVID-19 Increase with Mobility? Evidence from New York and Four Other U.S. Cities

Macroeconomic Implications of COVID-19: Can Negative Supply Shocks Cause Demand Shortages?

Tracking public and private response to the covid-19 epidemic: Evidence from state and local government actions. No. w27027

The great lockdown and the big stimulus: Tracing the pandemic possibility frontier for the US

The effect of human mobility and control measures on the COVID-19 epidemic in China

All authors contributed to all aspects of the research.