key: cord-0453567-ovub3kah
authors: Callaway, Brantly; Li, Tong
title: Understanding the Effects of Tennessee's Open Covid-19 Testing Policy: Bounding Policy Effects with Nonrandomly Missing Data
date: 2020-05-19
sha: 18622e0c003400892b33c7f4b39e08af553ee9a6
doc_id: 453567
cord_uid: ovub3kah

Increased testing for Covid-19 is seen as one of the most important steps to be implemented to re-open the economy. The current paper considers Tennessee's "open-testing" policy, where the state substantially increased the number of available tests while opening testing to all individuals that wanted a test; this is unlike most other states, which have required individuals to show specific symptoms in order to be tested. In the current paper, we examine whether Tennessee's policy has affected (i) the number of confirmed Covid-19 cases, (ii) the number of trips to work, and (iii) the (unobserved) number of actual Covid-19 cases. To study these effects, we employ standard identifying assumptions in the policy evaluation literature, but this strategy is greatly complicated by the non-random nature of the tests. We construct bounds on the policy effects of interest. We find suggestive evidence that Tennessee's open-testing policy has led to a reduction in the number of confirmed and total cases as well as reduced travel in counties that have experienced relatively large increases in confirmed cases.

Widespread testing for Covid-19 is seen as one of the requirements for re-opening the economy. 1 For example, in California, "the ability to monitor and protect our communities through testing, contact tracing, isolating, and supporting those who are positive or exposed" is the first of six indicators for when the state would potentially modify its stay-at-home order. 2 The idea is that mass testing would include individuals with mild symptoms or even no symptoms.
Individuals that test positive could be isolated for some period of time, and their contacts could be traced, notified of their potential exposure, and potentially also tested. In principle, this extra testing would result in detecting and mitigating localized outbreaks much earlier than would otherwise be possible.

In the current paper, we examine a policy implemented by Tennessee that expanded its testing capacity and made testing available to anyone who wanted to take the test. This policy was substantially different from those of neighboring states, all of which have had fewer tests per capita along with substantial eligibility requirements for taking the test. The goal of the paper is to examine whether or not Tennessee's open-testing policy has affected the number of Covid-19 cases in the state as well as other economic outcomes (in particular, we focus on the number of trips to work).

Our identification strategy is to take counties in Tennessee and compare them to "similar" counties in nearby states. In particular, we compare outcomes in Tennessee and Alabama among counties that had similar populations, had similar numbers of Covid-19 cases, and had conducted a similar number of tests prior to Tennessee's open-testing policy. Under standard assumptions, differences in outcomes experienced by counties in Tennessee relative to outcomes in counties in Alabama with similar characteristics can be attributed to the policy differences between the two states. This would be a relatively straightforward exercise if testing were administered randomly in each state, but that is not the case. In particular, individuals who are more likely to have Covid-19 appear to be much more likely to take the test. For most states, including Alabama, this is by construction: showing certain symptoms is a requirement to be able to be tested for Covid-19.
Even in Tennessee, where the policy allows any individual that would like to get a test to take it for free, it still seems likely to be the case that there is some selection into taking the test. In practice, this creates two challenges. First, it is not possible to compare counties that had the same number of total Covid-19 cases before the open-testing policy was implemented because the total number of Covid-19 cases is not observed. Second, and for the same reason, it is challenging to evaluate the effect of the policy on the total number of Covid-19 cases. 3 We propose several strategies for dealing with nonrandomly missing testing data (discussed at length below). In particular, we make relatively weak assumptions on the fraction of untested individuals that have had Covid-19 that lead to bounds on the policy effects of interest. For observed outcomes such as confirmed cases and trips to work (which are relatively simpler due to only suffering from the first issue mentioned above), we find suggestive (though not conclusive) evidence that the open-testing policy (i) decreased the number of confirmed cases, and (ii) decreased the amount of travel to work in counties that experienced relatively large increases in their number of confirmed cases over time. For the per capita number of total Covid-19 cases (which is the relatively harder case due to suffering from both issues mentioned above), we obtain non-trivial bounds on the effect of the policy suggesting that the policy has reduced the number of total cases in Tennessee relative to what they would have been in the absence of the policy. 
The most important driver of these results is that it simultaneously appears that the open-testing policy has increased the number of tests while decreasing the number of confirmed cases -together, these form a strong piece of evidence that the policy has reduced total cases even though total cases are unobserved and it is hard to come up with reasonable assumptions that lead to point identification of the effect of the policy on total cases. Taken together, these results provide tentative evidence that Tennessee's open-testing policy has had a positive effect along several dimensions. Our paper's main contribution is two-fold. First, it has been a widely held view that more testing is important to contain the outbreak, and to reopen the economy. 4 It is thus of profound importance to quantify the effects of adopting the open-testing policy. To the best of our knowledge, our paper is the first one to evaluate the effects of an open Covid-19 testing policy where we study the effect of the policy in Tennessee which is the first state to offer open-testing. Second, we make a new methodological contribution in providing a method to bound the policy effects with nonrandomly missing data, which is a serious concern in our case, and cannot be dealt with employing the standard methods used in the treatment effects literature. Building on Manski and Molinari (2020), who study bounding the Covid-19 infection rate under weak assumptions, we provide a novel method to bound the policy effects of the open-testing policy. Our method can also be applied to other policy evaluation applications with nonrandomly missing data. On Wednesday, April 15, Tennessee's Republican governor Bill Lee announced free testing in the state for anyone who wanted a test. 5 That Saturday, April 18, more than 6,500 Tennessee residents were tested at 20 different testing locations across the state. 
Unlike almost all other states at that time, obtaining a test did not require an individual to be showing symptoms or to be in a high risk group. These tests were also available on the weekends of April 25 and May 2. The open-testing policy was only temporary, though, as the state is now adjusting the policy to be more targeted to vulnerable groups. 6

Compared to its six neighboring states, Tennessee was already a high testing state when the new policy was implemented (see Figure 1). 7 Out of these seven states, Tennessee was roughly tied with Mississippi for the most tests per capita when the open-testing policy was implemented. Between April 17 and May 6, the number of tests per capita increased in Tennessee by over 2 percentage points, more than in any of its neighboring states. And by May 6, Tennessee had conducted more tests per capita than any of its neighboring states (49% more than Alabama, 70% more than Arkansas, 73% more than Georgia, 144% more than Kentucky, 23% more than Mississippi, and 112% more than North Carolina).

4 For example, in his tweet on May 14, 2020, U.S. Senator Lamar Alexander (R-Tenn) says that more testing is key to ensuring people are safe as they go back to work and go back to school; https://twitter.com/SenAlexander/status/1260957179210653697.

On the other hand, Tennessee was closer to the middle in terms of number of confirmed cases. On April 17, Tennessee was essentially tied with Alabama for third out of seven in terms of per capita confirmed cases (see Figure 2), behind Georgia and Mississippi. By May 6, Tennessee had moved somewhat ahead of Alabama in terms of per capita number of confirmed cases but was still behind both Georgia and Mississippi. It is also important to remember that the number of confirmed cases depends on the number of tests, especially in cases like Covid-19 where the number of tests is relatively low and there may be a large number of asymptomatic cases or cases with relatively mild symptoms.
Thus, for example, the increase in confirmed cases in Tennessee relative to Alabama could be explained by an increase in actual cases in Tennessee relative to Alabama or just by a mechanical increase in the number of confirmed cases arising from more extensive testing.

Our analysis uses two main datasets. The first dataset is state-level and includes data from Tennessee and each of its six bordering states: Alabama, Arkansas, Georgia, Kentucky, Mississippi, and North Carolina. The state-level dataset contains information on the total number of Covid-19 cases by state over time and the total number of tests by state over time. This data comes from the Covid Tracking Project (https://covidtracking.com/). This state-level data is also merged with state population data from the Census Bureau.

Figure 2. Notes: Per capita confirmed Covid-19 cases for Tennessee and its six neighboring states over time. Sources: Covid Tracking Project (https://covidtracking.com/) and Census Bureau.

The second dataset consists of county-level data from Tennessee and Alabama. The Covid Tracking Project provides case counts by county over time. But many of our main results require county-level testing data. Tennessee provides historical county-level data at its Department of Health website (https://www.tn.gov/content/tn/health/cedep/ncov/data.html). Of its bordering states, the only state that currently provides county-level testing data is Alabama, but it only provides current (not historical) county-level testing data (https://dph1.adph.state.al.us/covid-19/). In order to recover historical county-level testing data, we used the Internet Archive's Wayback Machine (web.archive.org). The earliest available county-level data for Alabama using this approach is from April 17. This is after Tennessee's open-testing policy was announced but before individuals in Tennessee could actually get a test; there is also typically around a 72 hour delay in the test results being available.
Together these suggest that we can treat April 17 as being "pre-treatment" for both Alabama and Tennessee. We merge the county-level testing and cases data with (i) data from the Census Bureau on county-level population, and (ii) data from Google's Covid-19 Community Mobility Reports (https://www.google.com/covid19/mobility/). These are aggregated cell phone data that Google has published to help researchers studying Covid-19. We focus primarily on county-level trips to work and how this variable evolves over time. It is reported as a percentage change relative to pre-Covid trips to work. Finally, it is worth mentioning that Trousdale County and Bledsoe County experienced large Covid-19 outbreaks in prisons. 8 In some of our descriptive analysis, we keep these counties, but in our main results, we drop these counties.

8 See https://www.tennessean.com/story/news/politics/2020/05/01/tennessee-testing-all-inmates-prison-staff-after-multiple-ou 3067388001/ and https://www.tennessean.com/story/news/local/2020/04/23/coronavirus-bledsoe-county-prison-inmates/3003595001/.

We use the following notation:
• $C_{ilt}$ - a binary variable for whether or not individual i in county l has had Covid-19 by time period t
• $R_{ilt}$ - a binary variable for whether or not individual i in county l has tested positive for Covid-19 by time period t
• $T_{ilt}$ - a binary variable for whether or not individual i in county l has taken a test for Covid-19 by time period t

Our first goal is descriptive: to see what fraction of the population has had Covid-19 by time period t in a particular county l (note that the same arguments would apply for another fixed location such as a state as well). That is, our interest centers on $P(C_{ilt} = 1)$. To be clear about the notation here, this is the fraction of the population in county l at time period t that has had Covid-19. That is, we are averaging over all individuals in a particular county l at time period t.
Identifying the fraction of individuals that have ever had Covid-19 is challenging because (i) not all individuals have been tested and (ii) for individuals that have been tested for Covid-19, testing has not been randomly assigned. The goal of this section is to develop non-trivial bounds on the fraction of total Covid-19 cases in a particular location at a particular time under plausible identifying assumptions. In particular, following Manski and Molinari (2020), notice that

$$P(C_{ilt} = 1) = P(C_{ilt} = 1 \mid T_{ilt} = 1)\, P(T_{ilt} = 1) + P(C_{ilt} = 1 \mid T_{ilt} = 0)\, P(T_{ilt} = 0) \tag{1}$$

which follows immediately by the law of total probability. Next, consider each of these terms individually:
• $P(C_{ilt} = 1 \mid T_{ilt} = 1)$ is the fraction of tested individuals in county l at time period t who have had Covid-19. We discuss this term in more detail below.
• $P(T_{ilt} = 1)$ is the (observed) fraction of the population in county l at time period t who have been tested for Covid-19.
• $P(T_{ilt} = 0)$ is the (observed) fraction of the population in county l at time period t who have not been tested for Covid-19.
• $P(C_{ilt} = 1 \mid T_{ilt} = 0)$ is the (unobserved) fraction of untested individuals in county l who have had Covid-19 by time period t. This term is the hardest to identify, and we discuss plausible assumptions that lead to bounds on this term below.

Next, consider $P(C_{ilt} = 1 \mid T_{ilt} = 1)$. It can be written as

$$\begin{aligned} P(C_{ilt} = 1 \mid T_{ilt} = 1) &= P(C_{ilt} = 1 \mid T_{ilt} = 1, R_{ilt} = 1)\, P(R_{ilt} = 1 \mid T_{ilt} = 1) + P(C_{ilt} = 1 \mid T_{ilt} = 1, R_{ilt} = 0)\, P(R_{ilt} = 0 \mid T_{ilt} = 1) \\ &= P(R_{ilt} = 1 \mid T_{ilt} = 1) + P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)\, P(C_{ilt} = 1 \mid T_{ilt} = 1) \end{aligned}$$

where the first equality holds by the law of total probability and the second equality holds because (i) $R_{ilt} = 1 \implies T_{ilt} = 1$ (i.e., in order to test positive, an individual has to be tested), (ii) we suppose that the false positive rate of the test is equal to 0, which implies that $P(C_{ilt} = 1 \mid R_{ilt} = 1) = 1$, 9 and (iii) repeated application of the definition of conditional probability for the second term. Then, rearranging implies that

$$P(C_{ilt} = 1 \mid T_{ilt} = 1) = \frac{P(R_{ilt} = 1 \mid T_{ilt} = 1)}{1 - P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)} \tag{2}$$

where
• $P(R_{ilt} = 1 \mid T_{ilt} = 1)$ is the (observed) fraction of tests that have come back positive in county l at time period t.
• $P(R_{ilt} = 0 \mid T_{ilt} = 1, C_{ilt} = 1)$ is the false negative rate of the test. This is a property of the test, and we set the false negative rate to be equal to 0.25. 10

Equation (2) says that the probability of having Covid-19 conditional on being tested is increasing in the fraction of positive tests and the false negative rate of the test. It also implies that every term in Equation (1) is identified except $P(C_{ilt} = 1 \mid T_{ilt} = 0)$. Without employing some additional assumption on this term, the bounds on the rate of total cases are given by

$$P(C_{ilt} = 1 \mid T_{ilt} = 1)\, P(T_{ilt} = 1) \;\le\; P(C_{ilt} = 1) \;\le\; P(C_{ilt} = 1 \mid T_{ilt} = 1)\, P(T_{ilt} = 1) + P(T_{ilt} = 0) \tag{3}$$

In our case, these sorts of bounds would be extremely wide. For example, for the whole state of Tennessee, $P(R_{ilt} = 1)$ is about 0.2% and $P(T_{ilt} = 0)$ is about 97% (i.e., about 3% of Tennessee's population has been tested and about 0.2% have had a positive test). If the only restriction on $P(C_{ilt} = 1 \mid T_{ilt} = 0)$ is that it is bounded between 0 and 1, then this will lead to extremely wide bounds on Covid-19 cases (essentially uninformative). Instead (and continuing to follow Manski and Molinari (2020)), we make the following assumption.

9 The false positive rate is given by $P(C_{ilt} = 0 \mid R_{ilt} = 1)$, and there is evidence that the false positive rates are extremely low; see https://www.usatoday.com/story/news/nation/2020/03/16/coronavirus-what-expect-when-you-get-tested-covid-19/5061120002/.

10 Manski and Molinari (2020) put bounds on a closely related term called the Negative Predictive Value of the test; we could similarly put bounds on the false negative rate of the test. We do not do this in the current paper in order to mainly focus on the bounds arising from non-random testing. In the results presented below, in general, the bounds are not very sensitive to different reasonable values of the false negative rate of the test. More specifically, a higher false negative rate increases both the lower bound and the upper bound, but it increases the upper bound relatively more under Assumption 1 (see Equation (4) below); thus bounds tend to be somewhat wider (although the actual difference is small) under larger values of the false negative rate of the test.

Assumption 1 (Covid-19 Bound for Untested Individuals).

$$P(C_{ilt} = 1 \mid T_{ilt} = 0) \le P(C_{ilt} = 1 \mid T_{ilt} = 1)$$

Assumption 1 says that the fraction of individuals who have had Covid-19 (in a particular county) is lower among the group of individuals who have not been tested than among those who have been tested. This is a fairly weak assumption, and it is likely to hold for two reasons. First, tests have been predominantly given to individuals expressing Covid-19 symptoms. Second, even in Tennessee, where testing has been available to anyone who wants to take a test, (i) individuals expressing symptoms are still among those most likely to take the test and (ii) it seems likely that there is some self-selection into taking the test among individuals who think they may have Covid-19 even if they do not have the right combination of symptoms to otherwise warrant a test.

It is also helpful to think about the limiting cases of the assumption. $P(C_{ilt} = 1 \mid T_{ilt} = 0) = 0$ in the case when no untested individuals have had Covid-19. $P(C_{ilt} = 1 \mid T_{ilt} = 0) = P(C_{ilt} = 1 \mid T_{ilt} = 1)$ if the probability of having had Covid-19 is the same for individuals who have not been tested as for individuals who have been tested. This condition would hold if testing were randomly assigned. In practice, neither of these limiting conditions seems likely to hold, but it does seem like a weak condition that the probability of having had Covid-19 for the group of individuals who have not been tested falls in between these two limiting cases. Assumption 1 does not affect the lower bound on the total number of cases, but it is potentially very useful in lowering the upper bound on the number of Covid-19 cases in a particular location.
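As a rough numerical illustration of how much Assumption 1 can tighten the upper bound, the following minimal Python sketch computes the bounds on the fraction of total cases for one location. The function name `covid_bounds` and its argument names are hypothetical. The closed form used for P(C = 1 | T = 1), namely the positive-test rate divided by one minus the false negative rate, is a reconstruction from the derivation steps described in the text (zero false positives plus the law of total probability), not a formula quoted from the paper. The inputs are the rounded statewide Tennessee figures quoted above (about 0.2% positive, about 3% tested), so the output is illustrative and need not match the paper's reported numbers exactly.

```python
def covid_bounds(frac_positive, frac_tested, false_negative_rate=0.25):
    """Bounds on P(C = 1), the fraction of a location's population that has had Covid-19.

    frac_positive: P(R = 1), share of the population with a positive test
    frac_tested:   P(T = 1), share of the population that has been tested
    false_negative_rate: P(R = 0 | T = 1, C = 1); 0.25 is the value used in the text
    """
    if not 0 < frac_tested <= 1:
        raise ValueError("frac_tested must be in (0, 1]")
    if not 0 <= frac_positive <= frac_tested:
        raise ValueError("frac_positive must be in [0, frac_tested]")
    if not 0 <= false_negative_rate < 1:
        raise ValueError("false_negative_rate must be in [0, 1)")

    # Reconstructed closed form: P(C=1 | T=1) = P(R=1 | T=1) / (1 - false negative rate).
    # The clamp at 1 is an added safeguard, not something stated in the text.
    positive_rate = frac_positive / frac_tested
    p_c_given_tested = min(1.0, positive_rate / (1.0 - false_negative_rate))

    # Lower bound: assume no untested individual has had Covid-19.
    lower = p_c_given_tested * frac_tested
    # Upper bound with no restriction: assume every untested individual has had it.
    upper_unrestricted = lower + (1.0 - frac_tested)
    # Upper bound under Assumption 1: untested prevalence is at most tested prevalence.
    upper_assumption1 = lower + p_c_given_tested * (1.0 - frac_tested)

    return {
        "p_c_given_tested": p_c_given_tested,
        "lower": lower,
        "upper_unrestricted": upper_unrestricted,
        "upper_assumption1": upper_assumption1,
    }


# Rounded statewide Tennessee figures from the text: 0.2% positive, 3% tested.
tennessee = covid_bounds(frac_positive=0.002, frac_tested=0.03)
for name, value in tennessee.items():
    print(f"{name}: {value:.4f}")
```

With these rounded inputs the unrestricted upper bound is close to one, while the upper bound under Assumption 1 collapses to the prevalence among tested individuals, which is the sense in which the assumption does almost all of the work.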
Under Assumption 1, the upper bound tightens, and the bounds on the fraction of total cases become

$$P(C_{ilt} = 1 \mid T_{ilt} = 1)\, P(T_{ilt} = 1) \;\le\; P(C_{ilt} = 1) \;\le\; P(C_{ilt} = 1 \mid T_{ilt} = 1) \tag{4}$$

This can lead to a much tighter bound, especially when $P(C_{ilt} = 1 \mid T_{ilt} = 1)$ is substantially less than one. For example, for the whole state of Tennessee, $P(C_{ilt} = 1 \mid T_{ilt} = 1)$ is roughly equal to 8%. This immediately leads to a much tighter bound on the total number of cases relative to not putting any restrictions on $P(C_{ilt} = 1 \mid T_{ilt} = 0)$.

The previous section discussed how to bound the total number of Covid-19 cases in a particular location. The second goal of the paper is to go beyond these descriptive bounds and evaluate how Tennessee's open-testing policy has affected the (unobserved) total number of Covid-19 cases as well as other outcomes such as confirmed cases and trips to work.

The notation for this section is somewhat different from the previous section. In particular, let $C_{lt} = P(C_{ilt} = 1)$, $R_{lt} = P(R_{ilt} = 1)$, and $T_{lt} = P(T_{ilt} = 1)$. These are defined at the county level and correspond to the fraction of the population in the county that has had Covid-19, that has tested positive for Covid-19, and that has been tested for Covid-19, respectively. 11 We also suppose that we have access to county-level covariates $X_l$ that do not vary over time; the main covariate that we use is county population. Some of the results below consider policy effects on other outcomes; in that case we denote the county-level outcome in time period t by $Y_{lt}$.

To think about the effect of Tennessee's open-testing policy, we define potential outcomes for county l in time period t. In particular, let C lt (1), T lt (1) (0), 1}. This collects the covariates and all potential outcomes except for $C_{lt}(d)$. Also, (d)) . Next, let $D_l$ be a binary variable indicating treatment participation. For counties that participate in the open-testing policy (i.e., counties in Tennessee), $D_l = 1$; otherwise, $D_l = 0$. Also suppose that there are two time periods, $t^*$ and $t^* - 1$, 12 and that the policy is implemented between time periods $t^* - 1$ and $t^*$.
In this setup, we observe $Y_{lt^*} = D_l Y_{lt^*}(1) + (1 - D_l) Y_{lt^*}(0)$ and $Y_{lt^*-1} = Y_{lt^*-1}(0)$. In other words, in post-treatment time periods we observe "treated" potential outcomes for counties that participate in the treatment (i.e., counties in Tennessee) and observe "untreated" potential outcomes for counties that do not participate in the treatment (i.e., counties in Alabama). In pre-treatment time periods, we observe untreated potential outcomes for all counties; these are the outcomes under the baseline policy of restricting testing to individuals meeting the symptom requirements.

To start with, consider identifying the effect of Tennessee's open-testing policy on some observed outcome (e.g., the number of confirmed Covid-19 cases or the number of trips to work) in county l at time period $t^*$. 13 We start with this case because it is simpler: $Y_{lt}$, the outcome, is fully observed while $C_{lt}$, the per capita number of total cases in county l, is not. $ATT_Y(Z_{lt^*-1})$ is the average effect of the open-testing policy on the outcome for counties in Tennessee with pre-treatment characteristics $Z_{lt^*-1}$. $ATT_Y$ is the overall average effect of the open-testing policy on the outcome across all counties in Tennessee.

11 Also, notice that we do not need to estimate these quantities; rather each of them is exactly observed.

12 Our results extend immediately to the case where there are more available time periods.

13 The arguments here apply to other outcomes as well.

We make the following assumption.

Assumption 2 is a standard and widely used assumption to identify the effect of some economic policy (see, for example, Imbens and Wooldridge (2009)). It says that, if the policy had not been enacted, on average outcomes in counties in Tennessee would have been the same as outcomes in counties in Alabama that had the same pre-treatment characteristics; i.e., the same outcomes in the previous period, the same per capita number of confirmed cases, the same number of per capita tests, the same population, as well as the same per capita number of total cases.
One cannot immediately use Assumption 2 because Z lt * −1 (0) includes C lt * −1 (0) -the per capita number of total Covid-19 cases in a particular county -which is unobserved. But, in practice, most outcomes in period t * are likely to depend heavily on how widespread Covid-19 has been -even if it has gone largely undetected. Therefore, it seems quite important to control for the (unobserved) number of cases. To address this issue, we make the following assumption Next, we consider trying to identify the effect of Tennessee's open-testing policy on the per capita number of total Covid-19 cases. This is distinctly more challenging than the previous case because the total number of cases is not observed. To start with, we continue to make Assumption 3, and we modify Assumption 2 to hold for total Covid-19 cases: Assumption 4 (Covid Unconfoundedness). Assumption 4 is analogous to Assumption 2 but for total Covid-19 cases. It says that, in the absence of the policy intervention, on average, the per capita number of total Covid-19 cases is the same for counties in Tennessee and Alabama that had the same pre-treatment characteristics (including the same number of unobserved total Covid-19 cases). Similarly to the previous section, we focus on identifying AT T C (Z lt * −1 ) is the average effect of the policy on the per capita number of total Covid-19 cases for counties in Tennessee with pre-treatment characteristics Z lt * −1 . AT T C is the overall average effect of the policy on the per capita number of total Covid-19 cases in Tennessee. In addition, all of the results in the previous section go through suggesting that and The problem here is that C lt * is not observed. Instead, we only have the bounds given in Equation (4). The next proposition provides bounds on the policy effects on the total number of Covid-19 cases under the assumptions that we have made so far. 
It essentially holds by using the same bounds as in the previous section, invoking Assumptions 3 and 4, and then taking differences across counties in Tennessee and Alabama that have the same characteristics. Before stating this result, we define two more terms to conserve on notation below. First, for d ∈ {0, 1}, This corresponds to the first term in Equation (1) (now conditional on Z lt * −1 and D l = d) and is point identified. Second, for d ∈ {0, 1}, define which is also an observed quantity.

Proposition 2. Under Assumptions 1, 3 and 4,

The proof of Proposition 2 is provided in Appendix A. These sorts of bounds arise under the combination of (i) standard identifying assumptions for policy effects and (ii) Assumption 1, that the probability of having had Covid-19 is lower among untested individuals than among tested individuals. The term in common for each of the bounds, γ 1 (Z lt * −1 ) − γ 0 (Z lt * −1 ), comes from (i) differences in the number of confirmed cases per test between counties in Tennessee and Alabama with similar characteristics and (ii) differences in the testing rate between counties in to be as large as possible (under Assumption 1) for counties in Tennessee. 14 The weights on these terms (the terms involving τ 1 and τ 0 ) also tend to be very large because the fraction of untested individuals is much larger than the fraction of tested individuals. This implies that The drawback of these bounds is that they are unlikely to be informative about the sign of the policy effect. To see this, notice that the terms involving γ d (Z lt * −1 ) are often quite small. On the other hand, the extra terms can be orders of magnitude larger. In our case, the bounds cover 0 for all counties and are not very informative. In order to deliver tighter bounds, we make some additional assumptions.

Assumption 5 (Joint Unconfoundedness).

Assumption 6 (Bound on Total Cases and Untested Individuals).
Assumption 5 says that, in the absence of the policy, the joint probability that a person has Covid-19 and that they were not tested for Covid-19 is the same for individuals located in counties with similar pre-policy characteristics regardless of whether or not the county experiences the policy. This is essentially an unconfoundedness assumption (similar to, e.g., Assumption 4), but it strengthens that assumption to hold jointly for both the fraction of the population that have had Covid-19 and the fraction that have not been tested, both in the absence of the policy. Assumption 6 says that, for individuals in counties that experience the policy, the joint probability of having Covid-19 and not being tested under the policy is lower than the joint probability of having Covid-19 and not being tested in the absence of the policy. We provide a set of more primitive conditions and more detailed discussion of this assumption in Appendix B. Proposition 3. Under Assumptions 1 and 3 to 6, The proof of Proposition 3 is provided in Appendix A. Notice that the lower bound is the same as it was in the previous case, but that the upper bound can be substantially tighter. In particular, the upper bound does not contain the same extra term as in Proposition 2; as discussed earlier, this term is the "dominant" term in the upper bound, and it is removed under the additional conditions in Assumptions 5 and 6. Moreover, recall that γ 1 (Z lt * −1 ) − γ 0 (Z lt * −1 ) comes from the difference between confirmed cases in counties in Tennessee relative to counties in Alabama with similar characteristics. 15 The number of confirmed cases depends on two things. First, it depends positively on differences in the fraction of positive cases per test among counties in Tennessee relative to counties in Alabama with similar characteristics; this term will tend to be negative if the policy is expanding testing availability to individuals who are less likely to have Covid-19. 
Second, it depends positively on differences in the fraction of individuals who take the test among counties in Tennessee relative to counties in Alabama with similar characteristics. This difference will tend to be positive if the policy is increasing the number of tests. Although it is hard to reason whether the number of confirmed cases will go up in response to the policy, our estimates suggest that this term is somewhat negative; i.e., we find that the number of confirmed cases appears to decrease in response to the policy. Before concluding this section, it is worth mentioning that the upper bound in Proposition 3 is still likely to be quite conservative and discussing why this is the case. Omitting covariates below in order to simplify the expressions, notice that which holds immediately by rewriting the marginal probabilities in terms of joint probabilities and rearranging. Term (A) is equal to γ 1 −γ 0 (the upper bound in Proposition 3) and is what we just discussed. Next, consider Term (B). This term is, in general, not point identified because it depends on cases among untested individuals. Assumption 6 says that this term is non-positive, and the upper bound comes from setting this term equal to 0. In practice, though, it seems quite likely that this term would be negative (implying that our upper bound is conservative). To see this, notice that it can be written as The identification results above are constructive and suggest plug-in estimators of each parameter of interest. There are a number of possibilities here (e.g., regressions or weighting estimators), but we found it natural to use a matching estimator where, for each county in Tennessee, we found a "match" in Alabama based on pre-policy county characteristics, Z lt * −1 . Overall, average effects can be calculated by taking the average outcome experienced by counties in Tennessee and subtracting the overall average outcome experienced by "matched" counties in Alabama. 
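The matching estimator described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes nearest-neighbor matching with replacement, a squared Euclidean distance on covariates standardized by their pooled population standard deviation, and entirely hypothetical toy county records (the names `TN-A`, `AL-A`, the field names, and all numbers are made up). The last helper reports the share of treated counties whose outcome exceeds that of their match, a simple sign-count summary.

```python
from statistics import mean, pstdev


def match_counties(treated, controls, keys):
    """Map each treated county name to its nearest control county.

    Distance is squared Euclidean on covariates in `keys`, each divided by its
    pooled standard deviation (an assumed form of standardization).
    Matching is with replacement: a control may be matched more than once.
    """
    pooled = treated + controls
    # Fall back to 1.0 if a covariate has no variation, to avoid dividing by zero.
    scale = {k: pstdev([c[k] for c in pooled]) or 1.0 for k in keys}

    def distance(a, b):
        return sum(((a[k] - b[k]) / scale[k]) ** 2 for k in keys)

    return {t["name"]: min(controls, key=lambda c: distance(t, c)) for t in treated}


def average_effect(treated, matches, outcome):
    """Mean treated outcome minus mean outcome of the matched controls."""
    treated_mean = mean(t[outcome] for t in treated)
    matched_mean = mean(matches[t["name"]][outcome] for t in treated)
    return treated_mean - matched_mean


def share_above_match(treated, matches, outcome):
    """Fraction of treated counties whose outcome exceeds their match's outcome."""
    return mean(1.0 if t[outcome] > matches[t["name"]][outcome] else 0.0 for t in treated)


# Hypothetical toy data: pre-policy covariates and a post-policy outcome "y".
tennessee = [
    {"name": "TN-A", "tests_pc": 0.010, "cases_pc": 0.0010, "pop": 50000, "y": 0.0012},
    {"name": "TN-B", "tests_pc": 0.020, "cases_pc": 0.0020, "pop": 200000, "y": 0.0030},
]
alabama = [
    {"name": "AL-A", "tests_pc": 0.011, "cases_pc": 0.0011, "pop": 52000, "y": 0.0020},
    {"name": "AL-B", "tests_pc": 0.019, "cases_pc": 0.0021, "pop": 190000, "y": 0.0025},
    {"name": "AL-C", "tests_pc": 0.015, "cases_pc": 0.0015, "pop": 120000, "y": 0.0022},
]

covariates = ["tests_pc", "cases_pc", "pop"]
matches = match_counties(tennessee, alabama, covariates)
effect = average_effect(tennessee, matches, "y")
share = share_above_match(tennessee, matches, "y")
print({name: m["name"] for name, m in matches.items()}, effect, share)
```

Matching with replacement keeps every treated county in the sample even when good controls are scarce, at the cost of reusing some control counties; other choices (matching without replacement, regression adjustment, weighting) would fit the same identification argument.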
In practice, to construct the matched dataset, we match on per capita tests and per capita confirmed cases on April 17 (the pre-treatment date when we observe tests by county in Alabama). We also match on county population. And, finally, we match on pre-treatment declines in county-level trips to work from the pre-Covid baseline. To actually construct the matches, for county l in Tennessee, we construct its match by choosing the county in Alabama that minimizes the squared standardized Euclidean distance in these matching variables.

Another issue is that, in practice, it is not clear how to conduct inference in our case. We essentially have no estimation uncertainty because we observe the exact numbers of confirmed cases and tests in each county. In light of this, to conduct inference, we construct a p-value by counting the fraction of counties whose outcome, for example, increases relative to their matched county in Alabama. This is similar to the sorts of randomization inference procedures used in the synthetic control and matching literatures (e.g., Abadie et al. (2010); Ferman (2019)).

Before presenting our main results, we briefly discuss the timing of other policy decisions made by Tennessee and Alabama. We list the timing of implementing major policies by Tennessee and Alabama in Table 1.

16 An alternative way to construct a dataset here would be to compare bordering counties in Tennessee and Alabama. This would result in a substantially smaller matched dataset, though the matches in this case could be better than the ones we use.

Next, we compute bounds on the per capita number of total Covid-19 cases across counties and separately for Tennessee and Alabama. These results are available in Figure 4. It is immediately clear that the bounds on the per capita number of total cases tend to be noticeably tighter in Tennessee counties than in Alabama counties. The mean width of the bounds is 0.06 in Tennessee and 0.13 in Alabama. It is also worth discussing the bounds in some particular cases.
The lower bound on the rate of total Covid-19 cases is uniformly quite small as it is primarily driven by the number of confirmed cases. To give an example, Davidson County has the 5th highest lower bound on total Covid-19 rates, and it is only 0.6%.

Main Results: Policy Effects of Tennessee's Open-Testing Policy

is not statistically significant at conventional significance levels, but it is at least suggestive that the policy is actually increasing the number of tests.

Next, we consider a similar exercise in terms of the number of confirmed Covid-19 cases. These results are available in Panel (b) of Figure 5. Here, we estimate that, on average, the open-testing policy decreased the number of confirmed cases by 33% relative to what they would have been in the absence of the policy. Using the same randomization inference procedure as above, we get a p-value of 0.18. Again, this is suggestive evidence that the increased testing is decreasing the number of confirmed cases.

Next, we consider how the policy is affecting the number of trips to work. It is important to be careful here. Our strategy is to see how the number of trips to work changes as a function of the number of observed cases. In particular, we separate counties in Tennessee into two groups: a group of counties that experienced large increases in the number of confirmed cases over time, and a group of counties that did not. 18 For the group of counties with large increases in confirmed cases,

17 This is computed by taking the average number of tests per capita across counties in Tennessee, subtracting the average number of tests per capita in matched counties, and dividing by the average number of tests per capita in matched counties.
18 Defining what is a large increase in confirmed cases is quite ad hoc; here, we set the group of counties that had a large increase in the number of confirmed cases to be the 15 counties that had the largest increases in per capita confirmed cases over time. This corresponds to a noticeable discrete jump in the change in confirmed cases over time between Humphreys County and Meigs County; see Panel (c) of Figure 5.

The average of the upper bound is a reduction of total cases by 0.45 per 1000 individuals. This seems quite small, but it is also useful to compare this number to the number of confirmed cases

19 Proposition 3, in combination with the expression in Equation (2), implies that the upper bound is a scaled version of the difference in confirmed cases across counties in Tennessee and counties in Alabama with similar characteristics. Therefore, Panel (b) of Figure 6 is very similar to Panel (b) of Figure 5.

Proof of Proposition 1. The result follows because which is the result. The first equality is the definition of $ATT_Y(Z_{lt^*-1})$; the second equality holds by the law of iterated expectations (the outer expectation averages over the distribution of $C_{lt^*-1}(0)$ conditional on $Z_{lt^*-1}$ and $D_l = 1$); the third equality holds by Assumption 2; the fourth holds by Assumption 3; the fifth equality holds by the law of iterated expectations; and the sixth equality holds because $Y_{lt^*}(1)$ is the observed outcome when $D_l = 1$ and $Y_{lt^*}(0)$ is the observed outcome when $D_l = 0$.

Proof of Proposition 2. First, recall that where the second equality holds using the same arguments as in the proof of Proposition 1 and the third equality holds by the definition of $C_{lt^*}$. Omitting the dependence on $Z_{lt^*-1}$ for notational simplicity, and then plugging in Equation (1) and the definition of $\gamma_d(Z_{lt^*-1})$ further implies where the two underlined terms are not identified because the number of Covid-19 cases is not observed for individuals that have not been tested.
But bounds on the effect of Tennessee's open-testing policy on total Covid-19 cases arise from restrictions on these terms. In particular, Assumption 1 says that, for $d \in \{0, 1\}$,

$$
P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = d) \le P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = d).
$$

$C^{B,U}_{lt^*}(Z_{lt^*-1})$, the upper bound in the proposition, comes from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1) = P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = 1)$ (its maximum value under Assumption 1) and from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0) = 0$. $C^{B,L}_{lt^*}(Z_{lt^*-1})$, the lower bound in the proposition, comes from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1) = 0$ and from setting $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0) = P(C_{ilt^*} = 1 \mid T_{ilt^*} = 1, D_l = 0)$ (its maximum value under Assumption 1). The bounds on $ATT_C$ arise from averaging over the bounds for $ATT_C(Z_{lt^*-1})$ as discussed in the text.

Next, we provide an auxiliary result that is useful for proving Proposition 3.

Lemma 1. Under Assumptions 1 and 3 to 6,

$$
P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid Z_{lt^*-1}, D_l = 1) \le P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid Z_{lt^*-1}, D_l = 0).
$$

Proof. To show the result (and omitting conditioning on $Z_{lt^*-1}$), notice that

$$
\begin{aligned}
P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid D_l = 1) &= P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid D_l = 1) \\
&\le P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid D_l = 1) \\
&= P(C_{ilt^*} = 1, T_{ilt^*} = 0 \mid D_l = 0)
\end{aligned}
$$

where the first line holds because treated potential outcomes are observed outcomes when $D_l = 1$, the second line holds by Assumption 6, and the third line holds by Assumptions 3 and 5.

Proof of Proposition 3. Following the same logic as in the proof of Proposition 2 (see Equation (8) in particular) and continuing to omit conditioning on covariates to simplify the notation, the lower bound arises by making $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 1)$ as small as possible while making $P(C_{ilt^*} = 1 \mid T_{ilt^*} = 0, D_l = 0)$ as large as possible. Neither Assumption 5 nor Assumption 6 places any additional restriction on these terms, so the lower bound remains unchanged.
For the upper bound, plugging the result of Lemma 1 into Equation (8) implies that

$$
P(C_{ilt^*}(1) = 1 \mid D_l = 1) - P(C_{ilt^*}(0) = 1 \mid D_l = 1) \le \gamma_1 - \gamma_0
$$

which implies the result for the upper bound of $ATT_C(Z_{lt^*-1})$. The result for $ATT_C$ holds by averaging over the $Z_{lt^*-1}$ in $ATT_C(Z_{lt^*-1})$.

This section gives some more primitive conditions for Assumption 6 to hold. We consider the following conditions:

Extra Conditions:

(i) $P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) \le P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1)$

(ii) $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 1, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1)$

(iii) $T_{ilt^*}(1) = 0 \implies T_{ilt^*}(0) = 0$

Extra Condition (i) says that the probability of untested individuals having Covid-19 does not increase under the policy relative to the absence of the policy, holding the group of untested individuals fixed (here, it is the group that would not be tested under the policy). Extra Condition (ii) says that the probability of having Covid-19 is at least as large for the group of individuals that would be tested if the policy is implemented but not tested if the policy is not implemented as for the group of individuals that would not be tested under either policy. 20 Extra Condition (iii) says that all individuals who would have been tested in the absence of the policy (i.e., individuals meeting the symptoms requirement and who had sought a test) would continue to be tested under the open-testing policy.

Next, notice that Assumption 6 holds if the following difference is less than or equal to 0.
$$
\begin{aligned}
& P(C_{ilt^*}(1) = 1, T_{ilt^*}(1) = 0 \mid D_l = 1) - P(C_{ilt^*}(0) = 1, T_{ilt^*}(0) = 0 \mid D_l = 1) \\
&= \underbrace{\big[ P(C_{ilt^*}(1) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) - P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) \big] \, P(T_{ilt^*}(1) = 0 \mid D_l = 1)}_{\text{Term (A)}} \\
&\quad + \underbrace{\big[ P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) - P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \big] \, P(T_{ilt^*}(1) = 0 \mid D_l = 1)}_{\text{Term (B)}} \\
&\quad + \underbrace{P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \, \big[ P(T_{ilt^*}(1) = 0 \mid D_l = 1) - P(T_{ilt^*}(0) = 0 \mid D_l = 1) \big]}_{\text{Term (C)}}
\end{aligned}
$$

where the equality holds by adding and subtracting $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1) \, P(T_{ilt^*}(1) = 0 \mid D_l = 1)$ and $P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \, P(T_{ilt^*}(1) = 0 \mid D_l = 1)$.

Term (A) $\le 0$ holds immediately by Extra Condition (i). For Term (B), notice that

$$
\begin{aligned}
P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) &= P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1) \, P(T_{ilt^*}(1) = 0 \mid T_{ilt^*}(0) = 0, D_l = 1) \\
&\quad + P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 1, D_l = 1) \, P(T_{ilt^*}(1) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1)
\end{aligned}
$$

which holds by the law of total probability. Then, applying Extra Condition (ii) implies that

$$
P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, T_{ilt^*}(1) = 0, D_l = 1)
$$

and Extra Condition (iii) additionally implies that

$$
P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(0) = 0, D_l = 1) \ge P(C_{ilt^*}(0) = 1 \mid T_{ilt^*}(1) = 0, D_l = 1)
$$

which implies that Term (B) $\le 0$. That Term (C) $\le 0$ holds immediately by Extra Condition (iii).

The extra conditions outlined above are stronger than are needed for Assumption 6 to hold, but they provide one set of plausible, low-level conditions under which Assumption 6 would hold.

20 Another way to explain this condition is that there is positive self-selection into taking the test among individuals that become tested under the open-testing policy but would not have been tested without the open-testing policy.
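As a sanity check on the algebra, the decomposition into Terms (A), (B), and (C) can be verified numerically. The sketch below is an illustration only: it draws an arbitrary joint distribution over the four binary potential outcomes $(C(0), C(1), T(0), T(1))$ among treated counties, and the function names and the way the distribution is generated are assumptions of the sketch. It checks that the three terms sum to the difference of interest and that Term (C) is non-positive once Extra Condition (iii) is imposed; it does not impose Extra Conditions (i) or (ii).

```python
import itertools
import random


def random_pmf(rng, impose_condition_iii=False):
    """Random joint pmf over cells (c0, c1, t0, t1) of binary potential outcomes."""
    cells = list(itertools.product([0, 1], repeat=4))
    weights = {cell: rng.random() + 0.01 for cell in cells}
    if impose_condition_iii:
        # Extra Condition (iii): T(1) = 0 implies T(0) = 0,
        # so no probability mass on cells with t1 == 0 and t0 == 1.
        for c0, c1, t0, t1 in cells:
            if t1 == 0 and t0 == 1:
                weights[(c0, c1, t0, t1)] = 0.0
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def decomposition_terms(pmf):
    """Return (Term A, Term B, Term C, target difference) for a given pmf."""
    def prob(event):
        return sum(p for cell, p in pmf.items() if event(*cell))

    p_t1_0 = prob(lambda c0, c1, t0, t1: t1 == 0)   # P(T(1) = 0)
    p_t0_0 = prob(lambda c0, c1, t0, t1: t0 == 0)   # P(T(0) = 0)
    joint_c1_t1 = prob(lambda c0, c1, t0, t1: c1 == 1 and t1 == 0)
    joint_c0_t1 = prob(lambda c0, c1, t0, t1: c0 == 1 and t1 == 0)
    joint_c0_t0 = prob(lambda c0, c1, t0, t1: c0 == 1 and t0 == 0)

    c1_given_t1 = joint_c1_t1 / p_t1_0   # P(C(1) = 1 | T(1) = 0)
    c0_given_t1 = joint_c0_t1 / p_t1_0   # P(C(0) = 1 | T(1) = 0)
    c0_given_t0 = joint_c0_t0 / p_t0_0   # P(C(0) = 1 | T(0) = 0)

    term_a = (c1_given_t1 - c0_given_t1) * p_t1_0
    term_b = (c0_given_t1 - c0_given_t0) * p_t1_0
    term_c = c0_given_t0 * (p_t1_0 - p_t0_0)
    target = joint_c1_t1 - joint_c0_t0
    return term_a, term_b, term_c, target
```

Calling `decomposition_terms(random_pmf(random.Random(0)))` returns the three terms and the target difference; the terms sum to the target up to floating-point error for any seed, which confirms that the decomposition is an identity rather than a consequence of the extra conditions.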
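Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program.

Ferman, B. (2019). Matching estimators with few treated and many control observations.

Imbens, G. W. and Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation.

Manski, C. F. and Molinari, F. (2020). Estimating the Covid-19 infection rate: Anatomy of an inference problem.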