key: cord-0854068-f9dnwhob
authors: Chen, X.; Huang, H.; Ju, J.; Sun, R.; Zhang, J.
title: Impact of vaccination on the COVID-19 pandemic: Evidence from U.S. states
date: 2021-05-12
DOI: 10.1101/2021.05.08.21256892
sha: 90db32248877494bf6bbd8b4b84f08c2500301aa
doc_id: 854068
cord_uid: f9dnwhob

Governments worldwide are implementing mass vaccination programs in an effort to end the novel coronavirus (COVID-19) pandemic. Although the approved vaccines exhibited high efficacies in randomized controlled trials, their population effectiveness in the real world remains less clear, thus casting uncertainty over the prospects for herd immunity. In this study, we evaluated the effectiveness of the COVID-19 vaccination program and predicted the path to herd immunity in the U.S. Using data from 12 October 2020 to 7 March 2021, we estimated that vaccination reduced the total number of new cases by 4.4 million (from 33.0 to 28.6 million), prevented approximately 0.12 million hospitalizations (from 0.89 to 0.78 million), and decreased the population infection rate by 1.34 percentage points (from 10.10% to 8.76%). We then built a Susceptible-Infected-Recovered (SIR) model with vaccination to predict herd immunity. Our model predicts that if the average vaccination pace between January and early March 2021 (2.08 doses per 100 people per week) is maintained, the U.S. can achieve herd immunity by the last week of July 2021, with a cumulative vaccination coverage of 60.2%. Herd immunity could be achieved earlier with a faster vaccination pace, lower vaccine hesitancy, or higher vaccine effectiveness. These findings improve our understanding of the impact of COVID-19 vaccines and can inform future public health policies regarding vaccination, especially in countries with ongoing vaccination programs.

and the total number of second doses administered per 100 people. Without any control variables, Fig.
2 shows the negative correlation between the vaccination rate and the growth rates of total cases and hospitalizations.

To make the individual states as comparable as possible, we first accounted for observable factors associated with the COVID-19 pandemic based on previous studies (see Extended Data Table 1). These time-varying control variables included non-pharmaceutical interventions [5-7], election rallies [19,20] and anti-racism protests [21] that involved mass gatherings, and climate measures of snow depth and temperature [22]. To address the concern that changes in the number of total cases reflect the testing capacity of each state [23], we also controlled for each state's testing capacity. As the proportion of susceptible individuals declines, the infection rate may slow; therefore, we included the share of susceptible individuals in the regressions. We related the growth of COVID-19 cases and hospitalizations to the vaccination rate of the previous week; this one-week lag accounts for the latency period of infection. Finally, we added state fixed effects and time fixed effects to absorb time-invariant state characteristics and shocks common to all states, which alleviates omitted-variable bias.

Our data show that the national average weekly growth rate of total cases was 7% (s.e.m. = 0.05) between 12 October 2020 and 7 March 2021. At the individual state level, the average growth rate was highest in Vermont (11%) and lowest in Hawaii (4%). The average growth rate of total hospitalizations across the 35 states that reported hospitalization data was 5% (s.e.m. = 0.04%); the highest growth rate was seen in Montana (8%) and the lowest in New Hampshire (2%).

Vaccination has significantly slowed the growth of total COVID-19 cases and hospitalizations in the U.S.

Our baseline results (Fig. 3a and Extended Data Table 2)

If systematic correlations existed between the pre-vaccination growth rates of infection and hospitalization and the rate of vaccination, our results would have been subject to selection bias.
However, this was not the case. We demonstrated that the number of vaccines allocated to each state was proportional to its population size (Extended Data Fig. 1a). More importantly, we found that the pre-vaccination average growth rates of total cases and hospitalizations were not correlated with the average vaccination rate (Extended Data Fig. 2).

Our baseline results focus on the average treatment effect of vaccination. This effect may be heterogeneous across states that have different characteristics. For example, some evidence shows that the prevalence of COVID-19 differs across age groups, with older adults bearing the highest risk [24,25]. Because older adults were given priority during the rollout of vaccination, it is natural to ask whether this strategy made a difference. We separated the states into two groups according to their proportion of older adults (at least 65 years of age). Despite the slightly larger point estimate for the states with a share of older adults above the national median, the results do not differ significantly from those for the states below the median (Extended Data Fig. 3c). In addition to age, we conducted heterogeneity tests on political affiliation, nonpharmaceutical interventions, race, income, and vaccine brand. We found no significant heterogeneity in the effect of vaccination across any of these characteristics (Extended Data Fig. 3), suggesting that COVID-19 vaccines have similar effectiveness across them.

We conducted a range of sensitivity tests. First, instead of using weekly data, we ran regressions with daily data and obtained results of similar magnitudes (Extended Data Table 3). Second, we used alternative measures to capture the development of the pandemic, including the logarithms of new cases and hospitalizations and the changes in logarithms of total cases and hospitalizations. Again, using these measures, we found that vaccination has significantly slowed the pandemic (Extended Data Fig.
4 and Extended Data Table 4). Third, although the vaccination rollout began on 14 December 2020, our vaccination data begin on 11 January 2021; we thus used linear extrapolation to impute the missing data. Our results with the inclusion of imputed data are very similar to the baseline results (Extended Data Fig. 5). Finally, we selected approximately the same number of weeks for the pre-treatment and post-treatment periods to balance the sample in our baseline results. To check the sensitivity of our results to the sample period, we ran our regressions with varying time windows, and our results remain remarkably stable. We obtained approximately the same coefficients for sample periods from 18 to 45 weeks (Extended Data Fig. 6).

All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

To predict how the pandemic will develop with vaccines, and especially when herd immunity might be achieved, we built a Susceptible-Infected-Recovered (SIR) model with vaccination and calibrated it to our data. Our model predictions of the infection rate during the study period had a correlation of 99.69% with the empirical data at the national level (Extended Data Fig. 7). Herd immunity is achieved in the model when the real-time reproduction number falls below one (Supplementary Methods). According to our model predictions, at the national average vaccination pace of 2.08 doses per 100 people per week between January and early March of 2021, the U.S. will achieve herd immunity around the last week of July 2021, with a cumulative vaccination coverage rate of 60.2% and a cumulative infection rate of 13.3%.

To understand how the speed of vaccination rollout would affect the time needed to reach herd immunity, we simulated herd immunity dates by varying vaccination pace (Fig. 4).
We observed a general trend that a faster vaccination pace would allow the U.S. to achieve herd immunity sooner, but with a greater number of total vaccine doses administered and a lower cumulative infection rate. This is because, at a faster vaccination pace, more individuals gain immunity from vaccines rather than from infection. If the vaccination pace increases to 4 doses per 100 people per week, herd immunity could be reached in early May 2021, but if it decreases to 1 dose, herd immunity would not be achieved until mid-October 2021.

Our predictions of herd immunity assume a continuation of vaccine uptake. In reality, however, a few potential factors could affect uptake. A certain proportion of the population might not receive

To examine how vaccine hesitancy and changes in vaccine effectiveness could affect our predictions for herd immunity, we incorporated in our model a range of potential vaccine hesitancy and vaccine effectiveness estimates. We assumed that if x% of the population is hesitant, then cumulative vaccination coverage in each state will stop increasing once (1 − x%) of the population is vaccinated. Table 1 shows that a higher percentage of vaccine-hesitant individuals will lead to lower vaccination coverage, with more individuals infected with COVID-19 at herd immunity. In particular, if vaccine hesitancy reaches 50%, herd immunity will be delayed until the end of August 2021, with a cumulative infection rate of 14.9%; that is, 14.9% of the total population will have been infected with COVID-19 by then. This level of vaccine hesitancy is plausible, given that approximately 40% of U.S. Marines have declined vaccination [29].

In our baseline model, the effectiveness of the first dose of the vaccine was approximately 73% (Supplementary Methods).
If the vaccine effectiveness increases, fewer individuals will require vaccination to reach herd immunity, resulting in fewer cumulative cases. However, if the vaccine effectiveness decreases to 60%, herd immunity will not be reached until the week of August 16, 2021, with a cumulative vaccine coverage of 67.6% and a cumulative infection rate of 14.7%.

The copyright holder for this preprint this version posted May 12, 2021. ; https://doi.org/10.1101/2021.05.08.21256892 doi: medRxiv preprint

A few unanswered questions could still affect herd immunity. One key issue is how long the vaccine immunity will last. Definitive evidence regarding the duration of immunity protection is lacking [30]. Another issue is moral hazard, that is, whether vaccinated individuals will change their behaviors and undertake more social interaction [31]. This change could result in higher risks of infection and a delay in reaching herd immunity.

Our study has a few limitations. We covered only the early periods of vaccination rollout, when the demand for vaccines was greater than the supply. As more individuals become vaccinated, the vaccination pace will likely slow due to the decrease in demand. In addition, our model predictions assume a continuation of the non-pharmaceutical interventions in place in early March. Relaxation of these policies would likely increase the time needed to reach herd immunity. Our SIR model assumes that only susceptible individuals undergo vaccination. However, in real life, many individuals who recovered from COVID-19 later received vaccines. As a result, our model predictions are optimistic, and in practice herd immunity is likely to be achieved later than predicted. In addition, our study assessed the effects of vaccination in the U.S. using mRNA vaccines.
Further studies are needed to evaluate vaccination in other countries using different types of vaccines.

Our study provides strong evidence that vaccination has significantly decreased COVID-19 cases and hospitalizations in the U.S. At the average pace of vaccination between January and early March, our model predicts that herd immunity will be achieved around the last week of July 2021, with a cumulative vaccine coverage of 60.2% and a total infection rate of 13.3%. These findings provide grounds for optimism that the pandemic will end during 2021 in the U.S. However, a few

Vaccination rate is the number of individuals vaccinated per hundred. The solid line in each panel is a fitted line relating the growth rate of total cases/hospitalizations to the vaccination rate. a, Association between the growth rate of total cases and at least 1 dose of vaccination (coefficient = -0.006, R² = 35.3%). b, Association between the growth rate of total cases and 2 doses of vaccination (coefficient = -0.013, R² = 28.6%).
c, Association between the growth rate of total hospitalizations and at least 1 dose of vaccination (coefficient = -0.003, R² = 20.8%). d, Association between the growth rate of total hospitalizations and 2 doses of vaccination (coefficient = -0.007, R² = 16.6%).

This section summarizes the data used in our analysis. Our supplementary notes give further details, including a summary statistics table for all variables.

Nonpharmaceutical Interventions
In addition to epidemiological data, we obtained information on nonpharmaceutical intervention policies. We adopted the policy stringency index constructed by the Oxford COVID-19 Government Response Tracker [40], which systematically collects information on policy responses implemented by governments in response to the pandemic. We focused on the policy category of "containment and closure," which covers eight policies: school closings, workplace closings, cancelation of public events, restrictions on gathering sizes, cessation of public transportation, stay-at-home requirements, restrictions on internal movement, and restrictions on international travel. This stringency index is a weighted score across these eight containment and closure policies and is scaled between 0 and 100. A detailed explanation of these measures was given by Hale et al. (2021) [41]. We determined the stringency index for each state on a weekly basis by averaging the daily data.

Meteorological Data
Another important set of independent variables describes the local climate. We obtained station-level hourly weather data provided by the National Centers for Environmental Information [42]. These station-level weather data were then matched with the station location and corresponding state provided by the Global Historical Climatology Network Daily [43]. We calculated the average values from these weather reports for each week across all stations within each state. Given the lack of humidity data, temperature and snow depth were used as our climate measures.

Several large-scale mass gatherings for political campaigns and protests also occurred during our study period. We constructed binary measures for election rallies [44]. For states with a rally during week t, this binary measure takes the value of 1 for week t and for the week after (t+1). Our Black Lives Matter (BLM) data from Elephrame offered detailed information (date, location, etc.) about each demonstration from news reports [45], which we extracted using a Web scraper. We then calculated the total number of demonstrations that occurred across all cities within each state for each week.

Sociodemographic Data
We also collected the sociodemographic characteristics of each state's population using 2019 estimates from the U.S. Census Bureau [46,47]. Specifically, we downloaded data on age, race, and income. For each state, we derived the proportion of individuals 65 years of age and older in the population, the proportion of the white population, and the income, and calculated the national median of each. We then constructed each sociodemographic variable as a binary indicator of whether the state lies above or below the national median. Finally, our data for the 2020 Electoral College results were obtained from the National Archives [48]. We classified the states into those won by Joe Biden and those won by Donald Trump.

A reduced-form empirical model, equation (1), was used to estimate the impact of vaccination on the pandemic. In equation (1), $Y_{i,t}$ is the dependent variable that measures the growth of either total cases or total hospitalizations in state $i$ at period $t$. Our baseline measure is the growth rate of the cumulative total from one period to the next. Our key independent variable, $\mathit{Vac}_{i,t-1}$, is the rate of vaccination of state $i$ in period $t-1$, and $\alpha_1$ is the coefficient of interest. We used two measures of vaccination rate: the number of vaccinated people (i.e., those who had received at least one dose of vaccine) per hundred and the number of fully vaccinated people (i.e., those who had received two doses of vaccine) per hundred.

As the proportion of susceptible individuals in the total population decreases over time, the growth rate of infection may also decline.
To deal with this intrinsic dynamic, $S_{i,t-1}/L_i$ was included in the regression model to control for the stock of susceptible individuals $S_{i,t-1}$ in the total population $L_i$. We measured $S_{i,t-1}$ as the difference between the population size and the total number of infections. To adjust for differences in testing intensity across states, we added $T_{i,t-1}/L_i$ to control for the number of tests relative to the total population. Our control variables contain a dummy variable $\mathit{Rally}_{i,t}$, which equals 1 when an election rally occurred in state $i$ at period $t$. We also added a variable $\mathit{Protest}_{i,t}$, which is the number of protests held across all cities in state $i$ at period $t$. To capture the influence of climate on the pandemic, we included measures of state-level meteorological conditions, including average temperature, temperature deviation from the state mean, and the logarithm of the average snow depth. Note that we included state fixed effects ($\mu_i$) to capture state-specific unobserved factors that are time-invariant, such as location, geography, and culture. We also included week fixed effects ($\lambda_t$) to capture unobserved shocks that are common across states, such as macroeconomic conditions. Finally, $\varepsilon_{i,t}$ is the random error term of the model, which has a mean of zero.

We estimated equation (1) using ordinary least squares (OLS) with weekly data for the 50 states and DC in the baseline. Robust standard errors for the estimated coefficients were calculated with two-way clustering at the state and week levels [49]. Clustering by state allows for within-state autocorrelation in the error term, which captures the persistence of the pandemic within each state. Clustering by week allows for spatial autocorrelation in the error term, which captures common pandemic shocks or systematic misreporting across states.

In our SIR model with vaccination (equation (2)), $S_{i,t}$ is the state-specific ($i$) and time-varying ($t$) proportion of susceptible individuals in the population, $I_{i,t}$ is the proportion of infected individuals, and $R_{i,t}$ is the proportion of recovered (plus dead) individuals. $\beta_{i,t}$ is the infection rate, which determines the spread of the pandemic. $\gamma_i$ includes both recovered individuals and deaths and is referred to as the removal rate [5]. Here $\gamma_i$ varies only by state and not over time. $\delta_{i,t}$ is the proportion of vaccinated individuals, and $e$ is the population-level vaccine effectiveness, which remains the same across states and time.

We fit the SIR model above with state-level COVID-19 epidemiology data, from which we observed data on the cumulative number of cases, deaths, and vaccination doses administered. Only 29 of the 51 states (counting DC as a "state" for this purpose) reported valid recovery data.

$$\beta_{i,t} = a_0 + a_1 \cdot \mathit{NPI}_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t}.$$

Similarly, we estimated vaccination coverage using analogous equations that control for state and time fixed effects. We adopted two vaccination measures in our data: the total number of people who had received at least one vaccine dose and the total number of fully vaccinated people. No time trend was observed in the doses administered to people receiving at least one dose of vaccine, but an apparent time trend was seen in the doses administered as second doses. We therefore added a time trend to the vaccination equation when we conducted the sensitivity check using the total number of fully vaccinated people as our measure of vaccination.
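The forward simulation implied by this model can be illustrated with a minimal sketch. The update rules below are an assumption (a textbook weekly discrete-time SIR step in which a share `eff * pace` of the population moves from susceptible to removed each week), not the exact system in equation (2). The function name, the parameter names, the herd-immunity check `beta * s / gamma < 1`, and the numerical values are all hypothetical and chosen only for illustration.

```python
def simulate_sir_vax(s0, i0, beta, gamma, pace, eff, hesitancy=0.0, max_weeks=520):
    """Run a weekly SIR-with-vaccination sketch until the reproduction number drops below 1.

    All quantities are population shares. Returns (week, coverage, cumulative_infected),
    or None if herd immunity is not reached within max_weeks.
    """
    s, i = s0, i0
    coverage = 0.0             # cumulative share of the population vaccinated
    cum_infected = 1.0 - s0    # assumption: everyone not susceptible at t=0 was infected
    cap = 1.0 - hesitancy      # coverage stops once (1 - x) of the population is vaccinated
    for week in range(max_weeks):
        if beta * s / gamma < 1.0:   # assumed form of the time-varying reproduction number
            return week, coverage, cum_infected
        new_inf = beta * s * i
        new_vax = min(pace, max(cap - coverage, 0.0))
        # Only susceptible individuals gain immunity from vaccination.
        immunized = max(min(eff * new_vax, s - new_inf), 0.0)
        s -= new_inf + immunized
        i += new_inf - gamma * i
        coverage += new_vax
        cum_infected += new_inf
    return None


# Hypothetical parameters, for illustration only (not the calibrated values).
fast = simulate_sir_vax(s0=0.90, i0=0.01, beta=0.35, gamma=0.30, pace=0.02, eff=0.73)
slow = simulate_sir_vax(s0=0.90, i0=0.01, beta=0.35, gamma=0.30, pace=0.01, eff=0.73)
print(fast, slow)
```

Under these illustrative parameters, the faster pace reaches the threshold in fewer weeks and with a lower cumulative infection share. Setting `hesitancy` above zero caps coverage in the way described for the hesitancy sensitivity analysis.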
We used equations (3) and (4)

For each given vaccination pace, we ran the simulation forward and projected the future dynamics of the pandemic across the U.S., assuming that no changes are made in nonpharmaceutical interventions. We then computed the time required for every state to achieve herd immunity and calculated the share of the U.S. population vaccinated when herd immunity is achieved. In addition, we conducted a sensitivity analysis regarding herd immunity with variations in vaccine effectiveness and with the addition of vaccine hesitancy. We incorporated vaccine hesitancy into our model by assuming that if x% of the population is hesitant, the cumulative vaccination coverage in each state will stop increasing once (1 − x%) of the population is vaccinated.

The supplementary information provides supplementary methods, figures, and tables.

We used our reduced-form estimates to carry out back-of-the-envelope calculations to derive the number of new cases prevented by vaccination. For this purpose, we first calculated the counterfactual growth rate of total cases as $\hat{Y}_{i,t} = Y_{i,t} - \hat{\alpha}_1 \mathit{Vac}_{i,t-1}$, using the estimated coefficient $\hat{\alpha}_1$ and the observed vaccination rate $\mathit{Vac}_{i,t-1}$. Repeating this process with hospitalization data, we evaluated the impact of vaccination on the total number of hospitalizations during our sample period.

Here we provide more details on parameter estimation for our SIR model with vaccination.

In equation (2), $\gamma_i$ stands for the removal rate from the infection group. We calculated a state-specific but time-invariant $\gamma_i$ by considering both recovered individuals and deaths, following Hsiang et al. (2020) (footnote 2). We obtained complete death data over the study period, but valid recovery data are available for only 29 states (footnote 1). Therefore, we first calculated the average recovery and mortality rates in the 29 states for which we have valid recovery data, averaging over the $T$ weeks in the sample period. The removal rates in the remaining 22 states were assumed to be the median of the removal rates in the 29 states for which complete recovery data are available, that is, 30.15% (footnote 3).

Footnote 1. The states with valid recovery data are AL, AR, DC, ID, KY, LA, ME, MD, MA, MI, MN, MS, MT, NE, NH, NM, ND, OH, OK, PA, SC, SD, TN, TX, UT, VT, WV, WI, and WY. Although IA also reported recovery, the number was higher than the cumulative number of infections. We therefore excluded IA as well.
Footnote 2. Hsiang (2020)
Footnote 3. We adopted the median instead of the mean to dampen the influence of outliers, similar to Hsiang (2020).

Infection rate ($\beta_{i,t}$)
In our SIR model, $\beta_{i,t}$ determines the spread of the pandemic. We rearranged equation (2) to calculate $\beta_{i,t}$ in the 29 states for which recovery data allow the removal rate to be derived directly.
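The calibration of the removal and infection rates can be sketched as follows. This is a minimal illustration under stated assumptions: the infected share is assumed to evolve as `I[t+1] - I[t] = beta[t] * S[t] * I[t] - gamma * I[t]`, and the removal rate is taken as the average weekly share of active infections that recover or die. The exact formulas used in the paper did not survive in this text, and all function names here are hypothetical.

```python
from statistics import mean, median


def removal_rate(infected, removed):
    """Average weekly fraction of active infections that recover or die."""
    return mean(
        (removed[t + 1] - removed[t]) / infected[t]
        for t in range(len(infected) - 1)
        if infected[t] > 0
    )


def infection_rates(susceptible, infected, gamma):
    """Back out beta_t from I[t+1] - I[t] = beta_t * S[t] * I[t] - gamma * I[t]."""
    return [
        (infected[t + 1] - infected[t] + gamma * infected[t])
        / (susceptible[t] * infected[t])
        for t in range(len(infected) - 1)
    ]


def impute_removal_rate(known_rates):
    """Median removal rate, applied to states without valid recovery data."""
    return median(known_rates)
```

Using the median rather than the mean in `impute_removal_rate` mirrors the choice described above, which dampens the influence of outlying states.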
To estimate $\beta_{i,t}$ for the other 22 states with no recovery data, we first assumed that $\beta_{i,t}$ is determined by the stringency of nonpharmaceutical interventions and used the reduced-form equation

$$\beta_{i,t} = a_0 + a_1 \cdot \mathit{NPI}_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t},$$

which estimates $\beta_{i,t}$ for the 29 states with recovery data using the observed nonpharmaceutical interventions, along with state fixed effects ($\mu_i$) and time fixed effects ($\lambda_t$). We then inferred $\beta_{i,t}$ for the remaining 22 states (footnote 4) based on the estimated $a_1$, the observed policies, and the median estimates of state and time fixed effects. We also assumed that future non-pharmaceutical interventions would remain at the same level as in the last week of our sample (i.e., the week of March 1, 2021) when generating model predictions.

Vaccination rate ($\delta_{i,t}$)
We calculated the population share of newly vaccinated people by $\delta_{i,t} = (V_{i,t} - V_{i,t-1})/L_i$, where $V_{i,t}$ is the cumulative number of vaccinated people and $L_i$ is the total population size in state $i$. We then estimated $\delta_{i,t}$ with state fixed effects and time fixed effects. Specifically, we used $\delta_{i,t}^{(1)} = a_0^{(1)} + \mu_i^{(1)} + \lambda_t^{(1)} + \varepsilon_{i,t}^{(1)}$ to estimate the vaccination rate for the first dose and $\delta_{i,t}^{(2)} = a_0^{(2)} + a_1^{(2)} t + \mu_i^{(2)} + \lambda_t^{(2)} + \varepsilon_{i,t}^{(2)}$ for the second dose. We predicted $\delta_{i,t}$ for each state in future periods based on the estimated constants ($a_0^{(1)}$, $a_0^{(2)}$), coefficient ($a_1^{(2)}$), state fixed effects ($\mu_i^{(1)}$, $\mu_i^{(2)}$), and the median of time fixed effects ($\lambda_t^{(1)}$, $\lambda_t^{(2)}$).

Footnote 4. See footnote 1 for details, in which we list all states with complete recovery data.
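As a small worked illustration of the vaccination-rate step, the sketch below computes the weekly share of newly vaccinated people from cumulative counts and projects a future share from estimated components. The function names, argument names, and numbers are hypothetical; the projection simply adds a constant, a state effect, the median week effect, and an optional linear trend, as described above.

```python
from statistics import median


def weekly_vaccination_share(cumulative_vaccinated, population):
    """Share of the population newly vaccinated in each week after the first."""
    return [
        (cumulative_vaccinated[t] - cumulative_vaccinated[t - 1]) / population
        for t in range(1, len(cumulative_vaccinated))
    ]


def projected_share(constant, state_effect, week_effects, trend_coef=0.0, week_index=0):
    """Projected weekly share: constant + state effect + median week effect + trend."""
    return constant + state_effect + median(week_effects) + trend_coef * week_index


# Hypothetical cumulative counts for a state of 1,000 people.
print(weekly_vaccination_share([0, 100, 250], 1000))
```

Leaving `trend_coef` at zero corresponds to the first-dose equation; a nonzero value corresponds to the second-dose equation with a time trend.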
Vaccine efficacy ($e$)
According to previous studies, the Pfizer vaccine has an efficacy of 52.0% after the first dose, and the Moderna vaccine has an efficacy of 92.1% after the first dose (footnote 5).

We then matched the model-generated $\{\hat{I}_{i,t}\}$ from the equations above for each of the 22 states with observed data for cumulative infection $\{I_{i,t}\}$ by minimizing a loss function.

Footnote 5. Creech et al., 2021

We examined how well our calibrated model fits the empirical data. The infection rates predicted by our model match the general trend in the U.S. and in most states quite well (see Supplementary Table 1 and Extended Data Fig. 7); the average correlation was 99.69% at the national level. Supplementary Table 1 compares the fit of the model results with the empirical data for each state. There were two exceptions, Kentucky (KY) and Maryland (MD), for which our model predictions were off-target by relatively large margins. However, this was due to the estimated removal rates ($\gamma_i$) for these two states, which are outliers (Supplementary Fig. 1).

The basic reproduction number, $\mathcal{R}_0 = \beta/\gamma$, is the key measure used to assess the dynamics of the pandemic and to calculate the vaccination coverage needed to achieve herd immunity (footnotes 6 and 7). It is worth noting that this formula applies only during the early stage of the disease, when the susceptible density approaches 1. However, at a later stage of the pandemic and with vaccines, a considerable share of the susceptible population has been vaccinated or has recovered, so the share of susceptible individuals can be significantly less than 1. According to the definition of the basic reproduction number, at period $t$, an infected person is expected to infect $\beta_t S_t$ people per period, with an expected infection time of $1/\gamma$. Therefore, the time-varying reproduction number is $\mathcal{R}'_t = \beta_t S_t/\gamma$. At the beginning of the pandemic, we have $t = 0$, $S_t = 1$ and $\mathcal{R}_0 = \beta_0/\gamma$, which is consistent with the conventional definition.

Footnote 6. Sun, 2010
Footnote 7. Sun & Shi, 2011

To assess whether the U.S. as a whole has acquired herd immunity, we use the "Third Statistics" approach; that is, the third-worst state's reproduction number is used as the national-level "reproduction number". We used this measure to rule out the impact of outliers (footnote 8).

Footnote 8. As Supplementary Fig. 1

Extended Data Fig. 2 | COVID-19 infections (average total infection and hospitalization rates) before vaccination and average vaccination rate. a, Association between the total infection rate before vaccination and at least 1 dose of vaccination (coefficient = 0.0002, R² = 0.0%). b, Association between the total infection rate before vaccination and 2 doses of vaccination (coefficient = 0.0002, R² = 0.0%). c, Association between the total hospitalization rate before vaccination and at least 1 dose of vaccination (coefficient = 0.0000, R² = 1.0%).
d, Association between the total hospitalization rate before vaccination and 2 doses of vaccination (coefficient = 0.0001, R 2 = 0.9%). All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted May 12, 2021. ; https://doi.org/10.1101/2021.05.08.21256892 doi: medRxiv preprint Extended Data Fig. 3| Heterogeneity tests on the effect of vaccination across various state characteristics. Blue markers are the estimated effects of at least 1 dose of vaccine, and red markers are the estimated effects of 2 doses of vaccine. a, Effect of vaccination in states where the 2020 presidential election was won by Joe Biden versus Donald Trump. b, Effect of vaccination in states with non-pharmaceutical interventions more stringent than the national median (+) versus less stringent than the median (-). c, Effect of vaccination in states with the proportion of the elderly population (65+) greater than the national median (+) versus less than the median (-). d, Effect of vaccination in states with the proportion of the white population greater than the national median (+) versus less than the median (-). e, Effect of vaccination in states with per capita income greater than the national median (+) versus less than the median (-). f, Effect of vaccination in states with the share of Pfizer vaccine greater than the national median (+) versus less than the median (-). All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. 
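The time-varying reproduction number and the "Third Statistics" rule described in the model discussion can be illustrated with a minimal Python sketch. It assumes that "third-worst" means the third-highest state reproduction number and that herd immunity corresponds to a national value below 1; both are our reading rather than a confirmed detail of the authors' implementation. The function names and all parameter values are hypothetical and are not estimates from the paper.

```python
from typing import Dict


def reproduction_number(beta: float, gamma: float, susceptible_share: float) -> float:
    """Time-varying reproduction number: beta * S_t / gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return beta * susceptible_share / gamma


def third_worst(values: Dict[str, float]) -> float:
    """Third-largest value (assumed meaning of the 'third-worst' state)."""
    if len(values) < 3:
        raise ValueError("need at least three states")
    return sorted(values.values(), reverse=True)[2]


# Hypothetical (beta, gamma, S_t) per state; illustrative values only.
states = {
    "A": (0.30, 0.20, 0.60),
    "B": (0.25, 0.20, 0.70),
    "C": (0.40, 0.25, 0.50),
    "D": (0.90, 0.10, 0.80),  # deliberately extreme outlier
    "E": (0.20, 0.20, 0.90),
}

r_by_state = {
    name: reproduction_number(beta, gamma, share)
    for name, (beta, gamma, share) in states.items()
}

# The national figure ignores the two highest states, so one outlier cannot drive it.
national_r = third_worst(r_by_state)

# Assumed threshold: herd immunity when the national reproduction number is below 1.
herd_immunity = national_r < 1.0

print(f"national reproduction number: {national_r:.2f}, herd immunity: {herd_immunity}")
```

In this toy example the outlier state "D" has a reproduction number far above 1, yet the national figure is taken from the third-highest state, which shows how the rule limits the influence of poorly estimated removal rates.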
Extended Data Table 2. Baseline regression results. Standard errors (in parentheses) are two-way clustered at the state and week levels. Significance levels: *** p<0.01, ** p<0.05, * p<0.1. a People who received at least 1 dose of the vaccine. b People who received 2 doses of the vaccine. c Twenty-one observations of DC temperature were missing and were estimated using the average temperature of the neighboring states of Virginia and Maryland. Eight observations of Delaware temperature were missing and were estimated using the average temperature of the neighboring states of New Jersey and Maryland.

Extended Data Table 3. Baseline regression results with data of daily frequency. Standard errors (in parentheses) are two-way clustered at the state and week levels. Significance levels: *** p<0.01, ** p<0.05, * p<0.1. a People who received at least 1 dose of the vaccine. b People who received 2 doses of the vaccine. c Twenty-one observations of DC temperature were missing and were estimated using the average temperature of the neighboring states of Virginia and Maryland. Eight observations of Delaware temperature were missing and were estimated using the average temperature of the neighboring states of New Jersey and Maryland.
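The temperature imputation described in note c of the tables can be sketched as follows. This is a minimal illustration that assumes missing readings are filled with the same-date mean of the neighboring states; the notes do not spell out the exact procedure. The function name and the temperature values are hypothetical and are not data from the paper.

```python
from statistics import mean
from typing import Dict, List, Optional

Series = List[Optional[float]]


def impute_from_neighbors(target: Series, neighbors: List[Series]) -> Series:
    """Fill missing entries (None) in target with the same-date mean of the neighbors."""
    filled: Series = []
    for i, value in enumerate(target):
        if value is not None:
            filled.append(value)
            continue
        available = [n[i] for n in neighbors if n[i] is not None]
        # Leave the gap in place if no neighbor has a reading for that date.
        filled.append(mean(available) if available else None)
    return filled


# Hypothetical daily temperatures (degrees Celsius); illustrative values only.
temps: Dict[str, Series] = {
    "DC": [5.0, None, 7.0, None],
    "VA": [6.0, 8.0, 9.0, 4.0],
    "MD": [4.0, 6.0, 7.0, 2.0],
}

dc_filled = impute_from_neighbors(temps["DC"], [temps["VA"], temps["MD"]])
print(dc_filled)
```

Observed readings are left untouched; only the missing dates take the neighbors' average.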
Growth rate of cases (N = 7,497)

References:
Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine
Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine
WHO. WHO Coronavirus Disease (COVID-19) Pandemic
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe
The effect of large-scale anti-contagion policies on the COVID-19 pandemic
India under COVID-19 lockdown
Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period
Next-generation vaccine platforms for COVID-19
SARS-CoV-2 vaccines
The COVID Tracking Project. The data
COVID-19 testing and cases in immigration detention centers
Comparison of estimated rates of coronavirus disease 2019 (COVID-19) in border counties in Iowa without a stay-at-home order and border counties in Illinois with a stay-at-home order
Digital public health and COVID-19. The Lancet Public Health
COVID-19 vaccinations in the United States
Our World in Data: COVID-19-data
COVID-19 vaccine distribution allocations by jurisdiction - Pfizer
COVID-19 vaccine distribution allocations by jurisdiction - Moderna
Variation in US states responses to COVID-19 2.0. Blavatnik School of Government Working Paper
Climate data online: Dataset discovery
Global Historical Climatology Network Daily (GHCND)
List of post-2016 election Donald Trump rallies
List of 5,788 Black Lives Matter demonstrations
State population by characteristics
Median Household Income by State
National Archives. 2020 Electoral College results
Robust inference with multiway clustering

Our 21-week baseline period is from 12

We thank K.E. Warner and S. Mennemeyer for their feedback.

Funding: H.H. is supported by the startup grant from the City University of Hong Kong (grant no. 7200689).

All authors designed the analyses, interpreted the results, and designed the figures, and are listed alphabetically. X.C., H.H., R.S., and J.Z. contributed equally to the paper. H.H. and R.S. collected the data. J.Z. conducted the reduced-form empirical analysis. X.C. conducted the analysis with the SIR model. H.H., J.J., and R.S. wrote the paper. The authors declare no conflicts of interest.