key: cord-0942952-cxnwlroy authors: Chudik, A.; Pesaran, M. H.; Rebucci, A. title: COVID-19 Time-varying Reproduction Numbers Worldwide: An Empirical Analysis of Mandatory and Voluntary Social Distancing date: 2021-04-07 journal: nan DOI: 10.1101/2021.04.06.21255033 sha: 35cb561160f551190c9ff0c6f376976fc7459167 doc_id: 942952 cord_uid: cxnwlroy

This paper estimates time-varying COVID-19 reproduction numbers worldwide solely based on the number of reported infected cases, allowing for under-reporting. Estimation is based on a moment condition that can be derived from an agent-based stochastic network model of COVID-19 transmission. The outcomes in terms of the reproduction number and the trajectory of per-capita cases through the end of 2020 are very diverse. The reproduction number depends on the transmission rate and the proportion of susceptible population, or the herd immunity effect. Changes in the transmission rate depend on changes in the behaviour of the virus, reflecting mutations and vaccinations, and changes in people's behaviour, reflecting voluntary or government-mandated isolation. Over our sample period, neither mutation nor vaccination is a major factor, so one can attribute variation in the transmission rate to variations in behaviour. Evidence based on panel data models explaining transmission rates for nine European countries indicates that the diversity of outcomes resulted from the non-linear interaction of mandatory containment measures, voluntary precautionary isolation, and the economic incentives that governments provided to support isolation. These effects are precisely estimated and robust to various assumptions. As a result, countries with seemingly different social distancing policies achieved quite similar outcomes in terms of the reproduction number.
These results imply that ignoring the voluntary component of social distancing could introduce an upward bias in the estimates of the effects of lock-downs and support policies on the transmission rates.

The COVID-19 pandemic has claimed millions of lives and brought about very costly government interventions to contain it, with unprecedented and widespread economic disruption worldwide. China responded to the initial outbreak with strict and binding mandatory social distancing policies to contain the epidemic and it is widely credited to have been successful in eradicating the virus each time it resurfaced. At the other end of the spectrum, for example, Sweden initially attempted to let its epidemic run its course with only minimal interventions from the government. Other countries responded by adopting a mixture of policies, either by deliberate choice or due to popular opposition to the implementation of lock-downs or even milder forms of social distancing. Yet outcomes in terms of per capita cases and deaths do not always diverge or align with these priors.

The purpose of this paper is two-fold: (a) to compare time-varying estimates of COVID-19 effective reproduction numbers across a large number of diverse countries, and to highlight the wide disparities that exist across countries; and (b) to model empirically the effects of mandatory and voluntary social distancing, as well as incentives to comply, on the evolution of the virus transmission rate in a sample of European countries.

Reproduction numbers are epidemiological metrics that measure the intensity of an infectious disease. The basic reproduction number, denoted by $R_0$, is the number of new infections expected to result from one infected individual at the start of the epidemic. Within a classical susceptible-infective-removed (SIR) model the basic reproduction number is given by $R_0 = \beta_0/\gamma$, where $\beta_0$ is the initial (biological) transmission rate, and $\gamma$ is the recovery rate.
Since the transmissibility of a disease will vary over time due to changes in immunity, mitigation policies, or precautionary behavior, the effective reproduction number, which we denote by $R_{et}$, measures the R number $t$ periods after the initial outbreak. As we show in the paper, in the classical SIR model, we have $R_{et} = (1 - c_t)\beta_t/\gamma$, where $c_t$ is the share of the population that has been infected and the time-varying transmission rate $\beta_t$ reflects changes in contacts or susceptibility to infection. As a result, in our model, one can separate changes in $R_{et}$ due to the extent to which the susceptible population is shrinking, $1 - c_t$ (which we call herding), from those due to social distancing, whether mandated or voluntary. To estimate the COVID-19 time-varying transmission rates, $\beta_t$, we apply a new method proposed by Pesaran and Yang (2020), henceforth PY, based on a moment condition that can be derived from an agent-based stochastic network model of epidemic diffusion. As PY show, a linearized version of this moment condition can aggregate up to the classical deterministic SIR model. From this moment condition, we then derive reduced form regressions in confirmed cases that control for under-reporting due to unreported asymptomatic cases and/or the absence of universal testing. The estimation approach that we propose is applicable to any level of jurisdiction and could provide guidance on how to measure the health impact in causal studies of specific mitigating policies. For the sake of brevity we report the results for selected countries and regions, but compute rolling estimates of effective reproduction numbers for all jurisdictions for which Johns Hopkins University (JHU) reports case statistics. 1 This is important since measuring health outcomes is challenging in studies that seek to establish causal effects of policies to address COVID-19. Our method of moment estimation requires only data on infected cases, thus complementing estimation methods based on death statistics.
The reported numbers of infected cases and deaths are problematic, and different countries might have better quality data on either one or the other. For example, Spain has very good death statistics. In other countries, death statistics have undergone major revisions on several occasions. For example, the United Kingdom death toll was revised downward by 5,377 on August 12, 2020 after a review concluded that daily death figures should only include deaths which had occurred within 28 days of a positive COVID-19 test. Our estimation method is not only simple to apply, but also fairly robust to the under-reporting of infected cases. Many existing estimation methods of reproduction numbers do not allow for measurement errors and might not be robust to the under-reporting problem. For instance, the seemingly unrelated regression estimates developed by Korolev (2020) may be biased downward if one neglects under-reporting of confirmed cases. Stock (2020) focuses on measurement errors and explores the benefits of randomly testing the general population to determine the asymptomatic infection rate. Following the medical evidence in Gibbons et al. (2014), we use a multiplication factor to allow for under-reporting in a way that will be elaborated in later sections of the paper. This is particularly important given that it has been widely acknowledged that the number of reported infected cases may suffer from considerable under-reporting, especially during the early stages of the epidemic. For example, one estimate is that only 14 percent of all infections were documented in China prior to the January 23, 2020 travel restrictions. This translates to a multiplication factor of 1/0.14 ≈ 7.14. Jagodnik et al. (2020) estimate that the recorded cases were under-reported by a multiplication factor in the range of 3 to 16 times in the seven countries that they considered as of March 28, 2020. 2 According to the Centers for Disease Control and Prevention study of Havers et al.
(2020) , in the United States, the number of infected cases is likely to be 10 times more than reported based on antibody tests from March through May, 2020. More recently, Rahmandad, Lim, and Sterman (2020) estimate that the cumulative cases across 86 countries through July 10, 2020 are 10.5 times the number of officially reported cases, with a 10 th − 90 th percentile range of 3.35 − 23.81. We find that for China the effective reproduction number drops below 1 within 30 days of the lock-downs. The reproduction number estimates obtained for other countries initially also fall, as found for instance by Atkeson, Kopecky, and Zha (2020a) . Contrary to evidence based on COVID death statistics, however, the pace and the magnitude of the decline varies significantly across countries. For example, we estimate it took about three times longer for the southern hemisphere to bring the reproduction number down to one compared with the northern hemisphere, excluding China. The critical difference between China and all other countries is not only the initial slower decline of the reproduction number, but also the fact that, with very few exceptions, the epidemic was never eradicated completely and hence it resurfaced every time restrictions were loosened. This is important because without herd immunity, community transmission resumes as soon as social interaction resumes. Indeed, our estimates show that, in most cases, the estimated reproduction number does not remain permanently below one and most countries experience more than one wave, with the second wave typically larger than the first one. 
The paper also finds that, whilst in China and Asia the reproduction number is driven entirely by a reduction in contacts and vulnerability to COVID-19, in the United States, the United Kingdom, Brazil, and several other countries, community transmission drops also due to the rise in the proportion of population becoming infected, and the rise in the relative importance of herd immunity. Critically, our results show that countries with seemingly different social distancing policies achieved quite similar outcomes in terms of the reproduction numbers. For a better understanding of the factors behind the evolution of effective reproduction numbers, we separate the herding component from the transmission rate and empirically model the latter for a sample of European countries with a similar start to the outbreaks in March 2020, but with differing outcomes subsequently. In line with the agent-based stochastic epidemic network model of PY, we argue that it is the transmission rate, and not the R number, that depends on behavioral changes. The agent-based model shows the transmission rate to depend on the average number of contacts multiplied by individual-specific susceptibility to become infected, which in turn depends on the average duration of contacts, the wearing of face masks, and other recommended precautions. Accordingly, in our empirical analysis we assume that a country's time-varying country-specific transmission rate depends on three factors. Consistent with a simple decision theoretic model presented in the paper and a large body of empirical evidence, we distinguish between government-mandated social distancing policies and voluntary self-isolation. We also control for government economic support that affects the incentive to comply as reported in survey data (e.g., Papageorge et al., 2021; Hamermesh, 2020).
To measure mandated social distancing and incentives to comply with these policies we use the stringency and support indices compiled by the Oxford Coronavirus Government Response Tracker. 3 To assess the impact of voluntary social distancing, we allow for a threshold effect capturing the impact of fear of becoming infected arising from news of rising cases on individual precautionary behavior. The importance of these factors in controlling the effective rate of transmission is jointly estimated within the context of the epidemic model, allowing for a lag of two or three weeks between the policy or behavioral changes and the infection outcomes.

3 Available at https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker.

We find that all three determinants of the transmission rate are statistically highly significant and have the expected signs. However, consistent with the heterogeneity documented in the first part of the empirical analysis, we also find that when we control for voluntary behavior by adding the threshold effect, the magnitudes of estimated coefficients on stringency and compliance indicators decline markedly. This is clear evidence suggesting that the role of mandatory policies might be overestimated in studies that do not explicitly allow for voluntary self-isolation and herding. In addition, our estimates suggest that voluntary social distancing alone would not have been sufficient to bring the R number below one in Europe and to keep it there without substantial contributions from herd immunity and/or mass vaccination.

To summarize the main message, our empirical analysis shows that mandatory social distancing is critical, as voluntary social distancing alone does not seem capable of bringing the reproduction number below one. Draconian mandatory social distancing, as in the case of China, can succeed. However, in light of the economic and social costs of such an approach, other countries attempted to pursue alternative strategies.
As a result, epidemic curves show a great deal of heterogeneity. They show that protracted, albeit not full, China-style lock-downs alone are not sufficient to contain the epidemic, although they are a necessary ingredient of a policy response aimed at bringing the effective reproduction number below one. Similarly, it does not seem plausible for voluntary social distancing to do the job on its own, without relying on herd immunity and/or mass vaccination. Related Literature A very large body of research investigates the COVID-19 outbreak and the policies to contain its spread. 4 For example, Fang, Wang, and Yang (2020) analyze efforts to contain the COVID-19 outbreak in China, measuring the effectiveness of the lock-down of Wuhan and showing that these policies also contributed significantly to reducing the total number of infections also outside of Wuhan. Similarly, there is ample reduced form evidence on the impact of mandatory social distancing using state and county level data in the case of the United States, and for a few other countries. However, there are not many studies on the relative importance of mandatory and voluntary social distancing, especially for large cross sections of countries. Caselli et al. (2020) find that both lock-downs and voluntary social distancing helped contain the first wave of COVID-19, but mandatory interventions have been critical. Jinjarak et al. (2020) find that more stringent policies are associated with lower mortality growth rates in a large cross section of countries but with some heterogeneity depending on demographics, the degree of urbanization and political freedom, as well as the international travel flows. In general, however, countries with more stringent policies at the onset of the epidemic realized lower peak mortality rates and exhibited lower duration during the first epidemic wave. 
We distinguish not only between mandatory and voluntary social distancing, but also consider the role of herd immunity in lowering the reproduction number. To our knowledge, no study which considers voluntary or government-mandated social distancing also controls for the possibility of herd immunity and distinguishes its impacts on effective reproduction numbers from the influence of policy and/or behavioral factors. A number of studies consider the effects of different intervention strategies (such as isolating the elderly, closing schools and/or workplaces, and alternating work/school schedules), which should lower the average number of contacts of specific age groups, contact locations, or time windows relative to normal (pre-COVID) patterns, using calibrated behavioral SIR or compartmental models. 5 We take an empirical/econometric approach, calibrating only the recovery rate, $\gamma$, a parameter on which we have much more precise clinical information.

5 See for example Acemoglu et al. (2020), Akbarpour et al. (2020), Atkeson, Kopecky, and Zha (2020b), Cakmakli, Demiralp, Kalemli-Ozcan, Yesiltas, and Yildirim (2021), Cakmakli, Demiralp, Kalemli-Ozcan, Yesiltas, and Yildirim (2020), Matrajt and Leung (2020), Toda (2020), and Chudik, Pesaran, and Rebucci (2020) among many others.

Various methods are available in the epidemiological literature to estimate the reproduction numbers at the beginning and/or in real time during epidemics, but there is no uniform framework. Estimation approaches that are data-driven and which involve simplifying assumptions include the use of the number of susceptibles at endemic equilibrium, the average age at infection, the final size equation, and calculation from the intrinsic growth rate of the number of infections (Heffernan, Smith, and Wahl 2005). Estimation methods for reproduction numbers based on different models are reviewed by Chowell and Nishiura (2008), Obadia, Haneef, and Boëlle (2012), and Nikbakht et al. (2019).
More recent contributions, focusing on estimation of reproduction numbers for the COVID-19 pandemic based on death statistics, include Atkeson, Kopecky, and Zha (2020b), Baqaee et al. (2020), Korolev (2020) and Toda (2020). Three closely related papers are Fernández-Villaverde and Jones (2020), Atkeson, Kopecky, and Zha (2020a), and Cakmakli and Simsek (2020). Fernández-Villaverde and Jones (2020) estimate transmission rates ($\beta_t/\gamma$) for many jurisdictions, as we do, but use the number of observed deaths to infer the number of infections. The rationale is that confirmed infections are subject to significant measurement errors due to limited testing. Our estimates require only case data and correct for measurement errors due to under-reporting associated with testing capacity. Atkeson, Kopecky, and Zha (2020a) report a set of stylized facts on death rates for a large cross section of countries and conclude that mortality falls rather uniformly across countries within the first 20-30 days after the first 25 cumulative deaths and remains low through the summer of 2020. As the paper notes, this implies that both the effective reproduction numbers and the transmission rates of COVID-19 fall uniformly from high and heterogeneous initial levels and remain relatively low thereafter. We document much more heterogeneity during both the first and subsequent waves. Cakmakli and Simsek (2020) estimate a Susceptible-Infected-Recovery-Death (SIRD) model allowing for unreported cases, as we do, for six large countries. But as noted above, our method of moment estimation only uses the number of infected cases and does not use data on removed (recoveries plus deaths) that are particularly unreliable.

The rest of the paper is organized as follows. Section 2 discusses our SIR model with social distancing. Section 3 discusses the econometric estimation based on a moment condition derived from an agent-based model.
Section 4 reports our estimates of the reproduction number for a selection of key countries and regions worldwide. Section 5 analyzes the relative importance of mandatory and voluntary social distancing. Section 6 concludes. An online appendix reports technical details and supplemental results.

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=3576703. It is made available under a CC-BY-NC-ND 4.0 International license. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

There are many approaches to modelling the spread of epidemics. The basic mathematical model widely used by researchers is the susceptible-infective-removed (SIR) model advanced by Kermack and McKendrick (1927). This model and its various extensions have been the subject of a vast number of studies, and have been applied extensively to investigate the spread of COVID-19. A comprehensive treatment is provided by Diekmann and Heesterbeek (2000) with further contributions by Metz (1978), Satsuma et al. (2004), Harko et al. (2014), Salje et al. (2016), amongst many others. The basic SIR model considers a given population of fixed size $n$, composed of three distinct groups: those individuals in period $t$ who have not yet contracted the disease and are therefore susceptible, denoted by $S_t$; the 'removed' individuals who can no longer contract the disease, consisting of recovered and deceased, denoted by $R_t$; and those who remain infected at time $t$, denoted by $I_t$. Thus,

$$S_t + I_t + R_t = n. \quad (1)$$

As it stands, this is an accounting identity, and it is therefore sufficient to model two of the three variables ($S_t$, $I_t$, and $R_t$) to obtain the third as the remainder. The classic SIR model is deterministic.
It is cast in the following set of difference equations (for $t = 1, 2, \ldots, T$):

$$S_{t+1} - S_t = -\beta s_t I_t, \quad (2)$$
$$I_{t+1} - I_t = (\beta s_t - \gamma) I_t, \quad (3)$$
$$R_{t+1} - R_t = \gamma I_t, \quad (4)$$

where $s_t = S_t/n$ is the share of susceptibles in the population. The parameter $\beta$ is the rate of transmission, while $\gamma$ is the recovery rate. For given non-zero initial values $S_1$ and $I_1$, and given parameter values for $\beta$ and $\gamma$, the evolution of the number of infected and recovered individuals is deterministic and given by the recursive solution of (2)-(4).

The evolution of the epidemic crucially depends on the two key parameters $\beta$ and $\gamma$. It is easy to see from equations (2)-(4) that, without any mitigating intervention, the epidemic will spread if $\beta/\gamma = R_0 > 1$ and will cease only after infecting $(R_0 - 1)/R_0$ of the population. The parameter ratio $\beta/\gamma = R_0$ is referred to as the basic reproduction number, also defined in a stochastic context as the expected number of secondary cases produced by a single infected individual in a completely susceptible population. The terminal condition $(R_0 - 1)/R_0$ is the herd immunity threshold. In the case of COVID-19, a number of different estimates have been suggested in the literature, placing $R_0$ somewhere in the range of 2.4 to 3.9. 6 So, the classical SIR model predicts that in the absence of intervention as much as 2/3 of the population could eventually become infected before herd immunity is reached. Such an outcome would involve unbearable strain on national health care systems and a significant loss of life. This well understood possibility triggered unparalleled mitigation and containment interventions, first by China and South Korea, then Europe, the US and all other countries around the world.
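As an illustration of the deterministic recursion discussed above, the following minimal Python sketch iterates a discrete-time SIR system and evaluates the herd immunity threshold $(R_0 - 1)/R_0$. The specific form of the recursion used in the code (new infections equal to $\beta s_t I_t$ with $s_t = S_t/n$, removals equal to $\gamma I_t$), the function names, and all parameter values are assumptions chosen for illustration; this is not the authors' code.

```python
def simulate_sir(beta, gamma, n, i1, periods):
    """Iterate a discrete-time SIR recursion and return the paths of S, I, R.

    Assumed form: new infections = beta * (S_t / n) * I_t, removals = gamma * I_t.
    """
    S, I, R = [n - i1], [float(i1)], [0.0]
    for _ in range(periods - 1):
        new_infections = beta * (S[-1] / n) * I[-1]
        new_removals = gamma * I[-1]
        S.append(S[-1] - new_infections)
        I.append(I[-1] + new_infections - new_removals)
        R.append(R[-1] + new_removals)
    return S, I, R


def herd_immunity_threshold(r0):
    """Share of the population (R0 - 1) / R0 at which active cases stop growing."""
    return (r0 - 1.0) / r0


# Illustrative (hypothetical) values: gamma = 1/14 and R0 = 3, so beta = R0 * gamma.
n = 1_000_000
gamma = 1 / 14
r0 = 3.0
S, I, R = simulate_sir(beta=r0 * gamma, gamma=gamma, n=n, i1=10, periods=400)

c = [(n - s) / n for s in S]                    # cumulative infected share c_t
thr = herd_immunity_threshold(r0)               # herd immunity threshold
peak = max(range(len(I)), key=I.__getitem__)    # period in which active cases peak

print(f"herd immunity threshold: {thr:.3f}")
print(f"infected share just before / at the peak: {c[peak - 1]:.3f} / {c[peak]:.3f}")
```

In this sketch the number of active cases stops growing in the period in which the cumulative infected share crosses the threshold, which is the sense in which $(R_0 - 1)/R_0$ acts as a turning point in the absence of any intervention.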
Such interventions, which broadly speaking we refer to as "social distancing", include case isolation, mandated face mask wearing, banning of gatherings, closures of schools and universities, and even local and national lock-downs, all aimed at slowing down the transmission rate of the virus. It is clear that these policies, together with voluntary changes in behavior in response to the epidemic, make it harder for the virus to transmit between individuals. One way to capture the impact of social distancing in the above model is to allow the transmission rate parameter, $\beta$, to be time-varying. In the remainder of the paper we will treat $\beta$ as time-varying, whilst we assume the recovery rate $\gamma$ is time invariant based on clinical evidence discussed below. We will refer to $\beta_t/\gamma$ as the "effective transmission rate". Following Pesaran and Yang (2020), hereafter PY, the transmission rate $\beta$ and its time-varying version $\beta_t$ can be represented by $\beta_t = \tau_t \kappa_t$, where $\tau_t$ is the individual vulnerability to infection given contact (or exposure intensity) and $\kappa_t$ is the average number of contacts per day. Social distancing, be it mandated and/or voluntary, can clearly influence $\tau_t$ and/or $\kappa_t$ and will result in time variation in the transmission rate. PY also show that the classic aggregate SIR model (2)-(4) with time-varying transmission rate can be obtained as an approximation (for a large population $n$) to an individual-based stochastic network model of an epidemic, where individuals randomly interact with each other. Stochastic simulation results obtained by PY also show that a single group model provides a good approximation to a multi-group alternative. In PY's setting, the effective reproduction number is given by

$$R_{et} = (1 - c_t)\,\beta_t/\gamma, \quad (5)$$

where $\beta_t/\gamma$ is the effective transmission rate, $c_t = (n - S_t)/n = 1 - s_t$ is the fraction of the population that has been infected (cumulation of new infected cases), and $1 - c_t$ is the herd-immunity component of $R_{et}$.
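The decomposition described above, $\beta_t = \tau_t \kappa_t$ together with the herd-immunity component $1 - c_t$, can be made concrete with a small numerical sketch. The function names and all numerical values below (vulnerability, contacts per day, infected share) are hypothetical and serve only to show how a fall in contacts and a rise in the infected share each lower the R number.

```python
def transmission_rate(tau, kappa):
    """beta_t = tau_t * kappa_t: vulnerability given contact times contacts per day."""
    return tau * kappa


def effective_reproduction_number(beta_t, gamma, c_t):
    """R_et = (1 - c_t) * beta_t / gamma, with c_t the share already infected."""
    return (1.0 - c_t) * beta_t / gamma


gamma = 1 / 14  # recovery rate used in the paper

# Hypothetical inputs: contacts per day halve under social distancing.
beta_before = transmission_rate(tau=0.03, kappa=7.0)
beta_after = transmission_rate(tau=0.03, kappa=3.5)

r_start = effective_reproduction_number(beta_before, gamma, c_t=0.0)
r_distancing = effective_reproduction_number(beta_after, gamma, c_t=0.0)
r_with_herd = effective_reproduction_number(beta_after, gamma, c_t=0.30)

print(r_start, r_distancing, r_with_herd)
```

With these made-up inputs, halving contacts alone leaves the R number above one, and it approaches one only once a sizeable share of the population has been infected, which mirrors the separation between behavioral and herd-immunity effects drawn in the text.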
In this setting, $R_{et}$ is the expected number of secondary cases produced by one infected individual in a population that includes both susceptible and non-susceptible individuals at time $t$. It is also worth bearing in mind that at the outset of the epidemic outbreak, assuming a fully susceptible population, we have $s_0 = 1$ ($c_0 = 0$), which in turn ensures that $R_{e0} = \beta_0/\gamma = R_0$. As the epidemic evolves, the average number of secondary cases caused by a single infected individual will vary over time as a result of the decline in the number of susceptible individuals (due to immunity or death) and/or changes in behavior due to social distancing. In our empirical analysis, we provide country-specific estimates of $R_{et}$ together with the effective transmission rate $\beta_t/\gamma$, which permits assessing the influence of herding. In the last section of the paper, we estimate the key determinants of $\beta_t$ across a number of European countries. Note that the herding component of $R_{et}$, namely $1 - c_t$, is given and cannot be affected by changes in mitigation policies.

In order to estimate country-specific time-varying transmission rates, $\beta_t$, we utilize the aggregate (non-linear) moment condition (6) derived from the agent-based stochastic model of PY, where $i_t = I_t/n$ is the number of infected individuals (scaled by population), and the last term of the condition is the approximation error in $n$, which we ignore since $n$ is large. Here, it is important to note that the global roll-out of inoculations can be a gradually more important additional factor in reducing the transmission rate.
Moreover, mutations might also play a more significant role in the months ahead. In estimating $\beta_t$ we face two data-related difficulties. The first difficulty is in obtaining data on active cases, $I_t = C_t - R_t$, where $C_t$ is the cumulative number of confirmed cases and $R_t$ is the number of recovered and/or deceased. This is not straightforward because data on $R_t$ either do not exist or are unreliable due to considerable measurement difficulties. Consider for example Europe. The recorded data on recoveries are unavailable for Spain and the UK; they are of poor quality for France and Italy; and they are relatively close to our estimated recoveries for Austria and Germany. To overcome this problem, we use the SIR model's recovery equation to impute data on recoveries based on the confirmed cases alone, assuming a recovery rate $\gamma = 1/14$. We obtain very similar results if we use $\gamma = 1/21$. 7 The choice $\gamma = 1/14$ is consistent with the assumptions made in designing quarantine policies based on clinical evidence and also used in calibrated behavioral epidemic models. 8 The second difficulty is with the measurement of confirmed cases, which are likely to be under-reported, in part due to the fact that a non-negligible portion (perhaps about a half) of the cases is asymptomatic and therefore unlikely to be detected without large-scale testing. To mitigate the problem of under-reporting, we follow the epidemiological literature (see, for example, Gibbons et al., 2014) and assume that the magnitude of under-reporting is measured by the multiplication factor (MF, the ratio of true cases to reported cases). Denoting the observed values of $c_t$ and $i_t$ by $\tilde{c}_t$ and $\tilde{i}_t$, we have $c_t = \mathrm{MF}\,\tilde{c}_t$ and $i_t = \mathrm{MF}\,\tilde{i}_t$. The moment condition can then be written in terms of the observed values ($\tilde{c}_t$ and $\tilde{i}_t$), giving (7). We do not know the true MF. We allow for the multiplication factor to be larger than one.
We abstract from time variation in MF, and note that the magnitude of MF will not matter when only a relatively small fraction of the population has been infected, as is true currently for COVID-19 worldwide. Estimation results in PY suggest an MF value of about 5 for most countries, declining slowly to about 2.5 towards the end of the sample. In figures reported below we use MF = 3 (a conservative value for the end of the sample, where herd immunity plays the most important role). In the online Appendix, we compare these results with the ones obtained for MF = 5. Estimates of the reproduction numbers are not sensitive to the choice of MF, whereas estimates of the effective transmission rate can be sensitive to the choice of MF towards the end of the sample in those countries where the reported share of infected population is relatively large. In our panel estimation results for a sample of European countries in Section 5, we report estimates for MF = 3 and 5, but the panel estimation results are very similar also for MF = 2 and 7. 9

7 While some countries might have good death statistics, using COVID-19 death data poses challenges similar to those raised by cases. The use of death data also has the added disadvantage of being a lagging indicator and could differ across countries due to factors such as age composition, obesity, and the quality of the care system.

8 See, for example, the medical evidence documented in Ferguson et al. (2020), which implies a value for $\gamma$ in the range 0.048 to 0.071. Our results are robust to assuming $\gamma = 1/21$.

One additional challenge is that the reported daily data are subject to weekly distortions (e.g., the reported number of cases on Sundays is usually lower compared with the infected cases reported for other days). To deal with this calendar distortion, as is common practice, we take seven-day moving averages of the reported data used in estimation.
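The seven-day smoothing step can be sketched in a few lines of Python. The text does not say whether the average is trailing or centered; the sketch below assumes a trailing window, and the function name and the daily counts are made up for illustration.

```python
def trailing_moving_average(values, window=7):
    """Average of the current and the previous window-1 observations.

    Assumption: a trailing window; early entries average the history available so far.
    """
    smoothed = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1): t + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical daily new cases with a recurring dip on the seventh day of each week.
daily = [120, 130, 125, 135, 128, 122, 40] * 3
smoothed = trailing_moving_average(daily)
print(smoothed[6:])
```

Because every full seven-day window contains each day of the week exactly once, the weekly dip is averaged out and the smoothed series is flat for this purely periodic example.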
But again we note that our results are robust if we use reported daily cases without averaging. Using the moment condition (7), we compute rolling-window estimates of the transmission rate, $\hat{\beta}_t$, as in (8), where $W$ is the rolling window size, which we set to 14 days. To estimate $\beta_t$ following (8), we need observations on per capita infected and active cases, $c_t$ and $i_t$. Using the recorded number of infected cases, $C_t$, and population data, $c_t$ is readily available. For all countries we estimate the number of removed (including recoveries and deaths) by $R_t = (1 - \gamma) R_{t-1} + \gamma C_{t-1}$, for $t = 2, 3, \ldots$, where the recovery rate $\gamma$ is set to 1/14, and the process starts from $R_1 = 0$ and the reported $C_1$. We then compute $I_t$ by subtracting the estimated $R_t$ from the recorded $C_t$.

9 The data from the Diamond Princess cruise ship reported by Moriarty et al. (2020) suggest about half of the COVID-19 cases are asymptomatic, and therefore MF = 2 seems to be a good lower bound.

10 The COVID-19 data are sourced from the repository of the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, available at https://github.com/CSSEGISandData/COVID-19. The population data (for year 2019) are obtained from the World Bank database, available at https://data.worldbank.org/indicator/SP.POP.TOTL.

We are now ready to report our model estimates. In this section we report country-specific estimates of the reproduction number, $R_{et}$. We refer to it below as simply the "R number". We plot it alongside the effective transmission rate, $\beta_t \times 14 = \beta_t/\gamma$, to separately assess the influence of herding from social distancing, for a large sample of countries. In the next section of the paper, we will identify and estimate factors that contribute to the evolution of $\beta_t/\gamma$, distinguishing between voluntary and mandatory social distancing for a selected group of European countries that have experienced similar starts to the outbreaks in March 2020, but with differing outcomes.
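The data construction and rolling estimation described above can be sketched as follows. The imputation of removed cases follows the recursion $R_t = (1 - \gamma) R_{t-1} + \gamma C_{t-1}$ stated in the text. The estimator, however, is a simplification and should not be read as the authors' estimator: in place of the non-linear moment condition, the sketch uses the linearized relation implied by the deterministic SIR model, $(1 - c_{t+1})/(1 - c_t) = 1 - \beta_t i_t$, and fits $\beta$ by least squares through the origin over a 14-day window. All function names and the synthetic data are hypothetical.

```python
def impute_removed(cum_cases, gamma=1 / 14):
    """R_t = (1 - gamma) * R_{t-1} + gamma * C_{t-1}, starting from R_1 = 0."""
    removed = [0.0]
    for t in range(1, len(cum_cases)):
        removed.append((1 - gamma) * removed[-1] + gamma * cum_cases[t - 1])
    return removed


def active_cases(cum_cases, gamma=1 / 14):
    """I_t = C_t - R_t, using the imputed removed cases."""
    removed = impute_removed(cum_cases, gamma)
    return [c - r for c, r in zip(cum_cases, removed)]


def rolling_beta(cum_cases, population, mf=3.0, gamma=1 / 14, window=14):
    """Rolling estimate of beta_t from the linearized relation (an assumption):
    1 - (1 - c_{t+1}) / (1 - c_t) = beta * i_t, with c and i scaled up by MF.
    """
    act = active_cases(cum_cases, gamma)
    c = [mf * v / population for v in cum_cases]
    i = [mf * v / population for v in act]
    y = [1.0 - (1.0 - c[t + 1]) / (1.0 - c[t]) for t in range(len(c) - 1)]
    estimates = [None] * len(c)
    for t in range(window, len(c)):
        xs, ys = i[t - window:t], y[t - window:t]
        sxx = sum(x * x for x in xs)
        if sxx > 0:
            estimates[t] = sum(x * v for x, v in zip(xs, ys)) / sxx
    return estimates


# Synthetic cumulative cases from a deterministic SIR path with a known beta.
n, beta_true, gamma = 1_000_000, 0.2, 1 / 14
S, I = n - 100.0, 100.0
cum = []
for _ in range(60):
    cum.append(n - S)
    new_infections = beta_true * (S / n) * I
    S, I = S - new_infections, I + new_infections - gamma * I

est = rolling_beta(cum, n, mf=1.0, gamma=gamma, window=14)
c_last = cum[-1] / n
print("effective transmission rate beta/gamma:", est[-1] / gamma)
print("R number (1 - c) * beta/gamma:", (1 - c_last) * est[-1] / gamma)
```

On data generated by the deterministic recursion with no under-reporting (MF = 1), the sketch recovers the transmission rate used in the simulation. With real data, the choice of MF rescales both $c_t$ and $i_t$, which is why it matters mainly when the infected share is no longer small.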
While we estimate the two parameters of interest for all jurisdictions for which JHU reports case statistics, in this section we report only the results for selected countries and regions. 11 The charts on the right-hand side report two lines. The solid (red) line is the estimated R number. The dotted (blue) line is the effective transmission rate, β t × 14 = β t /γ. This is the variable that we model in Section 5. Recall that the effective transmission rate, β t /γ, coincides with R et only when c t ≈ 0. As the epidemic spreads more widely and c t > 0, herd immunity can eventually start to play a non-negligible role, and manifests itself in later stages of the epidemic as an increasing gap between R et and β t /γ, depending on the magnitude of c t . Also, we expect β t /γ to be in the range 0 to 3 (similarly to R et ), and R et to be smaller than or equal to the effective transmission rate as the epidemic progresses. Thus the gap between the red and the blue lines is a function of s t = 1 − c t , the share of the susceptible (not yet infected) population. We start estimating the effective transmission rate, β t /γ, and hence the R numbers, when the seven-day moving average of new cases exceeds a threshold of 50 cases, to ensure a reasonably precise estimate of β t /γ. Note that at the early stages of the spread of the infection, when both c t and i t are close to zero, estimation of β t /γ becomes problematic, as can be seen directly from (6). In effect it involves computing the ratio of two very small numbers, each subject to sampling errors. Note also that, since some countries (in particular China) were able to virtually eradicate the virus in some sub-periods, there will be gaps in our charts reporting the R numbers.
11 The full set of estimation results is available on the authors' websites (sites.google.com/site/alexanderchudik/, pesaran.com, sites.google.com/site/alessandrorebucciphd/).
In addition, we start to report estimated R numbers at the beginning of the sample from the first day on which R et < 3. This is to avoid showing widely varying estimated values in the initial days of the epidemic driven by unusually large growth rates of new confirmed cases, which could reflect delays in reporting the number of infected cases.
China experienced a large first wave followed by a few small and localized outbreaks (Figure 1). Two points are worth highlighting. First, the R number comes down very fast, in less than a month during the first wave. This is consistent with disaggregate evidence in Fang, Wang, and Yang (2020) and also with clinical evidence. Second, the effective reproduction number always coincides with the effective transmission rate in the case of China, given that only a very small fraction of the population has been infected. The number of infected cases in China is 90,000 × M F out of a population of 1.4 billion. This is a very small share even if we set M F to 20, which is at the upper end of the estimates reported for M F across many countries and reviewed in the Introduction. This confirms that herd immunity had no role in the reduction of the effective reproduction number in the case of China. 12 When the epidemic resurfaces, the estimated effective transmission rate increases sharply, but the extremely small number of cases permitted due to aggressive containment strategies prevented any new large-scale spread of the virus. Note that under mandatory social distancing the population never reaches herd immunity. So infections recur if containment is relaxed and the virus has not yet been fully eradicated. Figure 1 shows that China was successful not only in containing the epidemic at the start of its outbreak, but has thus far also been able to eradicate it quickly whenever it has resurfaced through international travel. As we shall see, most other countries have not been able to accomplish this.
The bottom panel of Figure 1 reports results for the rest of the world excluding China. As we noted earlier, these estimates are based on aggregate cases, as opposed to averages of country-specific estimates. In the rest of the world, the COVID-19 epidemic started later than in China and the R number came down more slowly than in China, never really falling below one until the end of 2020. The R number increased from May to July 2020, and then again starting at the end of August 2020. As a result, the pandemic's incidence was many times higher than in China in terms of cases. Indeed, our estimation results show that even an R number slightly above one can be devastating once the epidemic has spread widely. Overall, the rest of the world as a whole never managed to eradicate the epidemic to an extent comparable to China. Not surprisingly, as restrictions eased during the summer of 2020, the epidemic resurfaced and worsened dramatically. Moreover, some of the decline in the R number is due to herd immunity, which is extremely costly in terms of lives and, possibly, long-term health consequences for the population.
12 The effective reproduction number coincides with the effective transmission rate in most other Asian countries. Nonetheless, even in Asia, we observe a great deal of heterogeneity in terms of the shape of the epidemic curve. Japan and Indonesia fared better at the start of the pandemic, but did not avoid a large second wave. South Korea, in contrast, had two waves, one in March 2020 and a second toward the end of 2020, possibly reflecting its decision to avoid China-style mandatory social distancing, embracing a strategy revolving around testing and tracing with less restrictive limits on mobility and interactions (results not reported but available from the authors).
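To make the herd immunity argument for China concrete, the short sketch below evaluates the infected share implied by 90,000 reported cases in a population of 1.4 billion for several values of M F , and the resulting gap between the R number and the effective transmission rate using R et = (1 − c t ) β t /γ. The function names and the illustrative value β t /γ = 1.5 are assumptions, not estimates from the paper.

```python
def infected_share(reported_cases, population, mf):
    """Per capita infections after scaling reported cases by the multiplication factor."""
    return mf * reported_cases / population


def effective_reproduction(beta_over_gamma, infected):
    """R_e = (1 - c) * beta / gamma."""
    return (1.0 - infected) * beta_over_gamma


REPORTED = 90_000      # reported cases in China, as quoted in the text
POPULATION = 1.4e9     # population of China, as quoted in the text
BETA_OVER_GAMMA = 1.5  # hypothetical effective transmission rate

for mf in (3, 5, 20):
    c = infected_share(REPORTED, POPULATION, mf)
    r_e = effective_reproduction(BETA_OVER_GAMMA, c)
    print(f"MF={mf:>2}  infected share={c:.5f}  gap={BETA_OVER_GAMMA - r_e:.5f}")
```

Even at M F = 20 the infected share is roughly 0.13 percent of the population, so the two lines in the China chart are visually indistinguishable.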
Comparing the Northern and Southern Hemispheres reported in Figure 2, we see that climate has made a difference to both the initial spread, which was faster in the northern winter, and the shape of the epidemic curve, which was more persistent in the Southern Hemisphere. It does not, however, make a significant difference in terms of the epidemic peak; the number of daily new confirmed cases peaked at about 10-12 per 100k population in the Northern Hemisphere, whereas the peak number of new cases (per 100k population) was about 10 in the Southern Hemisphere in January of 2021. In the South, the R number declined more slowly, but eventually dropped below one for several months in the middle of 2020. In both hemispheres, the estimates suggest that the COVID-19 transmission rate was falling in February 2021.
... China) and Sub-Saharan Africa. Large differences can be observed not only in terms of the magnitude of the peaks in new infections, but also in the trajectory of the epidemic more broadly. South Asia experienced a protracted single peak culminating in September 2020, which is reflected in the overall R number not falling below one from the start of the epidemic until early in September 2020. By contrast, Sub-Saharan Africa experienced two definite peaks (July 2020 and January 2021). North America and Western Europe experienced three major waves. The first wave occurred in March/April in both regions.
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=3576703. It is made available under a CC-BY-NC-ND 4.0 International license. The copyright holder for this preprint (this version posted April 7, 2021) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. medRxiv preprint doi: https://doi.org/10.1101/2021.04.06.21255033
After some significant community spread of the virus, containment policies were enacted which helped to bring the R number below one in a very short period of time. In North America, containment measures were relaxed more quickly, and therefore the R number did not stay below one for long, resulting in the second wave in the summer of 2020. By contrast, the R number stayed below one for longer in Western Europe, until about mid-summer, when the virus began to spread exponentially again, resulting in the second (and largest) European wave in the fall. After the new containment measures, the R number declined again, but it did not stay below one for long, resulting in the third wave of infections in January 2021 in both regions.
Experience from the remaining regions is more atypical than one might expect from the epidemic models. New cases in the Middle East and North Africa and, to some extent, East Asia and Pacific (excl. China) exhibit a broad upward trend throughout 2020 with a number of local peaks; new cases data for Latin America and Caribbean appear to be subject to much more noise than in any other region, and there is an unusual jump in the daily new cases in Eastern Europe and Central Asia, driven by the data for Turkey. R numbers closely reflect the first derivative of the smoothed version of the new cases data in all regions; new cases subside when R falls below one and increase when R is above one.
13 Table A1 in the online Appendix lists countries included in each region.
The difference between the solid red lines (R numbers) and the dotted blue lines (effective transmission rate) is virtually zero in the most successful regions in terms of the total number of cases, such as Sub-Saharan Africa and South Asia, suggesting that herd immunity played no role in these regions due to the relatively small number of overall infections. On the other hand, the gap between the two lines is largest in North America, followed by Western Europe, showing that herd immunity started to contribute more meaningfully to the mitigation of the epidemic in these regions from December 2020.
Clearly the trajectory of the epidemic has been quite heterogeneous across regions. In addition, there are considerable differences across countries within each region, to which we now turn for selected large countries. We report estimates for the United States, Brazil, India and Russia in Figure 5, for South Africa, Australia, Iran and Turkey in Figure 6, and for nine European countries in Figures 7-8: Belgium, France, Germany, Italy, Netherlands, Poland, Portugal, Spain, and UK. The selected countries include most of the G20 economies with the widest regional coverage globally. In contrast to China and the rest of the world, the United States (reported in the top panel of ... The US case also stands out because of the three very distinct waves, with the second and the third re-emerging after a brief fall of the R number below one.
This led to a much higher number of infections per 100,000 people compared to the rest of the world. Like the United States, Brazil's estimates also show visible gaps between the R number and the effective transmission rate starting in mid-2020. The case count in Brazil is more volatile compared to the United States and the remaining countries, possibly due to differences in the data quality other than under-reporting controlled for with the multiplication factor. Unlike the US case, Brazil brought down the R number more gradually, falling below one for the first time only during the summer of 2020. This resulted in a protracted first wave that peaked in August. The R number however did not remain below one for long, and in November a second large wave took off. India also experienced a protracted first wave. Estimates of the R number in India stayed above one until late September. Nevertheless, India did not experience a large number of cases per 100k population, compared with the remaining countries. As a result, herd immunity has not played a role in India. Russia, by contrast, experienced two large waves. Similarly to the western countries, Russia managed to bring the R number down relatively fast, but not permanently, resulting in a larger second wave at the end of 2020. A two-wave epidemic trajectory is also observed in the case of South Africa and Australia (in Figure 6 ), but with a different time profile. The first wave of the epidemic peaked in July 2020 in South Africa as authorities were unable to bring the R number below one quickly enough. South Africa, as the richest country in the region, stands out with much higher infection rates compared to the rest of Africa. Australia, on the other hand, managed the virus very well. 
We can see two small peaks, one in March and the second in July-August 2020, each followed by a rapid decline in the R number well below one, each time almost eradicating the virus without any discernible contribution from herd immunity. ... 2020. By contrast, new cases in Turkey were detected in March and remained low for quite a few months before rising dramatically to a peak of 165 per 100,000 in December 2020. The associated R numbers for Iran and Turkey also show very different trajectories, with Turkey's R number hitting the maximum value of 3 during the December 2020 peak.
The estimation results for selected European countries are reported in Figures 7-8. We report the same sample of countries as the one used in the next section for panel estimation of the transmission rate determinants. The virus outbreak in continental Europe begins with Italy in early 2020, with the recorded number of infections accelerating rapidly from February 21, 2020 onward. A rapid rise in infections takes place about one week later in Spain, Germany and France, followed by Austria (not reported) at the end of February. As the rolling estimates show, the R number fell below one in mid- to late April in all these countries. As lock-downs were eased during the summer, however, the transmission rates started to rise again. By the end of 2020, the R numbers were much more dispersed, with some countries doing better than others.
However, all large European countries reported in Figures 7-8 show a second wave much larger than the first one. The United Kingdom, Spain, Portugal and Netherlands exhibit distinct third waves, with larger case counts than in their second waves. In summary, only China and a few other countries have been successful in containing the COVID-19 epidemic well. Contrary to common perception, however, not all countries accomplished this with the same draconian mandatory social distancing as in China. So we now turn to explaining the effective transmission rates to better understand the heterogeneity that we described, focusing on the selected European countries reported in Figures 7-8, all of which experienced quite similar starting dates and initial waves of the epidemic, but quite different subsequent trajectories.
We saw earlier that the spread of the epidemic in Europe followed very similar patterns during the first wave, but diverged significantly towards the end of 2020 in terms of epidemic peaks, the level of effective reproduction numbers, and the importance of herd immunity in slowing down the spread of the virus. It is well known that many factors can contribute to the realized evolution of epidemics, and particularly the effective reproduction numbers, R et .
Noting that ... Also, as noted already, over our sample period, which covers the first ten months of the pandemic, the evolution of transmission rates across countries is primarily determined by changes in behaviour (average contact numbers and exposure intensities per contact), with mutation playing a secondary role, and vaccination having only a very small role in the case of a few countries towards the end of our sample. Therefore, in explaining the cross-country variations in β t in what follows we shall focus on policy interventions and behavioural factors. Consistent with the simple decision-theoretic model presented below, as well as a large literature on behavioral epidemic modeling, we consider three main factors: mandatory social distancing, economic incentives to comply with it in the form of economic support, and the awareness and information about COVID-19 and its rate of spread that can affect voluntary social distancing.
Mandatory social distancing directly reduces the number of contacts as well as the exposure intensity. A strong theoretical rationale for the imposition of mandated social distancing is the presence of externalities, i.e. the fact that agents do not internalize in their cost-benefit analysis that their individual behavior contributes to the aggregate diffusion of the epidemic. However, mandated social distancing imposes economic costs and infringes on individual liberty, leading to personal inconveniences (Hamermesh, 2020).
Economic support to workers, households and small businesses during the pandemic can shape the incentives of individuals to comply with mandatory social distancing, as it weakens the economic need to interact in work activities. Consider an individual who has a non-teleworkable job and is fired or furloughed. While this leads to an immediate loss of income, if economic support is adequate, individuals can weather the pandemic without needing to seek paid employment in exposed occupations and continuing to interact in production activities. Lack of compliance with social distancing has been documented empirically by Wright et al. (2020). Based on survey evidence, Papageorge et al. (2021) find that higher income is associated with larger changes in self-protective behavior, particularly for individuals who cannot telework. They conclude that, both in the United States and elsewhere, policies which assume universal compliance with self-protective measures or that otherwise do not account for socio-economic differences in the costs of doing so are unlikely to be effective or sustainable.
It is also well understood that risk induces precautionary behavior. Behavioral models of COVID-19 diffusion show that, as the probability of getting infected rises, individuals lower consumption and leisure activities to avoid infection (see Eichenbaum, Rebelo, and Trabandt (2020), Toxvaerd (2020), Atkeson (2021), and Gupta, Simon, and Wing (2020)). In particular, Battiston and Gamba (2020) provide cross-section evidence that the R number during a COVID-19 outbreak is lower the larger the size of the initial wave.
To further clarify and motivate our empirical approach and to provide some theoretical rationale behind our modelling strategy, here we introduce a simple decision-theoretic model of social distancing.
Consider an individual j from a fixed population of size n on epidemic day t, and suppose the individual in question is faced with the voluntary decision of whether to isolate or not. Under self-isolation, an individual who does not telework incurs the loss of wages net of any COVID-19 economic support, amounting to (1 − τ jt )w jt , plus the inconvenience cost, a jt , of being isolated, where w jt is the wage and τ jt is the proportion of lost income that is compensated by government support. For those individuals who can work from home, τ jt is likely to be 1 or very close to it. But for many workers who are furloughed or become unemployed, τ jt is likely to be close to zero, unless they are compensated by transfers from the government. On the other hand, if the individual decides not to self-isolate then he/she receives the uncertain pay-off of (1−d jt )w jt −d jt φ jt , where d jt is an indicator which takes the value of unity if the individual contracts the disease and zero otherwise. The parameter φ jt represents the cost of contracting the disease and is expected to be quite high. We rule out the possibility of death as an outcome and also assume that an individual who does not isolate and gets sick does not earn the wage. In this setting the individual decides to self-isolate if the sure loss of self-isolating is less than the expected loss of not self-isolating, namely if

$$(1 - \tau_{jt})\, w_{jt} + a_{jt} < E(d_{jt} \mid I_{t-1})\,(w_{jt} + \phi_{jt}),$$

where I t−1 is the publicly available information that includes c t−1 , the total number of infections.
We assume that the probability of anyone contracting the disease is uniform across the population and this is correctly perceived to be given by π t−1 . Hence E (d jt |I t−1 ) = π t−1 , and the condition for self-isolating on any day t can be written as

$$(1 - \tau_{jt})\, w_{jt} + a_{jt} < \pi_{t-1}\,(w_{jt} + \phi_{jt}),$$

or as

$$\mu_{jt} = \frac{1 - \tau_{jt} + a_{jt}/w_{jt}}{1 + \phi_{jt}/w_{jt}} < \pi_{t-1}.$$

Since π t−1 ≤ 1, for individual j to self-isolate we must have µ jt < 1 (note that µ jt ≥ 0). This condition clearly illustrates that an individual is more likely to self-isolate if the relative cost of contracting the disease, φ jt /w jt , is higher than the inconvenience cost of self-isolating plus the proportion of wages being lost due to self-isolation. Also, an individual is more likely to self-isolate voluntarily if the wage loss, measured by 1 − τ jt , is low, thus providing an additional theoretical argument in favor of compensating some workers for the loss of their wages, not only to maintain aggregate demand but also to encourage a larger fraction of the population to comply with mandatory social distancing. The above formulation can also capture the differential incentive to self-isolate across different age groups and sectors of economic activity. Given that the epidemic affects the young and the old differently, with the old being more at risk than the young, then φ old > φ young , and the old are more likely to self-isolate.
Similarly, low-wage earners are more likely to self-isolate than high-wage earners with the same preferences (φ jt and a jt ) and facing the same transfer rates, τ jt . But the reverse outcome could occur if low-wage earners face a higher rate of transfer than high-wage earners. These and many other micro predictions of the theory are embedded in this specification of the voluntary social distancing decision. According to this simple model, the fraction of the population that is willing to isolate voluntarily is given by

$$p_{n,t} = \frac{1}{n} \sum_{j=1}^{n} I(\mu_{jt} < \pi_{t-1}),$$

where I(A) is an indicator function that takes the value of 1 if A holds and zero otherwise. It is clear that the extent of voluntary social distancing, p n,t , is positively related to the size of the economic support, τ jt , and the perceived net cost of contracting the virus, φ jt − a jt , which could rise sharply when the epidemic surges and/or if health authorities provide better messaging about the true costs of contracting the disease.
To capture voluntary as well as mandatory social distancing policies, in our statistical analyses we make use of data compiled by the Oxford COVID-19 Government Response Tracker (OxCGRT) project, which is a standard source of comparable indices measuring social distancing and other COVID-19 policies across countries. 15 In particular, we use two aggregate indices: the 'policy stringency index' (capturing the containment and closure policies) and the 'economic support index' (as a proxy variable for support to comply with the containment policies). We model precautionary behavior leading to voluntary social distancing with a threshold effect explained in more detail below.
15 Data available at https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker.
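The self-isolation rule of the decision-theoretic model can be sketched numerically. The code below computes µ jt from the sure loss (1 − τ jt )w jt + a jt and the expected loss π t−1 (w jt + φ jt ) defined in the text, and then the share of individuals with µ jt < π t−1 . The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def mu(tau, w, a, phi):
    """Threshold mu = (1 - tau + a/w) / (1 + phi/w).

    An individual self-isolates when mu is below the perceived infection probability.
    """
    tau, w, a, phi = (np.asarray(v, dtype=float) for v in (tau, w, a, phi))
    return (1 - tau + a / w) / (1 + phi / w)


def isolation_fraction(tau, w, a, phi, pi_prev):
    """Share of individuals with mu < pi_prev, i.e. those who isolate voluntarily."""
    return float(np.mean(mu(tau, w, a, phi) < pi_prev))


# Three hypothetical individuals with identical wages and preferences:
# a teleworker (tau = 1), an uncompensated worker (tau = 0),
# and a furloughed worker with 80 percent of lost income replaced.
tau = [1.0, 0.0, 0.8]
w = [100.0, 100.0, 100.0]
a = [5.0, 5.0, 5.0]
phi = [1000.0, 1000.0, 1000.0]

print(np.round(mu(tau, w, a, phi), 4))
print(isolation_fraction(tau, w, a, phi, pi_prev=0.05))
```

In this example, raising the uncompensated worker's transfer rate lowers that worker's µ and so raises the isolating share, which is the channel through which economic support encourages compliance.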
The econometric specification that we propose is based on the same framework used in Section 4 for the estimation of the evolution of the R number. Namely, by taking logs on both sides of equation (6) above and recalling that n is large, we obtain equation (12), where the subscript j denotes individual countries, j = 1, 2, ..., N , and e j,t+1 is an error term, assumed to be orthogonal to i jt . We complement the structural equation (12) with the specification (13) for the transmission rate process, where x j,t−p is a vector of regressors, lagged p periods, and I(f j,t−p > τ f ) is an indicator variable which takes the value of unity if the threshold variable, f jt , also lagged p periods, goes above the threshold parameter τ f , which as a first-order approximation is assumed to be the same across countries. The choice of the threshold variable, f jt , is discussed below. In addition to estimating the panel regressions with a common constant term, a, we also check the robustness of our results by estimating the panel with fixed effects, where we replace a in (13) with country-specific intercept terms, a j , for j = 1, 2, ..., N . As a threshold variable, we use the 7-day moving average of the reported number of new cases.
Substituting (13) in (12), we obtain the estimating equation (14), in which u j,t+1 is a composite error term. Depending on the assumptions regarding the error terms u j,t+1 , we report three types of standard errors for the estimates. The first is the conventional pooled (or fixed effects) standard errors, which assume u j,t+1 is cross-sectionally as well as serially uncorrelated. The second, labelled "robust1", allows for serial correlation and heteroskedasticity, while the third, denoted "robust2", allows u j,t+1 to be correlated both over time and across countries (namely over both the j and t dimensions). Further details are provided in the online Appendix.
The parameters of interest are a (or country-specific a j ), ψ, κ, and τ f . Our regressors are weakly exogenous and therefore the specification in equation (14) can be estimated by least squares, since the time series dimension of the panel is very large (T = 321 to 343) and the cross-section dimension is rather small (N = 9). This is in contrast to short panels (T small and N large), where strict exogeneity is required for consistency of the least squares method. An additional concern with the specification could be omitted variables and the presence of other confounding factors. The proposed specification (14) is parsimonious and encompasses the main factors considered in the literature. As contacts and susceptibility are not separately identified in our model, we use aggregate indices of mandatory, support, and voluntary distancing, and we do not attempt to disentangle measures targeting the frequency of contacts or individual vulnerability.
We estimate the parameters of interest jointly using (14), as opposed to using a two-step procedure in which β jt is estimated from (12) first and its determinants are estimated in a second-stage regression from (13), using the estimated values of β jt from the first step as the dependent variable. The advantage of estimating determinants of the transmission rate directly based on (14) is that the joint approach allows us to obtain more precise estimates and hence make more accurate inferences.
We focus on the nine European countries discussed in Section 4: Belgium, France, Germany, Italy, Netherlands, Poland, Portugal, Spain, and the United Kingdom, so N = 9 is relatively small. As we shall see, however, the time dimension of our panel is reasonably large (between 321 and 343 days), and we do not expect the relatively small N to be a problem for estimation and inference. The reason for focusing on these 9 countries is that they experienced a similar start to the outbreak of the virus, but had differing outcomes subsequently. In this way we are able to exploit the cross-country as well as time series variations in the case data to identify the effects of mandatory and voluntary social distancing policies on the effective transmission rates, β jt /γ. Recall here that β jt /γ differs from the effective reproduction number, R j,et , given by R j,et = (1−c jt ) β jt /γ .
As we noted earlier, $R_{j,et}$ can fall below unity not because of the effectiveness of mitigating policies, but simply because an increasingly large fraction of the population has been infected, the so-called herd immunity effect. To avoid the confounding effect of herding on the outcome variable that we want to explain, we focus on modeling $\beta_{jt}/\gamma$ and not $R_{j,et}$.

Consider first the panel regressions without the threshold effect, given by (15), where $y_{j,t+1}$ is defined in (14) and $j = 1, 2, ..., 9$. The regressions are estimated with an unbalanced panel over the period February 23, 2020 to January 30, 2021, allowing for differences in the start dates of the outbreaks across countries. The pooled estimates for this specification are reported in Table 1. We report results for $MF = 3$ and $MF = 5$ (the multiplication factor used to correct for under-reporting) and for lag orders of $p = 14$ and $p = 21$ days. As can be seen, both the social distancing policy index and the index of policy support have the expected signs and are highly statistically significant. The results are robust to alternative MF corrections and lag lengths. Adding fixed effects slightly lowers the coefficient on the stringency index and increases the coefficient of the economic support indicator, but does not affect the overall conclusion. Using robust standard errors increases the estimated standard errors, as is to be expected, but these increases are not sufficiently large to change the inference that we make.
Both the stringency and economic support indices remain highly significant.

Another issue that could affect our inference is heterogeneity in the slope coefficients across countries, namely the consequence of replacing $\psi$ in (15) by $\psi_j$. Table 2 reports country-specific estimates of the coefficients of interest, as well as the associated Mean Group (MG) estimates. The MG estimates are computed as simple averages of the country-specific estimates, with robust standard errors as proposed by Pesaran and Smith (1995). An important advantage of the MG estimator is its robustness to slope heterogeneity, in which case it is a consistent estimator of the average slopes. An additional advantage is its robustness to weak cross-sectional correlations of the country-specific estimates (Chudik and Pesaran, 2019). The country-specific results continue to provide strong support for both explanatory factors. The MG estimates reported in Table 2 are all highly statistically significant and close to the fixed effects estimates in Table 1, which provides further evidence of the robustness of our main results to slope heterogeneity.

Table 3 adds the threshold variable to (15). The estimated threshold effects are also highly statistically significant. Interestingly, the stringency and support indices now have smaller coefficients, pointing to a possible upward bias in the estimated effectiveness of mandatory social distancing policies if the voluntary mitigating effects, driven by better information dissemination or the fear factor, are ignored. These results suggest that the role of mandatory policies might be overestimated in studies that do not explicitly allow for voluntary changes in behavior.
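The Mean Group calculation described above can be sketched as follows. This is a minimal illustration, assuming the standard nonparametric variance estimator associated with Pesaran and Smith (1995), namely the sum of squared deviations of the country-specific estimates from their average, divided by $N(N-1)$. The function name and the input numbers are hypothetical and are not taken from Table 2.

```python
import numpy as np

def mean_group(estimates):
    """Mean Group estimates and their standard errors.

    estimates: array-like of shape (N, k), one row of slope estimates per country.
    Returns (mg, se), each of length k.
    """
    est = np.asarray(estimates, dtype=float)
    n_units = est.shape[0]
    if n_units < 2:
        raise ValueError("at least two units are needed for the standard errors")
    mg = est.mean(axis=0)  # simple average of country-specific estimates
    # Nonparametric variance: squared deviations summed and scaled by N(N-1).
    var = ((est - mg) ** 2).sum(axis=0) / (n_units * (n_units - 1))
    return mg, np.sqrt(var)

# Hypothetical country-specific estimates for two coefficients.
country_estimates = [[-2.0, -0.5],
                     [-2.5, -1.0],
                     [-3.0, -0.3]]
mg, se = mean_group(country_estimates)
print("MG estimates:", mg)
print("MG standard errors:", se)
```

Because the standard errors are computed from the dispersion of the country-specific estimates themselves, they do not require the slopes to be identical across countries.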
Our preferred specification in Table 3, which includes the threshold effect, predicts an R number of around 5.11 (0.16) in the absence of any changes in behavior, whether voluntary or mandatory, with the standard error given in brackets. This is the estimated value of the common intercept, $\alpha$, reported in Table 3, and it is higher than the basic reproduction number of 3 to 4 assumed in the literature. In our earlier specifications in Table 1, which ignore the threshold effect, the estimated reproduction number lies in the range 3.87 to 4.13 (see the common intercept in Table 1). The contribution of the threshold effect to reducing the R number is quite large, estimated in a tight range between -2.19 and -2.35, bringing the R number down from 5.11 to just below 3. [16] The contribution from the economic support index can also be interpreted as voluntary to the extent that it captures the working of incentives to comply with mandatory restrictions, as discussed earlier. Our estimates of the contribution of economic support to the change in the R number range between -0.13 and -1.05, suggesting a high degree of heterogeneity across the specifications in Table 3, in contrast to the coefficients estimated for the other variables, which lie in a rather narrow range. [17] Taking the contribution from the threshold effect together with that of economic support as an upper bound on the impact of voluntary changes in behavior, the R number is estimated to fall to between 1.7 and 2.8, which is still well above unity.
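The arithmetic behind the range of 1.7 to 2.8 can be reproduced from the figures quoted above. The sketch below assumes that each effect is applied at full strength (threshold indicator equal to one, economic support index at its scaled maximum of one) and that the extreme values of the two ranges are combined, which is an assumption made here for illustration rather than a statement about any single specification in Table 3. The variable names are hypothetical.

```python
ALPHA = 5.11                       # common intercept reported for Table 3
THRESHOLD_EFFECT = (-2.35, -2.19)  # largest and smallest estimated reduction
SUPPORT_EFFECT = (-1.05, -0.13)    # largest and smallest estimated reduction

def r_after_voluntary(alpha, threshold, support):
    """R number after adding the threshold and economic support contributions."""
    return alpha + threshold + support

# Threshold effect alone: from 5.11 down to just below 3.
threshold_only = tuple(ALPHA + effect for effect in THRESHOLD_EFFECT)

# Threshold plus economic support: bounds on the voluntary component.
lowest = r_after_voluntary(ALPHA, THRESHOLD_EFFECT[0], SUPPORT_EFFECT[0])
highest = r_after_voluntary(ALPHA, THRESHOLD_EFFECT[1], SUPPORT_EFFECT[1])

print([round(value, 2) for value in threshold_only])  # [2.76, 2.92]
print(round(lowest, 2), round(highest, 2))            # 1.71 2.79
```

Both bounds remain above one, which is the basis for the statement that the voluntary component on its own leaves the R number well above unity.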
Recall that the R number needs to fall below one before we can be confident that case numbers will fall. Hence, our estimates suggest that voluntary behavior alone is not sufficient to bring the R number below one without substantial contributions from herd immunity. In contrast, the estimates of the coefficients of the stringency index lie in the narrow range of -2.14 to -2.39 (see Table 3). We conclude from this exercise that mandatory social distancing, in conjunction with the effects of voluntary social distancing, can bring the R number below one without any additional help from the herding component. The estimates of the stringency index coefficients in Tables 1 and 3 also show that failing to control for voluntary behavior can lead to overestimation of the impact of mandatory social distancing. Finally, given the highly infectious nature of COVID-19, with its basic reproduction number estimated to be between 3 and 4, mandatory policies are likely to be effective if combined with mass vaccination programs, and the present research should be viewed as an interim report on a complex, continuously evolving process.

[16] Interestingly, the threshold effect kicks in at very low levels of infections. In particular, the estimated threshold value is only 0.12 cases per 100,000 population.

[17] Regarding the interpretation of the estimated coefficients of the stringency and economic support variables, we note that we have divided the 0-100 Oxford stringency and economic support indices by 100, so that our scaled variables take values between 0 (no restrictions/economic support) and 1 (maximum restrictions/economic support, as defined by Hale et al., 2020).

This paper makes two related contributions. It first estimates effective reproduction numbers for all jurisdictions for which JHU reports COVID-19 statistics, based on a moment condition that can be derived from an agent-based stochastic network epidemic model. It then explains their evolution, distinguishing between herding, voluntary and mandatory social distancing, and incentives to comply, in a group of European countries with similar experiences at the outset of the pandemic but different outcomes subsequently.

From a methodological perspective, the econometric framework that we propose permits distinguishing, at any jurisdictional level, between changes in the effective reproduction number due to herd immunity and changes due to variation in average contacts or in susceptibility to infection, which are the structural determinants of epidemic diffusion. At the empirical level, using only JHU daily COVID-19 case statistics, the paper provides estimates of transmission rates, allowing for the under-reporting of infected cases in the available statistics. Importantly, it is shown that while targeted mandated policies can be very useful in flattening the epidemic curve, especially at the onset of the epidemic, they are not necessarily sufficient. In some cases, countries with seemingly different social distancing policies achieved quite similar outcomes in terms of the number of infections and reproduction numbers. There are also a small number of countries, for example Taiwan, Tanzania and Vietnam, that did not implement China's draconian measures, yet have so far managed to achieve similarly low levels of infections and transmission rates.
Consistent with this result, when we turn to explaining effective transmission rates in Europe, we find that all the factors considered played a significant role: the stringency of the mandatory policies implemented, economic support, and the threshold, which captures the voluntary component of social distancing. However, our estimates suggest that voluntary social distancing alone is not sufficient to bring the R number below one and to keep it there without substantial contributions from herd immunity. We also find that the role of the mandatory component, which is shown in the literature (including this paper) to be critical in containing and controlling the rapid spread of the virus, could be overestimated when voluntary mitigating drivers are neglected. Our main conclusions are robust to lag orders, error heteroskedasticity, error serial correlation, fixed effects and slope heterogeneity.

References

Acemoglu, D., V. Chernozhukov, I. Werning, and M. D. Whinston (2020
Consider the panel data model (15), which, for convenience, can be equivalently written as $y_{jt} = \theta'\zeta_{jt} + u_{jt}$, for $j = 1, 2, ..., N$, where $\zeta_{jt} = (1, x'_{j,t-p-1})'$. We allow for an unbalanced panel by assuming $t = 1, 2, ..., T_j$. Let $\hat{\theta}$ be the pooled estimator.
We have

Variance of $\hat{\theta}$ is given by

Assuming $E(\zeta_{jt} u_{jt}) = 0$, we obtain

We estimate $S_{nT}$ by the Newey-West method, extended to our panel setup, using terms of the form $\zeta_{jt}\zeta'_{j,t-\ell}\hat{u}_{jt}\hat{u}_{j,t-\ell}$ weighted by $w(\ell, m_j)$, in which $\hat{u}_{jt} = y_{jt} - \hat{\theta}'\zeta_{jt}$. We set $w(\ell, m_j) = 1 - \ell/(m_j + 1)$, and $m_j = m_{j,nT}$ is chosen to be a suitable increasing function of the sample size. We set $m_{j,nT}$ to be the integer part of $T_j^{1/3}$.

Allowing for correlation of errors over time, as well as across units (countries), requires a different estimator of $S_{nT}$. It is useful to re-write $S_{nT}$ as

in which we use $N_t$ as the index set of cross-section units with available observations for period $t$. $S_{nT}$ is estimated as
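The serial-correlation-robust calculation can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the usual sandwich form, with the pooled least squares estimator in the middle of two inverse moment matrices, and applies Bartlett weights $w(\ell, m_j) = 1 - \ell/(m_j + 1)$ within each unit with $m_j$ equal to the integer part of $T_j^{1/3}$. Sums are left unnormalized, so any scaling convention used in the paper may differ. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def bartlett_bandwidth(T):
    """Integer part of T**(1/3), guarding against floating-point rounding."""
    m = round(T ** (1.0 / 3.0))
    while m ** 3 > T:
        m -= 1
    while (m + 1) ** 3 <= T:
        m += 1
    return m

def pooled_ols_newey_west(y_by_unit, z_by_unit):
    """Pooled least squares with a within-unit Newey-West covariance.

    y_by_unit: list of 1-D arrays (lengths T_j may differ).
    z_by_unit: list of (T_j x k) regressor arrays that include the constant.
    Returns (theta_hat, covariance matrix of theta_hat).
    """
    Z = np.vstack(z_by_unit)
    y = np.concatenate(y_by_unit)
    ZtZ = Z.T @ Z
    theta = np.linalg.solve(ZtZ, Z.T @ y)
    k = Z.shape[1]
    S = np.zeros((k, k))
    for yj, zj in zip(y_by_unit, z_by_unit):
        g = zj * (yj - zj @ theta)[:, None]  # rows are zeta_jt * u_hat_jt
        S += g.T @ g                         # lag-zero term
        m = bartlett_bandwidth(len(yj))
        for lag in range(1, m + 1):
            w = 1.0 - lag / (m + 1.0)        # Bartlett weight
            gamma = g[lag:].T @ g[:-lag]     # sum over t of g_t g_{t-lag}'
            S += w * (gamma + gamma.T)
    ZtZ_inv = np.linalg.inv(ZtZ)
    return theta, ZtZ_inv @ S @ ZtZ_inv

# Synthetic illustration: three units with serially correlated errors.
rng = np.random.default_rng(1)
theta_true = np.array([5.0, -2.0])  # hypothetical values
ys, zs = [], []
for T_j in (321, 330, 343):
    x = rng.uniform(0.0, 1.0, size=T_j)
    e = 0.3 * rng.standard_normal(T_j)
    u = np.zeros(T_j)
    for t in range(1, T_j):
        u[t] = 0.5 * u[t - 1] + e[t]    # AR(1) errors
    z = np.column_stack([np.ones(T_j), x])
    zs.append(z)
    ys.append(z @ theta_true + u)

theta_hat, cov = pooled_ols_newey_west(ys, zs)
robust_se = np.sqrt(np.diag(cov))
print("estimates:", theta_hat)
print("robust standard errors:", robust_se)
```

The bandwidth helper avoids a floating-point pitfall: evaluating `343 ** (1/3)` directly can return a value marginally below 7, so truncating it would give 6 instead of 7. For the sample sizes in this paper the rule gives $m_j = 6$ for $T_j = 321$ and $m_j = 7$ for $T_j = 343$.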
COVID-19 and emerging markets: An epidemiological model with international production networks and capital flows
Bridging the COVID-19 data and the epidemiological model using time varying parameter SIRD model
The great lockdown: dissecting the economic effects
Quantifying the transmission potential of pandemic influenza
Mean group estimation in presence of weakly cross-correlated estimators
Voluntary and mandatory social distancing: Evidence on COVID-19 exposure rates from Chinese provinces and selected countries
Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation
The macroeconomics of epidemics
Human mobility restrictions and the spread of the novel coronavirus (2019-nCoV) in China
Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Imperial College London COVID-19 Reports
Estimating and simulating a SIRD model of COVID-19 for many countries, states, and cities
Measuring underreporting and underascertainment in infectious disease datasets: a comparison of methods
Mandated and voluntary social distancing during the covid-19 epidemic: A review
Variation in government responses to COVID-19
Lock-downs, loneliness and life satisfaction
Exact analytical solutions of the Susceptible-Infected-Recovered (SIR) epidemic model and of the SIR model with equal death and birth rates
Seroprevalence of antibodies to SARS-CoV-2 in 10 sites in the United States
Perspectives on the basic reproductive ratio
Correcting under-reported COVID-19 case numbers: estimating the true scale of the pandemic
Accounting for global COVID-19 diffusion patterns
A contribution to the mathematical theory of epidemics
Identification and estimation of the SEIRD epidemic model for COVID-19
Early dynamics of transmission and control of COVID-19: A mathematical modelling study. The Lancet Infectious Diseases
Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)
Evaluating the effectiveness of social distancing interventions to delay or flatten the epidemic curve of coronavirus disease
The epidemic in a closed population with all susceptibles equally vulnerable; some results for large susceptible populations and small initial infections
Public health responses to COVID-19 outbreaks on cruise ships - worldwide
Comparison of methods to estimate basic reproduction number (R0) of influenza, using Canada 2009 and 2017-18 A (H1N1) data