Title: The sooner the better: lives saved by the lockdown during the COVID-19 outbreak. The case of Italy
Authors: Cerqueti, Roy; Coppier, Raffaella; Girardi, Alessandro; Ventura, Marco
Date: 2021-01-28

This paper estimates the effects of non-pharmaceutical interventions (mainly, the lockdown) on the COVID-19 mortality rate for the case of Italy, the first Western country to impose a national shelter-in-place order. We use a new estimator, the Augmented Synthetic Control Method (ASCM), that overcomes some limits of the standard Synthetic Control Method (SCM). The results are twofold. From a methodological point of view, the ASCM outperforms the SCM in that the latter cannot select a valid donor set, assigning all the weight to only one country (Spain) while placing zero weight on all the remaining ones. From an empirical point of view, we find strong evidence of the effectiveness of non-pharmaceutical interventions in avoiding losses of human lives in Italy: conservative estimates indicate that, for each human life actually lost, on average another 1.15 would have been lost in the absence of the lockdown; in total, the policy saved 20,400 human lives.

Exponentially growing threats require a strong and early policy response. However, policy makers reasonably hesitate to take resolute measures when threats appear to be limited. This caution is reasonable, because if countermeasures work, it will seem in retrospect as if the policy response was an overreaction, possibly causing a loss of consensus (Pisano et al., 2020; The Economics, 2020). Once the curve is flattened, the public will likely blame the incumbent government for the tremendous economic losses caused by social distancing orders without fully grasping their essential role in halting the spread of the viral disease.
This paper adds to this debate by estimating the effects of lockdown (or, more generally, of the so-called non-pharmaceutical interventions) on the propagation of COVID-19, with particular emphasis on the most relevant aspect: saving human lives. Its real-world relevance and socio-economic implications need no further explanation. The task is particularly challenging from a methodological perspective due to typical selection bias problems, and this explains the growing interest of econometricians in this research question. At first instance, a natural candidate to face this challenge is the SCM, first introduced by Abadie and Gardeazabal (2003) and developed in the subsequent studies by Abadie et al. (2010, 2015). 1 At its very essence, the SCM involves the comparison of outcome variables between the treated unit, i.e., the unit affected by the intervention, and similar but unaffected units, reproducing an accurate counterfactual of the unit of interest in the absence of intervention, commonly referred to as the synthetic unit. This method can be conceived as a data-driven procedure to retrieve non-treated units sharing similar characteristics with the treated one in the pre-intervention period.

1 Since its introduction, the SCM has been widely used in social sciences and applied to a broad spectrum of topics, spanning from terrorism and crime to natural resources and disasters, political and economic reforms, immigration, education, pregnancy and parental leave, taxation, as well as social connections and local development. Athey and Imbens (2017) have defined it as the most crucial innovation in the policy evaluation literature over the last fifteen years. For a recent survey see Abadie (2020).
In the COVID-19 context, the treatment consists of non-pharmaceutical interventions implemented at a given point in time in a specific country or, in general, in a region, so that the SCM lends itself as one of the most immediate tools to face the problem empirically. For this reason, it has been (more or less successfully) repeatedly applied in the very recent literature on COVID-19. Despite its popularity, however, some real-world circumstances can make this instrument inapplicable, under penalty of biased estimates. This may happen for many reasons, such as the use of heterogeneous methodologies to measure the same phenomenon in different countries. The case under scrutiny falls exactly within these circumstances. To overcome this problem, we have followed a general approach proposed by Ben-Michael et al. (2020), augmenting the SCM with the Ridge regression model, therefore obtaining a Ridge ASCM, which can be written as a weighted average of the control unit outcomes. To the best of our knowledge, the Ridge ASCM has not yet been applied to assess the impact of non-pharmaceutical interventions for the COVID-19 pandemic. Only a couple of applications can be found, both in the environmental field (notably, forest fires in Colombia, Amador-Jimenez et al., 2020, and air pollution in China, Cole et al., 2020). Our contribution to the literature on the socio-economic effects related to the COVID-19 pandemic is manifold. First, it sheds light on a controversial policy intervention that has been and still is largely debated. The intervention's effectiveness is evaluated in terms of the most immediate and desired effect, namely avoided deaths.
Secondly, from a strictly methodological point of view, it uses one of the most recent advances of the popular SCM, i.e., the Ridge ASCM, which allows one to overcome the non-negligible limit of non-perfect pre-treatment fit, which, in turn, would generate a biased estimate from the canonical estimator. Thirdly, the study focuses on Italy, which is widely acknowledged as a paradigmatic case, owing to its pioneering role, immediately after China, in facing the current pandemic disease. The first Italian COVID-19 cases were registered quite early (January 2020 or even before) and contagion has accelerated since its inception. In March 2020, Italy was the country with the highest number of cases apart from China, rapidly becoming the European epicentre of the outbreak, with 207,428 confirmed cases and 28,236 deaths as of the beginning of May 2020 (Ministry of Health). These figures represented approximately 14% of all confirmed cases and 20% of deaths in Europe, and 6% of confirmed cases and just over 11% of deaths worldwide. Moreover, and even more interestingly from our standpoint, Italy was the first Western country in which the government imposed restrictions on mobility, economic activities and social interactions: the already mentioned strict lockdown. The lockdown order was officially in force in Italy from March 9 to May 18, 2020, for a total of 70 days. The intervention was highly criticized at that time, especially by some media and politicians, possibly because it was the first among Western countries and the importance of acting promptly was not yet completely clear. Nonetheless, many other European countries followed the Italian model within a few weeks, including the UK, which initially claimed to be against such a type of intervention.
A long list of scientific contributions witnesses the relevance of the Italian case for understanding the COVID-19 spread and, consequently, the effectiveness of non-pharmaceutical interventions (see, e.g., Bonacini et al., 2020; Cameletti, 2020; Eckardt et al., 2020; Lolli et al., 2020; Palladino et al., 2020; Peracchi and Terlizzese, 2020). In particular, the works by Modi et al. (2020) and Cerqua et al. (2020) deserve special mention because both use the standard SCM to assess the plausibility of official figures, concluding that the true count of deaths due to COVID-19 is likely to be significantly higher than the official count. Given the very high relevance of the issue at stake, we contribute to the debate over the effects of lockdown measures by using the Ridge ASCM and exploiting its methodological advances. To the best of our knowledge, this is the first paper that applies the Ridge ASCM to Italy's paradigmatic case and evaluates the effects of a lockdown in terms of public health. Operatively, the task mentioned above is particularly challenging for at least a couple of other reasons. First, all the other European countries (the most similar to the one under scrutiny) were sooner or later treated. Secondly, the virus spread followed different paths among countries at different points in time. Both features make constructing a credible counterfactual series particularly difficult, and this pitfall is exacerbated by the limits the standard SCM suffers from. Our results show that the SCM cannot generate a valid counterfactual, collapsing all the weights on one single unit, i.e., Spain, so that the lockdown effect would be calculated as a naive difference between the two countries. Allowing for negative weights, we provide evidence about the effectiveness of non-pharmaceutical interventions in saving human lives and avoiding a collapse of the Italian health care system. The take-home message from this study is twofold.
From a methodological point of view, it suggests that the researcher must compare the weights before choosing whether to apply the SCM or the ASCM. A substantial distance between the two sets of weights may be an indirect indicator of bias in the SCM estimates. The remainder of the paper is organized as follows. Section 2 introduces the Ridge ASCM as an extension of the canonical SCM and describes the dataset used in the estimation exercise. Section 3 presents the empirical setup, paying particular attention to how the benchmark specification is chosen, and presents the estimates, which are then tested in Section 4. Finally, Section 5 proposes some back-of-the-envelope computations based on the estimates and draws some conclusions.

This section is devoted to the illustration of the Ridge ASCM procedure. To be self-contained and provide a better understanding of this econometric method, we begin with a brief description of the canonical SCM. In its very essence, the SCM aims to simulate the outcome path a country would have followed had it not undergone a particular policy intervention. Operatively, the synthetic control is built as a weighted average of the units in the control group (donor pool), where the weights are chosen so that the synthetic control's outcome closely matches the treated unit's trajectory in the pre-treatment period, while also satisfying some constraints such as being non-negative and adding up to one. More formally, let $Y_{it}(0)$ and $Y_{it}(1)$ represent the potential outcomes for unit $i$, with $i = 1, \dots, N$, at time $t$ under control and treatment, respectively. We also assume that control potential outcomes are generated as a fixed component $m_{it}$ plus a mean-zero additive noise $\varepsilon_{it}$ drawn from some distribution,

$$Y_{it}(0) = m_{it} + \varepsilon_{it}.$$

The treated potential outcome is then $Y_{it}(1) = Y_{it}(0) + \tau_{it}$, where the $\tau_{it}$ represent the treatment effects, which are the objects of our estimation and are fixed parameters. Therefore, the treatment effects can be rewritten as

$$\tau_{it} = Y_{it}(1) - Y_{it}(0).$$

The error terms in the post-treatment period are collected in the vector $\varepsilon_T = (\varepsilon_{1T}, \dots, \varepsilon_{NT})$ and are assumed to be mean-zero and uncorrelated with treatment assignment. That is, denoting by $W_i$ the treatment indicator of unit $i$,

$$E_{\varepsilon_T}[W_i \varepsilon_{iT}] = E_{\varepsilon_T}[(1 - W_i)\, \varepsilon_{iT}] = 0,$$

where $E_{\varepsilon_T}$ denotes the expectation taken with respect to the error term $\varepsilon_T$. It follows that the noise terms for the treated and control units do not systematically deviate from each other. Let $X_{it}$ represent the pre-treatment outcomes, which are used as predictors along with other covariates; $X_0$ represents the $N_0 \times T_0$ matrix of control units' pre-treatment outcomes and covariates, and $Y_{0T}$ is the $N_0$-vector of control unit outcomes in period $T$. With only one treated unit, $Y_{1T}$ is a scalar, and $X_1$ is a $T_0$-row vector of treated unit pre-treatment outcomes and/or covariates. The potential outcome for the treated unit, $Y_{1T}(0)$, is computed by the SCM as a weighted average of the control outcomes, $Y_{0T}'\gamma$, with $\gamma = (\gamma_1, \dots, \gamma_N)$ the vector of weights. The elements of $\gamma$ are chosen to balance pre-treatment outcomes and other covariates. For our purposes, the SCM can be formalized as the solution with respect to $\gamma$ of the following constrained optimization problem:

$$\min_{\gamma} \; \left\| V_x^{1/2} (X_1 - X_0'\gamma) \right\|_2^2 + \zeta \sum_{W_i = 0} f(\gamma_i) \quad \text{subject to} \quad \sum_{W_i = 0} \gamma_i = 1, \;\; \gamma_i \geq 0, \qquad (4)$$

where the constraints limit $\gamma$ to the unit simplex and where $\| V_x^{1/2} (X_1 - X_0'\gamma) \|_2^2 = (X_1 - X_0'\gamma)' V_x (X_1 - X_0'\gamma)$, with $V_x$ a symmetric importance matrix and $f$ a dispersion function. The simplex constraint in (4) ensures that the weights will be sparse and non-negative, while the hyperparameter $\zeta > 0$ penalizes the dispersion of the weights, following a suggestion by Abadie et al. (2015). The optimization problem in (4) can be regarded as an approximate balancing weights estimator. When these weights achieve perfect pre-treatment fit, the resulting estimator has many attractive properties, including a bias bound derived by Abadie et al. The Ridge ASCM proposes modifying the problem in (4) as follows:

$$\hat{Y}^{aug}_{1T}(0) = \sum_{W_i = 0} \hat{\gamma}^{scm}_i Y_{iT} + \Big( \hat{m}_{1T} - \sum_{W_i = 0} \hat{\gamma}^{scm}_i \hat{m}_{iT} \Big) \qquad (5)$$

$$\phantom{\hat{Y}^{aug}_{1T}(0)} = \hat{m}_{1T} + \sum_{W_i = 0} \hat{\gamma}^{scm}_i \big( Y_{iT} - \hat{m}_{iT} \big), \qquad (6)$$

where $\hat{\gamma}^{scm}_i$ is the estimated $i$-th SCM weight and $\hat{m}_{iT}$ is an estimate of the outcome model $m_{iT}$. In this context, the canonical SCM is a special case in which $\hat{m}_{iT}$ is constant. Albeit fully equivalent, equations (5) and (6) offer two readings of the same estimator: (5) corrects the SCM estimate by the imbalance in the estimated outcome model, while (6) adds the SCM-weighted residuals of the donors to the model prediction for the treated unit. We focus on the case in which the outcome model is estimated by a Ridge regression. This case is referred to as the Ridge ASCM.
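As a rough illustration of problem (4) and of the correction in (5) and (6), the following Python sketch computes simplex-constrained weights by projected gradient descent and then applies the augmentation. It is a minimal sketch under stated assumptions, not the authors' implementation: the objective is the plain squared pre-treatment imbalance (no importance weighting), the dispersion penalty is switched off ($\zeta = 0$), the function names (`project_to_simplex`, `scm_weights`, `ascm_eq5`, `ascm_eq6`) are hypothetical, and all numbers are toy values.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {g : g_i >= 0, sum_i g_i = 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running += u_k
        shift = (running - 1.0) / k
        if u_k - shift > 0:          # holds for the leading entries only
            theta = shift
    return [max(x - theta, 0.0) for x in v]


def scm_weights(X0, x1, steps=2000):
    """Minimise ||x1 - X0' g||^2 over the unit simplex by projected gradient.

    X0: list of donor rows (pre-treatment values); x1: treated row.
    Assumption: identity importance matrix and no dispersion penalty.
    """
    n0, t0 = len(X0), len(x1)
    # Step size 1 / (2 * ||X0||_F^2) never exceeds 1 / Lipschitz constant.
    lr = 1.0 / (2.0 * sum(x * x for row in X0 for x in row))
    g = [1.0 / n0] * n0              # start from uniform weights
    for _ in range(steps):
        resid = [sum(g[i] * X0[i][t] for i in range(n0)) - x1[t] for t in range(t0)]
        grad = [2.0 * sum(resid[t] * X0[i][t] for t in range(t0)) for i in range(n0)]
        g = project_to_simplex([g[i] - lr * grad[i] for i in range(n0)])
    return g


def ascm_eq5(g, y0, m0_hat, m1_hat):
    """Eq. (5): SCM estimate plus the imbalance in the outcome model."""
    scm = sum(gi * yi for gi, yi in zip(g, y0))
    return scm + (m1_hat - sum(gi * mi for gi, mi in zip(g, m0_hat)))


def ascm_eq6(g, y0, m0_hat, m1_hat):
    """Eq. (6): model prediction plus SCM-weighted donor residuals."""
    return m1_hat + sum(gi * (yi - mi) for gi, yi, mi in zip(g, y0, m0_hat))


donors = [[1.0, 0.0], [0.0, 1.0]]             # toy pre-treatment outcomes
inside = scm_weights(donors, [0.3, 0.7])      # treated unit inside the convex hull
outside = scm_weights(donors, [2.0, -1.0])    # treated unit outside the convex hull
y0, m0_hat, m1_hat = [10.0, 20.0], [9.0, 21.0], 15.0   # toy post-treatment values
print(inside, outside)
print(ascm_eq5(inside, y0, m0_hat, m1_hat), ascm_eq6(inside, y0, m0_hat, m1_hat))
```

In the second call the treated unit lies outside the convex hull of the donors, so all the weight collapses on a single donor, which is the kind of corner solution the simplex constraint can produce. With a constant outcome model (the same value for `m1_hat` and every entry of `m0_hat`), the correction term vanishes and (5) reduces to the SCM estimate.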
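In the Ridge ASCM, the estimator of the post-treatment outcome model is $\hat{m}(X_i) = \hat{\eta}^{ridge}_0 + X_i'\hat{\eta}^{ridge}$, where $\hat{\eta}^{ridge}_0$ and $\hat{\eta}^{ridge}$ are the coefficients of a Ridge regression of control post-treatment outcomes $Y_{0T}$ on centered pre-treatment outcomes $X_0$ with penalty hyperparameter $\lambda^{ridge}$ (sums over $W_i = 0$ run over the control units):

$$\{\hat{\eta}^{ridge}_0, \hat{\eta}^{ridge}\} = \arg\min_{\eta_0, \eta} \sum_{W_i = 0} \big( Y_{iT} - (\eta_0 + X_i'\eta) \big)^2 + \lambda^{ridge} \|\eta\|_2^2.$$

The Ridge ASCM estimator is then:

$$\hat{Y}^{aug}_{1T}(0) = \sum_{W_i = 0} \hat{\gamma}^{scm}_i Y_{iT} + \Big( X_1 - \sum_{W_i = 0} \hat{\gamma}^{scm}_i X_i \Big) \cdot \hat{\eta}^{ridge}.$$

This estimator can itself be written as a weighted average of the control unit outcomes, with weights $\hat{\gamma}^{aug}$ equal to the SCM weights plus an adjustment term that depends on the pre-treatment imbalance. When $\lambda^{ridge}$ is large, the adjustment term will be small and $\hat{\gamma}^{aug}$ will remain close to the SCM weights: Ridge ASCM and SCM weights will be equivalent, and the estimation error will only be due to the variance of the weights and post-treatment noise. It follows that $\lambda^{ridge}$ plays a crucial role and its value must be derived optimally. Operatively, one possibility is to follow the in-time placebo check proposed by Abadie et al. (2015). Let $\hat{Y}^{(-t)}_{1k} = \sum_{W_i = 0} \hat{\gamma}^{aug}_{i(-t)} Y_{ik}$ be the estimate of $Y_{1k}$ obtained when time period $t$ is excluded from the fit; comparing $Y_{1t}$ with $\hat{Y}^{(-t)}_{1t}$ for some $t \leq T_0$ serves as a placebo check. We can extend this idea to compute the leave-one-out cross-validation Mean Squared Error (MSE) over time periods:

$$CV(\lambda) = \frac{1}{T_0} \sum_{t=1}^{T_0} \big( Y_{1t} - \hat{Y}^{(-t)}_{1t} \big)^2, \qquad (9)$$

and the cross-validation procedure chooses either the $\lambda$ that minimizes (9) or the largest $\lambda$ within one standard error of the minimum.

As mentioned above, the goal of the ASCM procedure is to evaluate the impact of a lockdown on the most immediate and desired effect, namely the number of avoided deaths. Specifically, the outcome variable $Y$ is the mortality rate, which is defined as the cumulative death counts per million population (dth) taken from the Epidemic Intelligence team of the ECDC (European Centre for Disease Prevention and Control). Since daily reported figures for deaths tend to be challenging to compare and qualify across countries due to possible confounding idiosyncratic socioeconomic differences related to health care systems and population ageing, we also consider several covariates, $X$, that are expected to be linked to the outcome variable. Accordingly, among the predictors, we include variables capturing the COVID-19 dynamics, such as cumulative cases per million population (num), which are intuitive predictors of mortality rates. The second group of predictors includes variables capturing the "resilience" of each country's health system.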
This subset includes the number of hospital beds per hundred thousand population (hsp), under the assumption that the more developed the health system, the less fatal the COVID-19 infection will be. As for the outcome variable, the source for num and hsp is the Epidemic Intelligence team of the ECDC. Following Sá (2020) and Rocklöv and Sjödin (2020), among others, we also include socioeconomic characteristics that are likely to be (positively) related to mortality rates; accordingly, the median age (age), as well as the average household size (hld), are added to the set of covariates. All demographic variables are taken from the United Nations report (United Nations, 2019). We also control for "mobility trends" across different categories of places and behavior changes derived from Google Mobility Reports, which collect percentage changes in visits and length of stay at different places relative to a baseline given by the median values of the same day of the week from January 3, 2020, to February 6, 2020. Following Chernozhukov et al. (2021), we focus on four out of six mobility sub-indices (namely, "Grocery and Pharmacy", "Transit Stations", "Retail and Recreation" and "Workplaces"). "Parks" and "Residential" are dropped because the former does not have clear implications for the spread of COVID-19, while the latter shows an overly high correlation with "Workplaces" and "Retail and Recreation". We distil the information content conveyed by the mobility indicators into a synthetic index (mob) by following a "non-model based" aggregation scheme, as discussed in Marcellino (2006). The subsequent logical step is identifying the donor states to form the synthetic control unit. When constructing a reliable counterfactual, it is well understood that the relationship between the predictors and the outcome variable in the donor pool must be as similar as possible to the relationship in the treated unit.
Accordingly, the selection of the donor pool's candidate elements should be carried out by identifying countries sharing some key similarities with the treated one. In the present context, geographical proximity is a crucial factor to be considered, as the spread of the pandemic has not been homogeneous across space and over time, moving from Asia in late 2019 to Europe at the beginning of 2020 and, subsequently, to the Americas. Given our focus on the Italian case, an obvious choice is to select the donor pool's elements among European countries. Accordingly, we have included all members belonging to the European Union (except Luxembourg) plus Switzerland, Norway, and the United Kingdom (28 countries in total). Since the daily evolution of the mortality rate at the individual country level reflects different diffusion patterns at a given point in time, we have normalized the time unit such that "day 1" refers to the day on which cumulative infection cases per million exceed one in the treated country, as in Cho (2020). In our case, "day 1" corresponds to February 23, 2020, with the lockdown policy enacted on March 9. Therefore, in our setup, the pre-treatment period consists of 15 daily observations. Because the treated unit is contrasted with the control unit after treatment, the relevant policy under scrutiny (the impact of non-pharmaceutical interventions in our context) should not be enacted in any donor pool state during the study. Accordingly, our sample's ending date is given by the date when lockdown measures have taken place in the synthetic counterfactual. To identify such an average date, we have used an ad hoc index elaborated by the Oxford COVID-19 Government Response Tracker, namely the Stringency Index, SI, which collects standardized daily information on several different common government responses for a large number of countries.
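The time normalisation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names (`first_day_above`, `event_day`) and the toy series are hypothetical, and only the two calendar dates (February 23 and March 9, 2020) and the threshold of one case per million come from the text.

```python
from datetime import date


def first_day_above(series, threshold=1.0):
    """Return the first date on which the cumulative series exceeds the threshold.

    series: list of (date, cumulative cases per million), sorted by date.
    """
    for day, value in series:
        if value > threshold:
            return day
    return None


def event_day(calendar_day, day_one):
    """Map a calendar date to event time, with day_one labelled as day 1."""
    return (calendar_day - day_one).days + 1


# Hypothetical cumulative cases per million for the treated country.
toy_series = [
    (date(2020, 2, 21), 0.3),
    (date(2020, 2, 22), 0.9),
    (date(2020, 2, 23), 1.6),
    (date(2020, 2, 24), 2.8),
]
day_one = first_day_above(toy_series)
lockdown_day = event_day(date(2020, 3, 9), day_one)
pre_treatment_days = lockdown_day - 1      # days strictly before the lockdown
print(day_one, lockdown_day, pre_treatment_days)
```

With day 1 on February 23, the lockdown date of March 9 falls on day 16, which leaves days 1 to 15 as the pre-treatment window, in line with the 15 daily observations mentioned above.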
More specifically, we have followed Cho (2020) and defined the ending sample as the date on which the SI peaked in each donor country. Therefore, our sample's last observation occurs on "day 49", corresponding to April 12, 2020, so our post-treatment period consists of 34 data points. As a preliminary step, we compute the Average Treatment Effect on the Treated (ATT), that is, the average deviation of the counterfactual series from the actual one over the treatment period (from March 9, 2020, to April 11, 2020), for each specification (a0)-(c5), in order to identify the specifications with a negative and statistically significant gap, in a way which is consistent with our priors. Since there is more than one possible specification that satisfies the conditions above, we follow the recommendation by Ferman et al. (2020) of presenting results for many different specifications, and in particular, we include the specification (a0) as a benchmark. The upper part of Table 1 presents an overview of all the ASCM specifications that we consider in the analysis, while the last row reports the p-value associated with the ATT computed for the corresponding specification. 9 The results show that all of the ATTs are negative and statistically significant at the 5 percent level (or better), calling for a criterion to combine the test statistics for the individual specifications and distil them into a summary test statistic (Imbens and Rubin, 2015). Specifically, we assume that the test function is simply a weighted average of the test statistics for individual specifications. The same equally-weighted scheme is applied to combine each specification into a synthetic statistic (Christensen and Miguel, 2018; Cohen-Cole et al., 2009).
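The ATT and the equally-weighted combination across specifications amount to simple averages, as the following Python sketch shows. It is only an illustration: the gap is taken as actual minus synthetic (so that a negative value means fewer deaths than in the counterfactual), the function names are hypothetical, and the three-day series are invented numbers rather than the paper's data.

```python
def att(actual, synthetic):
    """Average gap (actual minus synthetic) over the post-treatment period."""
    gaps = [a - s for a, s in zip(actual, synthetic)]
    return sum(gaps) / len(gaps)


def equal_weight_path(gap_paths):
    """Average the gap paths of several specifications with equal weights."""
    paths = list(gap_paths.values())
    return [sum(day_values) / len(paths) for day_values in zip(*paths)]


# Hypothetical post-treatment series (deaths per million), three days only.
actual = [10.0, 12.0, 15.0]
synthetic = {"a0": [11.0, 15.0, 20.0], "c4": [12.0, 16.0, 23.0]}

gaps = {spec: [a - s for a, s in zip(actual, path)] for spec, path in synthetic.items()}
atts = {spec: att(actual, path) for spec, path in synthetic.items()}
summary_path = equal_weight_path(gaps)          # day-by-day average gap
summary_att = sum(atts.values()) / len(atts)    # equally weighted ATT
print(atts, summary_path, summary_att)
```

Because the weights are equal, averaging the specification-level ATTs gives the same number as taking the mean of the averaged gap path.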
In this vein, Figure 1 shows the treatment effects, defined as the differences between the mortality rate in Italy and the synthetic control over the evaluation period, averaged across all specifications (continuous black line), as well as the benchmark specification (a0) (dashed line). As expected, the mean value of the treatment effects suggests a strongly negative impact a few days after the intervention date. Moreover, there is preliminary evidence of statistical significance for these deviations from the actual path in the long run, according to the confidence region computed as ±2.5 times the (median) absolute deviation from the median, as suggested by Leys et al. (2013). While this finding gives indirect support to the effectiveness of non-pharmaceutical interventions in reducing the mortality rate, there is still a need for a criterion to select a given specification from a set of possible alternatives. It is well known that when covariates are expected to be useless in explaining the outcome, the recommended specification should use all pre-treatment outcome lags, i.e., specification (a0) (Kaul et al., 2018). However, if the control unit should also match several socio-economically relevant covariates, attention should be paid to the specifications allowing external predictors. Since the b's are a particular case of their corresponding c's variant, we can safely restrict the focus to the latter group. A logical criterion to discriminate among the five remaining specifications with time-invariant and time-varying external predictors is given by each specification's lag structure.

9 In all the specifications, we select the hyperparameter λ ridge as the largest λ within one standard error of the λ that minimizes the cross-validation placebo fit CV(λ), as discussed in Section 2.1 above. The results obtained under the alternative rule of picking the minimal λ are almost identical to those reported in the main text.
Since the pandemic dynamics are likely to affect the mortality rate with a temporal lag, specification (c3) is not the most desirable choice. Likewise, a lag structure based on either even or odd pre-treatment dates applied previously in SCM literature (see, for instance, Eren and Ozbeklik, 2016) , is hard to rationalize in our context, suggesting that (c1) and (c2) are second-best options to alternative lag structures. This argumentation leads us to focus on the two remaining models: (c4) and (c5). In what follows, we pick (c4) as our preferred specification, while specification (c5) is used to check the robustness of our empirical findings. Accordingly, in our baseline model, the set of predictors includes time-invariant covariates, as well as the first half of the pre-treatment outcome values and time-varying covariates, averaged over the first half of the sample values, in a way similar to the empirical framework of reference in Cavallo et al. (2013) . As Figure 2 shows, the gaps for our baseline specification closely resemble not only the one for specification (c5) but also the two summary statistics reported in Figure 1 : all in all, the temporal evolution of the treatment suggests a sort of delayed effect which becomes progressively negative as the days pass by. Though suggestive, the visual evidence presented above is insufficient to ensure proper implementation of the ASCM. Its practical use calls for the fulfillment of three conditions. As for the first requirement, only the treated unit is affected by the policy change assessed over the post-treatment period (I); secondly, the counterfactual outcome can be approximated by a fixed combination of donor states (II); finally, the policy change has no effect before it is implemented (III). 
The procedures to align country-specific variables to a common starting date, as well as the definition of a general rule to identify the (average) treatment date for the donor set (i.e., the last observation of our sample), were discussed in Section 2. Turning to point (III), the synthetic outcome is expected to closely match the treated outcome's temporal profile during the pre-treatment period. Thus, as a preliminary step, Table 2 reports the average values over the pre-treatment period of the predictors for Italy ("actual") and the counterfactual control ("synth"), where the latter is constructed with the ASCM weights assigned to the elements of the donor pool as detailed in Figure 3. Overall, the synthetic control unit provides a much better-matched profile of Italy along the predictors compared to the simple average of all countries in the donor pool (donor), suggesting that the ASCM-based selection of weights is more appropriate for building a control unit than choosing the weights subjectively by means of, for instance, the simple average across all the donor units. Such a requirement is necessary to ensure that the comparison of the outcome paths during the post-treatment period provides insight into the effect of the treatment: when the dynamics of the synthetic control and the treated entity tend to diverge, then the treatment presumably caused the difference; in contrast, if both paths display similarities in the treatment period, the treatment does not appear to have affected the outcome. While the cumulative mortality rate in the synthetic control unit closely overlaps the actual series over the pre-treatment period, there is a visible divergence a few days after the policy intervention date (vertical line in the lower panel), when the synthetic unit starts following a much steeper path than the actual counterpart series.
The resulting gap, defined as the difference between the actual series and its synthetic control, turns out to be negative and statistically significant, with an ATT of -132.9 and a p-value of 0.000. Confidence intervals are reported in Figure 4. Under the "placebo in-space" test, the ASCM is sequentially applied to each country in the donor pool as though it were a treated state, using the remaining members of the pool as before. The resulting placebo unit is thus compared with its synthetic counterpart. Comparing the difference between the treated unit and its synthetic control to the differences among placebo countries and their controls makes it possible to better evaluate the effectiveness of the policy intervention on the treated unit. As the Root Mean Square Prediction Error (RMSPE) measures the gap between the variable of interest for the treated country and its synthetic counterpart, it is possible to calculate a set of RMSPE values for the pre- and post-treatment periods for each unit considered in the analysis. Consequently, the RMSPE of the treated country after the treatment is expected to be large relative to its value before treatment. On the other hand, placebo units should not see a substantial increase in their RMSPE following the treatment. For this reason, Table 3 reports the post/pre-treatment RMSPE ratio of each donor country divided by the same quantity computed for the treated country, Italy. Whenever the entry in the table is less than 1, it indicates a relatively higher difficulty in forecasting future outcome values for Italy. The share of RMSPE ratios above one is then used to obtain a p-value for Italy, which measures the probability of observing a ratio as high as the one obtained for Italy if one were to pick a country at random from the potential controls.
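The in-space placebo ranking can be sketched as follows in Python. This is a hedged illustration rather than the authors' code: the ratio is taken as post-treatment over pre-treatment RMSPE (as in the note to Table 3), the function names are hypothetical, and the ratios are invented so that exactly one of 28 placebo donors exceeds the treated unit.

```python
import math


def rmspe(actual, synthetic):
    """Root mean square prediction error between two equally long series."""
    return math.sqrt(sum((a - s) ** 2 for a, s in zip(actual, synthetic)) / len(actual))


def post_pre_ratio(actual, synthetic, t0):
    """Post-treatment RMSPE divided by pre-treatment RMSPE (t0 pre-treatment points)."""
    return rmspe(actual[t0:], synthetic[t0:]) / rmspe(actual[:t0], synthetic[:t0])


def placebo_p_value(treated_ratio, donor_ratios):
    """Rank-based p-value: share of units with a ratio at least as large as the treated one."""
    relative = [r / treated_ratio for r in donor_ratios]   # entries relative to the treated unit
    rank = 1 + sum(1 for r in relative if r >= 1.0)        # the treated unit counts itself
    return rank / (len(donor_ratios) + 1), relative


# Hypothetical ratios: 28 placebo donors, only one exceeding the treated unit.
treated_ratio = 40.0
donor_ratios = [44.0] + [float(k) for k in range(1, 28)]
p_value, relative = placebo_p_value(treated_ratio, donor_ratios)
print(round(p_value, 3))
```

With one donor above the treated unit, the treated unit ranks second out of 29 units and the p-value is 2/29. A pre-treatment RMSPE of exactly zero would make the ratio undefined, so a real implementation would need to guard against that case.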
Overall, the RMSPE ratios turn out to be well below the unit threshold, suggesting that the actual path of the treated unit tends to diverge from the synthetic control after the intervention in a much more substantial way than is the case for the countries belonging to the donor pool. The resulting p-value is 2/29 = 0.069, as Italy ranks second out of 29 countries, which falls within the conventional range of statistical significance used in the relevant literature. A remarkable exception is Belgium's case, with an RMSPE ratio slightly above one; nonetheless, the associated ATT has an opposite sign to the expected one, in a way similar to what emerges for the other eight countries. There is no evidence of a statistically significant ATT for three entities of the donor pool, while for the remaining cases, the estimated ATT ranges from -0.7 (for Croatia) to -34.5 (for Greece and Hungary). As a further sensitivity test, we run the "in-time placebo" test, in which the donor pool remains fixed and the treated unit is always Italy, but the treatment date is re-assigned to occur during the pre-treatment period, as devised by Abadie et al. (2015). Moreover, this placebo model's sample period must end when the actual treatment occurred (day 15, in our context) to avoid capturing its effects. Operatively, the in-time placebo test is conducted under the assumption that the treatment occurred on day 8, roughly in the middle of our pre-treatment period. Apart from the lockdown date, we apply exactly the same setting as in the baseline setup, using the same predictor variables, including lagged outcome values for the first half of the pre-treatment period.
As Figure 5 shows, our synthetic Italy for a placebo treatment on day 8 closely follows the path of actual Italy, not only during the first half of the baseline pre-treatment period but also in the second part of the sample, with an estimated gap barely different from zero (continuous black line), also according to the 95 percent confidence region. Similar results are obtained when the fictitious treatment date is assigned to days 10 and 12 (corresponding to two-thirds and three-fourths of the baseline pre-treatment period, respectively). According to the dotted and dashed lines in Figure 5, significant reductions in the mortality rate for these two fake lockdown dates cannot be found over the actual pre-treatment period. Overall, the in-time placebo test assures that the placebo estimate resembles the actual pre-treatment path closely enough to give us confidence that our main findings are not due to chance, ruling out the possibility that the above-discussed difference between the synthetic and actual Italy arises for reasons other than the treatment. The third sensitivity check we consider is the leave-one-out test (Abadie et al., 2015), where the model is re-estimated leaving out one donor country at a time to assess whether any single donor unit is driving the results. Figure 6 shows all leave-one-out synthetic gaps (thin grey lines) and the mean value across all of them (dashed line). It emerges that the average gap across all permutations closely matches the baseline gap that includes all donor states in terms of ATTs (-128.2 and -132.9, respectively), giving further support to the robustness of our findings.

This paper is the first contribution that uses the ASCM to evaluate the effectiveness of non-pharmaceutical interventions against COVID-19. Evidence has been provided for Italy, the first Western country to implement shelter-in-place orders after China.
The paper shows how the ASCM helps remove bias from a naive application of the canonical SCM. Indeed, the latter estimator shrinks the donor pool to only one country, i.e., Spain, generating de facto an estimate of the effect as a bare difference in means between Italy and Spain. Constraining SCM weights to the unit simplex may be too restrictive, especially when it is hard to reproduce accurate synthetic pre-treatment dynamics. Our empirical case falls precisely in this circumstance, and the ASCM overcomes the problem by assigning negative weights to some donor units. [...] (2020) is scarcely comparable with ours; nevertheless, a more substantial effect in Italy is quite reasonable because of structural differences between the two countries in terms of (lower) endowment of hospital beds, (older) median age of the population and (larger) average household size (for Italy). As a possible extension, one could project the findings up to the last day of the policy (i.e., a further 36 days, up to May 18, 2020) and recalculate the total effect of the policy. One could also relate the findings of this paper to the economic damages caused by the lockdown measure, while from a strictly methodological point of view, a further possible extension consists in constructing a formal test of the equality in mean of the weights generated by the SCM and the ASCM. These issues are beyond the scope of the present work and are left for further research.

Table 3 (columns: RMSPE ratio, ATT, P-value). Note: The column "RMSPE ratio" reports the post/pre-treatment RMSPE of each country of the donor pool relative to the same ratio for Italy. Whenever the entry is less than one, the relative difficulty in forecasting future outcome values after intervention for Italy is higher than the one for the country considered. ATT = Average Treatment effect on the Treated.
Using synthetic controls: Feasibility, data requirements, and methodological aspects
Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program
Comparative politics and the synthetic control method
The economic costs of conflict: A case study of the Basque Country
Bias-corrected matching estimators for average treatment effects
A penalized synthetic control estimator for disaggregated data
The unintended impact of Colombia's covid-19 lockdown on forest fires
Robust synthetic control
mRSC: Multidimensional robust synthetic control
Synthetic Difference In Differences
Timing is Everything when Fighting a Pandemic: COVID-19 Mortality in Spain
Synthetic Difference In Differences
The state of applied econometrics: Causality and policy evaluation
Approximate residual balancing: debiased inference of average treatment effects in high dimensions
Predictive inference with the jackknife
Synthetic Control, synthetic Interventions, and COVID-19 spread: Exploring the impact of lockdown measures and herd immunity
The augmented synthetic control method
Identifying policy challenges of COVID-19 in hardly reliable data and judging the success of lockdown measures
The lockdown effect: A counterfactual for Sweden
On the role of covariates in the synthetic control method
Star wars: The empirics strike back
The effect of Corona virus lockdown on air pollution: Evidence from the city of Brescia in Lombardia region (Italy)
Prediction intervals for synthetic control methods
Catastrophic natural disasters and economic growth
Local mortality estimates during the COVID-19 pandemic in Italy
An exact and robust conformal inference method for counterfactual and synthetic controls
Practical and robust t-test based inference for synthetic control and related methods
Causal impact of masks, policies, behavior on early covid-19 pandemic in the
Quantifying the impact of nonpharmaceutical interventions during the COVID-19 outbreak: The case of Sweden
Transparency, reproducibility, and the credibility of economics research
Model uncertainty and the deterrent effect of capital punishment
The impact of the Wuhan Covid-19 lockdown on air pollution and health: a machine learning and augmented synthetic control approach
Pooling multiple case studies using synthetic controls: An application to minimum wage policies
Covid-19 across European regions: The role of border controls
What do right-to-work laws do? Evidence from a synthetic control method analysis
Cherry picking with synthetic controls
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe
Did California's shelter-in-place order work? Early coronavirus-related public health effects
The elements of statistical learning
The impact of response measures on COVID-19-related hospitalization and death rates in Germany and Switzerland
Causal inference for statistics, social and biomedical sciences: An introduction
Synthetic control methods: Never use all pre-intervention outcomes together with covariates
Pandemic and employment: Evidence from COVID-19 in South Korea
Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median
Impact of meteorological conditions and air pollution on COVID-19 pandemic transmission in Italy
Leading Indicators
Face masks considerably reduce COVID-19 cases in Germany: A synthetic control method approach
Total COVID-19 mortality in Italy: excess mortality and age dependence through time-series analysis
The effectiveness of school closures and other pre-lockdown COVID-19 mitigation strategies in Argentina, Italy, and South Korea. ZEW-Centre for European Economic Research Discussion Paper
Excess deaths and hospital admissions for Covid-19 due to a late implementation of the lockdown in Italy
Estimating the prevalence of the COVID-19 infection, with an application to Italy
Lessons from Italy's Response to Coronavirus
Imperfect synthetic controls: Did the Massachusetts health care reform save lives
A framework for synthetic control methods with high-dimensional, micro-level data: Evaluating a neighborhood-specific crime intervention
Estimation of regression coefficients when some regressors are not always observed
High population density catalyse the spread of COVID-19
The use of matched sampling and regression adjustment to remove bias in observational studies
Socioeconomic determinants of Covid-19 infections and mortality: Evidence from England and Wales
How to build social consensus around lockdown
The timing and effectiveness of implementing mild interventions of COVID-19 in large industrial cities
The Effects of stringent interventions for Coronavirus pandemic
Patterns and trends in household size and composition: evidence from a United Nations dataset