key: cord-0139456-prgohykk
authors: Ferrari, Davide; Stillman, Steven; Tonin, Mirco
title: Does Covid-19 Mass Testing Work? The Importance of Accounting for the Epidemic Dynamics
date: 2021-04-30
journal: nan
DOI: nan
sha: 68a3441ae3395baa68c77c1b6fdd27dc078bebea
doc_id: 139456
cord_uid: prgohykk

Mass antigen testing has been proposed as a possible cost-effective tool to contain the Covid-19 pandemic. We test the impact of a voluntary mass testing campaign implemented in the Italian region of South Tyrol on the spread of the virus in the following months. We do so by using an innovative empirical approach which embeds a semi-parametric growth model, where Covid-19 transmission dynamics are allowed to vary across regions and to be impacted by the implementation of the mass testing campaign, into a synthetic control framework which creates an appropriate control group of other Italian regions. We find that the mass testing campaign decreased the growth rate of Covid-19 by 39%, which corresponds to a reduction in the total additional cases of 18%, 30% and 56% within ten, twenty and forty days from the intervention date, respectively. Our results suggest that mass testing campaigns are useful instruments for mitigating the pandemic.

A wide variety of interventions have been used in an attempt to stop the spread of Covid-19 across the globe (e.g. lockdowns, quarantines, business closures, mobility restrictions, school closures) (Haug et al., 2020; Tian et al., 2020). Many of these are also the main candidate policies for stopping the spread of other large-scale epidemic outbreaks, such as Ebola. One particularly low-cost intervention that has been tried in a few locations is mass testing of a population in order to identify asymptomatic carriers (Holt, 2020; Atkeson et al., 2021). Theoretical work suggests that mass testing could reduce daily infections by up to 30 percent (Bosetti et al., 2020), while Pavelka et al.
(2021) evaluates the impact of a mass testing campaign undertaken in Slovakia in late 2020 and finds that it temporarily decreased the growth of Covid-19 by 70 percent. Importantly, many countries worldwide offer open public testing, indicating that frequent mass testing is a feasible intervention (Hasell et al., 2020).

In this paper, we test the impact of a voluntary mass testing campaign implemented in the Italian region of South Tyrol on the spread of the virus in the following months. We do so by using an innovative empirical approach which embeds a semi-parametric growth model into a synthetic control framework. Specifically, we first use the synthetic control approach to create a control group which is a weighted average of other Italian regions that best follows the dynamics of Covid-19 transmission in South Tyrol prior to the mass testing campaign (Abadie et al., 2010a; Abadie and Gardeazabal, 2003a). [1]

[1] We use this approach because it uses the data to derive the optimal control region for South Tyrol. Mitze et al. (2020) and Cho (2020) use this method to evaluate the impact of masks and other nonpharmaceutical interventions on the spread of Covid-19; however, these papers implement it in a traditional framework which does not account for the underlying growth dynamics of an epidemic.

We then estimate, on appropriately weighted data, a semi-parametric growth model where Covid-19 transmission dynamics (i.e. growth rates) are allowed to vary across regions and to be impacted by the implementation of the mass testing campaign. Importantly, this approach is in the spirit of difference-in-difference models, which compare changes in outcomes over time in a treated location to changes in outcomes over time in otherwise similar control locations, and hence can isolate the impact of the mass testing campaign from national level policies regarding freedom of movement, business and school closures, hygienic measures, etc.
since no other policy changed at the same time in South Tyrol.

This approach has a number of benefits over what has so far been done in the literature that attempts to estimate the impact of public health interventions on the spread of Covid-19, as well as of other contagious diseases. Specifically, previous papers use either fully parametric models (Aviv-Sharon and Aharoni, 2020; Chénangnon et al., 2020; Pelinovsky et al., 2020) or traditional synthetic control or difference-in-difference models (Hsiang et al., 2020; Tian et al., 2020; Mangrum and Niekamp, 2020; Alexander et al., 2020; Dave et al., 2021; Singh et al., 2021; Mitze et al., 2020; Cho, 2020). While fully parametric models based on biological growth theory have proved very useful for assessing the dynamics of disease outbreaks, their overly rigid a priori parametric assumptions do not allow them to fit the typical variation in short- and medium-run dynamics seen in the spread of Covid-19. This is a potential concern with the findings of Pavelka et al. (2021), especially since the testing campaign in Slovakia was a national level campaign and other policies were being implemented at the same time to reduce transmission rates. On the other hand, traditional difference-in-difference and synthetic control models, by assuming that treatment and control regions would follow parallel trends in a counterfactual reality without any intervention, are potentially biased because growth dynamics in different regions will depend on the prior contagion rate. Although there is a rich literature on estimating model-free growth curves in other fields, the use of non-parametric or semi-parametric approaches has not been sufficiently explored in the context of health policy analysis.
We show that the estimated impact of the mass testing campaign in South Tyrol is sensitive to how one models Covid-19 transmission dynamics, and that flexible models better fit the data and generate more robust estimates of the impact of the campaign. Furthermore, our approach has the added benefit that it generates estimates of the disease proliferation rate both with and without intervention, thus providing insight into the transmission pathways.

Overall, we find that the mass testing campaign in South Tyrol decreased the growth rate of Covid-19 by 39% (95% confidence interval: 29-49%). This corresponds to a reduction in the total additional cases of 14%, 18%, 30% and 56% within seven, ten, twenty and forty days from the intervention date, respectively. Importantly, this large impact was achieved even though the campaign was entirely voluntary with no incentives to participate. Our results are in line with the predictions made in Bosetti et al. (2020) based on an epidemiological model. We find a smaller impact than Pavelka et al. (2021) report for the mass testing campaign in Slovakia; however, the Slovakian intervention featured a multiple-round campaign in addition to concurrent interventions, including a one-week lockdown.

The population of South Tyrol was invited to take part in a mass testing campaign in late November 2020 using rapid antigen tests, which involve a nasal and throat swab. Authorities set up around 300 testing centers, where professional health care workers carried out the tests, with the support of volunteers from the civil protection agency, the voluntary fire services and other organizations for handling the logistics and the administration.
All residents were invited to participate, with the exception of children below the age of five, people with Covid-19 symptoms, those on sick leave, those who had tested positive and isolated in the last three months, and those who had recently tested positive or were in quarantine or self-isolating. People with a prior appointment for a PCR test, those regularly tested for work reasons, and individuals in social care were also not tested. Testing centers generally operated from 8am to 6pm from Friday, 20 November to Sunday, 22 November. During this period, people could show up at any of the centers throughout the region. In some municipalities, it was possible to register online and some published suggested centers and time slots based on the address of residence. It was also possible to be tested at some pharmacies and GPs in the period 18 to 25 November. People only needed a valid ID and a European Health Insurance card. They filled in a form with an email address, where they would receive, generally within a day, an encrypted file with the outcome, and a mobile number, where they would receive an SMS with the code to open the file. In case of a negative result, people were advised to continue following prevention measures like social distancing and mask wearing. In case of a positive result, people had to isolate for 10 days if asymptomatic and contact their doctor if they developed symptoms. Participation in the mass testing was voluntary and encouraged by a massive communication campaign, providing information (with material available also in Albanian, Arabic, English, French and Urdu, as well as in simple language for kids), as well as endorsements by public figures. The goal was to identify asymptomatic cases in the population and hence reduce virus transmission. The headline of the campaign was "Together against coronavirus", using appeals like "Let's break the infection wave together and pave the way towards a gradual return to normality!". 
In the end, 72 percent of eligible residents volunteered to be tested (Stillman and Tonin, 2021). In comparison, 83 percent of the eligible population in Slovakia decided to be tested when the alternative was to quarantine for 10 days.

In this section, we first discuss how we create the optimal control group for evaluating the mass testing campaign in South Tyrol. We then discuss how we implement the semi-parametric growth model, how it compares to parametric models and how we embed a difference-in-difference type framework in this model. Finally, we discuss statistical inference.

As there is no obvious a priori control group, in order to evaluate the effect of the screening intervention in South Tyrol, we constructed an optimal control group (Synthetic South Tyrol) using the synthetic control methodology of Abadie et al. (2010b). This method constructs a weighted combination of data from the control group to approximate the behavior of the intervention group in terms of pre-intervention characteristics. Suppose we observe new cases $Y_{it}$ in regions $i = 1, \dots, N+1$ at times $t = 1, \dots, T$ and assume the intervention occurs at time $T_0 + 1$, so that $1, \dots, T_0$ are pre-intervention periods. Without loss of generality, the first region ($i = 1$) represents South Tyrol, which is exposed to the intervention, so we have $N$ remaining donor regions contributing to the synthetic control (in our case, the donor pool consists of the other 20 Italian regions or autonomous provinces). Let $w = (w_2, \dots, w_{N+1})'$ be an $(N \times 1)$ vector of weights such that $\sum_{i=2}^{N+1} w_i = 1$ and $w_i \ge 0$ for all $i = 2, \dots, N+1$, where $w_i$ denotes the weight of region $i$ in the synthetic South Tyrol. Let $u_i = (x_{i1}, \dots, x_{iT_0}, z_{i1}, \dots, z_{iT_0})'$ be a $(2T_0 \times 1)$ vector containing pre-treatment values of the cumulative number of cases ($x_{it}$) and the number of tests ($z_{it}$) for region $i$.
Thus, the vector $u_1$ contains pre-treatment values for South Tyrol and the $(2T_0 \times N)$ matrix $U_0 = (u_2, \dots, u_{N+1})$, with columns $u_2, \dots, u_{N+1}$, contains pre-treatment values for all the remaining donor regions. The synthetic control method selects the vector of weights $w$ that determines the best control region by solving the following quadratic optimization problem:

\[ w(V) = \arg\min_{w} \; (u_1 - U_0 w)' \, V \, (u_1 - U_0 w) \quad \text{subject to } w_i \ge 0 \text{ for all } i \text{ and } \textstyle\sum_i w_i = 1, \tag{1} \]

where $V$ is a $2T_0 \times 2T_0$ symmetric and positive semidefinite matrix that allows different weights to be given to the variables in $U_0$ depending on their predictive power for the outcome. To stress dependence on $V$, we use the notation $w = w(V)$ for the solution to (1). To select $V$, we minimize the mean squared prediction error (MSPE) of the outcome variable over pre-intervention periods, as described in Abadie and Gardeazabal (2003b) and Abadie et al. (2010b). Specifically, if $Y_1$ is the $(T_0 \times 1)$ vector with the values of the outcome variable (new cases) for South Tyrol and $Y_0$ is the analogous $(T_0 \times N)$ matrix for the control units, then $V$ is selected as

\[ V^{*} = \arg\min_{V} \; \big(Y_1 - Y_0 w(V)\big)' \big(Y_1 - Y_0 w(V)\big). \]

The above procedure is implemented in the function synth of the R package Synth, which uses the Nelder-Mead and BFGS algorithms as default options and then picks the solution with the lowest MSPE (Abadie et al., 2011). We then use the weights generated by this procedure when estimating the model described in the next section.

To analyze the dynamics of the Covid-19 epidemic over time, we develop a flexible semi-parametric growth curve approach. Let $x_t$ denote the cumulative size of the detected infected population at time $t$. In classic parametric growth curve analysis, the dynamics of $x_t$ are represented by the derivative of $x_t$, which is often assumed to have the form

\[ \frac{dx_t}{dt} = \rho \, x_t^{p} \, g(x_t), \tag{2} \]

where $\rho$ is the intrinsic growth rate determining the time scale of the epidemic process, $p \in [0, 1]$ is the "deceleration of growth" parameter capturing different growth profiles and $g$ is a smooth non-increasing function of $x_t$, possibly depending on other model parameters.
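As a minimal numerical sketch of this growth law, assuming Equation (2) takes the generalized-growth form dx_t/dt = ρ x_t^p g(x_t), the Python snippet below integrates the equation with a simple forward Euler scheme. The function name `simulate_growth`, the step size and all parameter values are illustrative assumptions; they are not taken from the authors' implementation, which relies on R.

```python
from typing import Callable, List


def simulate_growth(
    x0: float,
    rho: float,
    p: float,
    g: Callable[[float], float] = lambda x: 1.0,
    days: int = 30,
    steps_per_day: int = 100,
) -> List[float]:
    """Forward-Euler integration of dx/dt = rho * x**p * g(x).

    Returns the cumulative case count at the end of each day,
    starting with the initial value x0 at day 0.
    """
    dt = 1.0 / steps_per_day
    x = float(x0)
    path = [x]
    for _ in range(days):
        for _ in range(steps_per_day):
            x += dt * rho * (x ** p) * g(x)
        path.append(x)
    return path


if __name__ == "__main__":
    # Illustrative values only (not estimates from the paper).
    for p in (0.0, 0.5, 1.0):
        final = simulate_growth(x0=10.0, rho=0.2, p=p)[-1]
        print(f"p = {p:.1f}: cumulative cases after 30 days ~ {final:.1f}")
```

Passing a different callable for `g` (for instance a decreasing function of the cumulative count) lets the same routine explore growth paths that slow down as the outbreak progresses.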
A number of well-known growth models can be recovered from Equation (2) depending on the choice of $g$; although the literature is too vast to be covered here, we provide a few examples. A basic model is the exponential growth (EG) model corresponding to $g(x_t) = 1$. The EG model assumes that an epidemic continues to grow following the same process as in the past, with the growth path completely specified by $p$: constant incidence ($p = 0$), subexponential growth ($0 < p < 1$) and exponential growth ($p = 1$); see, e.g., Wu et al. (2020). Although this may be useful for representing the early stages of an epidemic, it is an upper bound scenario, since outbreaks often slow down and reach saturation capacity after initial exponential growth. One popular extension is the generalized logistic growth (GLG) model

\[ \frac{dx_t}{dt} = \rho \, x_t^{p} \left(1 - \frac{x_t}{k}\right), \]

where $k$ represents the total size of the epidemic, i.e. the asymptotic number of infections over the whole epidemic. A slightly more flexible extension is the generalized Richards model

\[ \frac{dx_t}{dt} = \rho \, x_t^{p} \left[1 - \left(\frac{x_t}{k}\right)^{a}\right], \]

where the parameter $a$ allows for deviations from the S-shaped dynamics of the classical GLG (Chowell, 2017).

Despite the wide range of parametric growth models available, none of these rigid a priori specifications does a good job of fitting the rich dynamics observed in real Covid-19 data, which potentially leads to biased inferences. Instead, we estimate a more flexible model, referred to as a semi-parametric growth (SPG) model, that includes additional flexibility in two dimensions. We first add flexibility by estimating the function $g$ in the following semi-parametric specification:

\[ g(x_t) = \exp\{h(x_t)\}, \qquad h(x_t) = \sum_{j=1}^{q} \eta_j \, b_j(x_t), \tag{3} \]

where the $\eta_j$ are unknown coefficients and the $b_j(x_t)$ are given basis functions, such as basis spline functions. The basis expansion $h(x_t) = \sum_{j=1}^{q} \eta_j b_j(x_t)$ allows us to approximate an arbitrary smooth function, provided that $q$ is large enough.
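The basis expansion can be made concrete with a small sketch. The snippet below evaluates h(x) = Σ_j η_j b_j(x) using piecewise-linear "hat" basis functions on a grid of knots. This basis is chosen here only for simplicity and is an assumption of the example (the estimation described later in the paper uses thin-plate splines); the names `hat_basis` and `basis_expansion`, the knots and the coefficients are all hypothetical.

```python
from typing import List, Sequence


def hat_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Evaluate all piecewise-linear hat functions b_j at x.

    b_j equals 1 at knots[j], falls linearly to 0 at the neighbouring
    knots, and is 0 elsewhere. Knots must be strictly increasing.
    """
    values = [0.0] * len(knots)
    for j, kj in enumerate(knots):
        if x == kj:
            values[j] = 1.0
        elif j > 0 and knots[j - 1] < x < kj:
            values[j] = (x - knots[j - 1]) / (kj - knots[j - 1])
        elif j < len(knots) - 1 and kj < x < knots[j + 1]:
            values[j] = (knots[j + 1] - x) / (knots[j + 1] - kj)
    return values


def basis_expansion(x: float, eta: Sequence[float], knots: Sequence[float]) -> float:
    """Compute h(x) = sum_j eta[j] * b_j(x)."""
    return sum(e * b for e, b in zip(eta, hat_basis(x, knots)))


if __name__ == "__main__":
    # Hypothetical knots (cumulative cases) and coefficients.
    knots = [0.0, 100.0, 200.0, 400.0]
    eta = [0.0, -0.1, -0.4, -1.0]
    print(basis_expansion(150.0, eta, knots))
```

With this particular basis, h interpolates the coefficients linearly between knots, so adding knots (a larger q) lets h follow a smooth target function more closely. Setting every η_j to zero gives h ≡ 0.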
Overall, this specification requires fewer assumptions about the data and fits the data better in situations where the true outbreak trajectory is hard to specify in advance, as is the case for the Italian Covid-19 data.

We also allow the growth rate parameter $\rho$ to depend on other explanatory variables. Since our main inferential interest is to assess the impact of the policy treatment on pre- and post-treatment growth differences, we assume the growth rate follows a log-linear difference-in-differences type of specification:

\[ \log \rho_{it}(\theta) = \alpha + \beta \, \mathrm{Int}_t + \gamma \, \mathrm{Reg}_i + \delta \, \mathrm{Int}_t \times \mathrm{Reg}_i + \xi' z_{it}, \tag{4} \]

where $\mathrm{Int}_t$ is a dummy variable for the policy intervention taking value 0 up to the intervention date and 1 afterwards, $\mathrm{Reg}_i$ is a dummy variable taking value 1 for the geographical region exposed to treatment (South Tyrol) and 0 otherwise, and $z_{it}$ is a $q \times 1$ vector of additional controls (such as the difference in the number of diagnostic tests compared to the previous day, weekly seasonality effects, etc.). The overall vector describing the growth parameters is denoted by $\theta = (\alpha, \beta, \gamma, \delta, \xi')'$. Our main inferential interest is in the coefficient $\delta$ for the interaction between the intervention and the geographical area. This parameter measures the effect of the policy intervention and is calculated in a similar way as in a traditional difference-in-differences model, by comparing the change over time in the outcome variable for South Tyrol with the change over time for the control region (Synthetic South Tyrol).

We also assess the impact of the policy intervention by examining its impact on the transmission growth rate. From Equation (4), the relative change in the growth rate can be computed as

\[ \Delta\rho_i(\theta) = \exp\{\beta + \delta \, \mathrm{Reg}_i\} - 1. \]

Therefore, the relative changes in the treatment and control groups are $\Delta\rho_1 = \exp\{\beta + \delta\} - 1$ and $\Delta\rho_0 = \exp\{\beta\} - 1$, respectively.

$i = 0, 1$ and $t = 1, \dots, T$, where $\rho_{it}(\theta)$ is the time- and region-specific growth rate defined in Equation 4.
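To make the link between the coefficients and the relative change in growth concrete, the following sketch assumes the log-linear form log ρ_it = α + β Int_t + γ Reg_i + δ Int_t × Reg_i (the additional controls z_it are omitted for brevity) and checks that the ratio of post- to pre-intervention growth rates, minus one, equals exp{β + δ Reg_i} − 1. The coefficient values are made up for illustration and are not the paper's estimates; the function names are hypothetical.

```python
import math


def growth_rate(alpha: float, beta: float, gamma: float, delta: float,
                intervention: int, region: int) -> float:
    """rho = exp(alpha + beta*Int + gamma*Reg + delta*Int*Reg)."""
    return math.exp(alpha + beta * intervention + gamma * region
                    + delta * intervention * region)


def relative_growth_change(beta: float, delta: float, region: int) -> float:
    """Relative change in growth after the intervention: exp(beta + delta*Reg) - 1."""
    return math.exp(beta + delta * region) - 1.0


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to illustrate the arithmetic.
    alpha, beta, gamma, delta = -1.0, -0.10, 0.20, -0.50
    for region, label in ((1, "treated"), (0, "control")):
        pre = growth_rate(alpha, beta, gamma, delta, 0, region)
        post = growth_rate(alpha, beta, gamma, delta, 1, region)
        change = relative_growth_change(beta, delta, region)
        # The ratio of growth rates reproduces the closed-form expression.
        assert abs(post / pre - 1.0 - change) < 1e-12
        print(f"{label}: relative growth change = {change:+.1%}")
```

Because α and γ appear in both the pre- and post-intervention rates, they cancel in the ratio, which is why only β and δ enter the relative change.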
Equation 5 represents a generalized additive model (GAM); see, e.g., Hastie and Tibshirani (1990) and Wood (2017) for an introduction to GAMs. Estimates of the growth rate parameters $\hat\theta = (\hat\alpha, \hat\beta, \hat\gamma, \hat\delta, \hat\xi')'$, of the growth deceleration $\hat p$ and of the smooth function $\hat h(\cdot)$ are obtained with a penalized likelihood estimator, where the penalty depends on a smoothing parameter for the non-parametric part of the model. We use thin-plate radial basis spline functions for the terms $b_j(x_t)$, with the smoothing parameter tuned via maximum likelihood estimation. Estimates are obtained using the function gam in the R package mgcv (Wood and Wood, 2015). See Wood (2017) and references therein for details on the estimation procedure and implementation.

Approximate variances and covariances for the estimated parameters are extracted using the function vcov.gam in the R package mgcv. An estimate of $\Delta\rho_i$, the relative growth change due to the intervention as described in Section 3.2, is obtained by plugging the parameter estimates $\hat\beta$ and $\hat\delta$ into the expression $\Delta\rho_i(\theta)$. Since the parameter estimates are asymptotically normal, the standard errors for $\widehat{\Delta\rho}_i$ can be derived using the Delta method (e.g., see Van der Vaart (2000)), obtaining

\[ \mathrm{SE}(\widehat{\Delta\rho}_0) = \exp\{\hat\beta\} \times \mathrm{SE}(\hat\beta) \]

and

\[ \mathrm{SE}(\widehat{\Delta\rho}_1) = \exp\{\hat\beta + \hat\delta\} \times \sqrt{\mathrm{SE}(\hat\beta)^2 + \mathrm{SE}(\hat\delta)^2 + 2\,\mathrm{Cov}(\hat\beta, \hat\delta)}, \]

where $\mathrm{SE}(\hat\beta)$ and $\mathrm{SE}(\hat\delta)$ are the standard errors of $\hat\beta$ and $\hat\delta$, and $\mathrm{Cov}(\hat\beta, \hat\delta)$ is an estimate of the covariance between $\hat\beta$ and $\hat\delta$.

The analysis in this paper focuses on the second COVID-19 outbreak wave.

The first step of our estimation process is to construct the control group of regions against which we will evaluate the impact of the mass testing in South Tyrol. Figure 1 shows the estimated synthetic control weights for the 20 donor regions using the approach described in Section 2.
The regions Valle d'Aosta, Friuli Venezia Giulia and Veneto are by far the major contributors to the synthetic control, with weights equal to 71.0%, 20.0% and 7.5%, respectively. Valle d'Aosta is, like South Tyrol, a small mountain region in the North of Italy, but is not contiguous with South Tyrol. To assess the goodness-of-fit of this selection, we computed the R-squared type statistic

\[ R^2 = 1 - \frac{\lVert Y_1 - Y_0 w \rVert^2}{\lVert Y_1 - \bar{Y}_1 \mathbf{1} \rVert^2}, \]

with $\bar{Y}_1$ denoting the arithmetic average of the elements of vector $Y_1$. Moreover, we found a Pearson correlation coefficient between $Y_1$ and $Y_0 w$ equal to 0.936. Both the $R^2$ statistic and the correlation coefficient show a very good match between South Tyrol and the synthetic control region in the pre-intervention period.

Table 1 shows the estimates for the semi-parametric growth (SPG) models described in Section 3.3 using Poisson (Pois) and negative binomial (NBin) response functions. Each model includes an indicator for being in the treatment region (Reg), an indicator for being in the time period after the mass testing intervention (Int), an interaction between the two, and the growth deceleration parameter (p). We also estimate a second specification of each model that includes the additional control variables (+Ctrl). Table 1 also shows, in the lower part, analogous setups for pure exponential growth models obtained by setting $h(x_{it}) = 0$ in the semi-parametric models. For all the considered models, we compute the adjusted R-squared and explained deviance statistics as well as the Akaike information criterion (AIC) and Bayesian information criterion (BIC) for model selection.

Based on both the AIC and BIC model selection criteria, the best fitting model is NBin+Ctrl (negative binomial response with additional controls), which has an adjusted R-squared of 88.8% and a percent of explained deviance of 93.9%. Overall, all the SPG models fit the data well, with adjusted R-squared always exceeding 83%.
The standard EG models perform worse than the SPG models in terms of the goodness-of-fit metrics for all combinations of response distribution and predictors. The appropriateness of the semi-parametric models is also confirmed by the statistically significant chi-squared statistics for the nonparametric function $h$ in all considered cases.

In each of the SPG models, we find that the difference-in-differences interaction Int×Reg, which indicates the impact of the mass testing campaign on Covid-19 growth in South Tyrol, is negative and highly significant. The estimated coefficient from our preferred SPG model NBin+Ctrl is -0.512, which implies a decrease in the growth rate of 39% (95% confidence interval: 29-49%). These results are robust to both the response function used and whether controls are included, with the estimated decrease in the growth rate in these alternative specifications ranging from 38% to 51%. If instead one uses standard EG models, the estimated impacts are generally smaller, are less robust to model choice, and are less precisely estimated.

We next run a placebo test to gauge the possibility that our estimated impact of the mass testing campaign occurs by chance. We do this by matching each of the other 20 regions of Italy to synthetic versions of themselves and then estimating the SPG model (NBin+Ctrl) assuming there was an intervention at the same time as the real intervention in South Tyrol.

In terms of cases, the estimated decrease in the growth rate corresponds to an overall reduction of 14%, 18%, 30% and 56% within seven, ten, twenty and forty days from the intervention date, respectively. Our findings are in line with the predictions of Bosetti et al. (2020), who use a SEIR dynamic ordinary differential equations model to predict outbreak dynamics under various contagion scenarios.
When they consider the scenario where 75% of the population is tested, which is close to the 72% observed in South Tyrol, the predicted reduction in prevalence in the population within 10 days of mass testing is between 10% and 30%, whereas we estimate a reduction of 14% under the same time horizon.

In this paper, we employ an innovative empirical approach to study the impact of mass testing on Covid-19. In particular, we combine a synthetic control methodology to define a control group and a semi-parametric growth model to model the epidemiological developments. We show that using a semi-parametric approach rather than more standard parametric exponential growth models makes a difference in the evaluation of the impact on Covid-19 growth of the mass testing that took place in the Italian region of South Tyrol in November 2020. In our preferred specification, we find that mass testing reduced the growth rate of Covid-19 by 39%. This suggests that mass testing can be a useful tool to contain the pandemic.

Figure 2: Fitted semi-parametric growth models against actual data on new cases for South Tyrol and synthetic control. The first and second rows correspond, respectively, to Poisson (Pois) and negative binomial (NBin) models for new cases. The first column corresponds to models without short-term control factors, while the second column shows models with the additional controls included (+Ctrl).

Table 1: Fitted semi-parametric growth (SPG) and exponential growth (EG) models using Poisson (Pois) and negative binomial (NBin) responses, with additional control variables (+Ctrl) and without additional controls. For the parametric terms, estimates are reported with standard errors in parentheses. For the SPG models, the χ2 statistics corresponding to the null hypothesis "H0: h(x) = 0" are given with the approximate degrees of freedom (df). Significant results (p-value < 0.01) are marked by "*".
For all the models we report the estimated percent relative growth change ∆ρ × 100% for South Tyrol and synthetic control groups as well as goodness-of-fit statistics.

Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program
Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program
Synth: An R package for synthetic control methods in comparative case studies
The economic costs of conflict: A case study of the Basque Country
The economic costs of conflict: A case study of the Basque Country
Mass gatherings contributed to early Covid-19 spread: Evidence from US sports
Economic benefits of Covid-19 screening tests with a vaccine rollout
Generalized logistic growth modeling of the Covid-19 pandemic in Asia
Impact of mass testing during an epidemic rebound of SARS-CoV-2: A modelling study
On the use of growth models to understand epidemic outbreaks with application to Covid-19 data
Quantifying the impact of nonpharmaceutical interventions during the Covid-19 outbreak: The case of Sweden
Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts
When do shelter-in-place orders fight Covid-19 best? Policy heterogeneity across states and adoption time
A cross-country database of Covid-19 testing
Generalized additive models
Ranking the effectiveness of worldwide Covid-19 government interventions
Slovakia to test all adults for SARS-CoV-2. The Lancet
The effect of large-scale anti-contagion policies on the Covid-19 pandemic
The impact of mass antigen testing for Covid-19 on the prevalence of the disease
Estimation of Covid-19 spread curves integrating global data and borrowing information
JUE Insight: College student travel contributed to local Covid-19 spread
Face masks considerably reduce Covid-19 cases in Germany
The impact of population-wide rapid antigen testing on SARS-CoV-2 prevalence in Slovakia
Logistic equation and Covid-19
Impacts of introducing and lifting nonpharmaceutical interventions on Covid-19 daily growth rate and compliance in the United States
Communities and testing for Covid-19
An investigation of transmission control measures during the first 50 days of the Covid-19 epidemic in China
Asymptotic statistics
Package 'mgcv'. R package version
Generalized additive models: an introduction with R
Generalized logistic growth modeling of the Covid-19 outbreak in 29 provinces in China and in the rest of the world