key: cord-0863905-io2ztyvw
authors: Lichand, Guilherme; Doria, Carlos Alberto; Cossi Fernandes, João Paulo; Leal-Neto, Onicio
title: Association of COVID-19 Incidence and Mortality Rates With School Reopening in Brazil During the COVID-19 Pandemic
date: 2022-02-11
journal: JAMA Health Forum
DOI: 10.1001/jamahealthforum.2021.5032
sha: c9feadea760f1b2a95e248c4cf93be0bbcaf7369
doc_id: 863905
cord_uid: io2ztyvw

IMPORTANCE: School closures because of COVID-19 have left 1.6 billion students around the world without in-person classes for a prolonged period. To our knowledge, no study has documented whether reopening schools in low- and middle-income countries during the pandemic was associated with increased aggregate COVID-19 incidence and mortality with appropriate counterfactuals.

OBJECTIVE: To test whether reopening schools under appropriate protocols during the COVID-19 pandemic was associated with increased municipal-level COVID-19 cases and deaths in São Paulo State, Brazil.

DESIGN, SETTING, AND PARTICIPANTS: This observational study of municipalities in São Paulo State, Brazil, uses a difference-in-differences analysis to examine the association between municipal decisions to reopen schools during the COVID-19 pandemic and municipal-level COVID-19 case and death rates between October and December 2020. The study compared 129 municipalities that reopened schools in 2020 with 514 that did not and excluded data for 2 municipalities that reopened schools and closed them again.

MAIN OUTCOMES AND MEASURES: New COVID-19 cases and deaths per 10 000 inhabitants up to 12 weeks after school reopenings and municipal-level aggregate mobility for a subset of municipalities.

RESULTS: There were 8764 schools in the 129 municipalities that reopened schools compared with 9997 in the control group of 514 municipalities that did not reopen schools.
The municipalities that reopened schools had a cumulative COVID-19 incidence of 20 cases per 1000 inhabitants and mortality of 0.5 deaths per 1000 inhabitants in September 2020 (the baseline period) compared with an incidence of 18 cases per 1000 inhabitants and mortality of 0.45 deaths per 1000 inhabitants during the baseline period in the comparison group. The findings indicated that there were no statistically significant differences between municipalities that authorized schools to reopen and those that did not for (1) weekly new cases (difference-in-differences, –0.03; 95% CI, –0.09 to 0.03) and (2) weekly new deaths (difference-in-differences, –0.003; 95% CI, –0.011 to 0.004) before and after October 2020. Reopening schools was not associated with higher disease activity, even in relatively vulnerable municipalities, or with higher aggregate mobility.

CONCLUSIONS AND RELEVANCE: The findings from this study suggest that keeping schools open during the COVID-19 pandemic did not contribute to aggregate disease activity.

eAppendix.
eFigure 1. Timeline of school reopening
eTable 3. Heterogeneous treatment effects estimated through differences-in-differences
eTable 4. Differences in cases for school-age children aggregated by cohort
eTable 5. Differences in cases for parent-age adults aggregated by cohort
eTable 6. Difference-in-differences estimates aggregated by cohorts with matched sample
eTable 7. Difference-in-differences model with continuous treatment

In eTable 2, we provide a description of all variables used in the paper. We combine different school characteristics to capture the quality of school infrastructure, using the principal component method. We select the following variables from the 2019 Brazilian school census: the availability of bathrooms, staff bathrooms, showers, a kitchen, garbage collection, piped water, and basic sanitation, and the average number of students per class.
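As a rough illustration of the principal component method applied to variables of this kind, the sketch below standardizes a few school characteristics and extracts the leading component by power iteration. The data, the three columns, and the function name `first_principal_component` are hypothetical and chosen only for illustration; it is not the authors' code, and it omits the later averaging of school-level scores within each municipality.

```python
import math

def first_principal_component(rows, iterations=500):
    """Return (scores, explained_share) for the first principal component.

    rows: one list of numeric characteristics per school.
    Columns are standardized so binary amenities and class size are comparable.
    Every column must vary, otherwise its standard deviation is zero.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    # Correlation matrix of the standardized variables.
    corr = [[sum(zi[a] * zi[b] for zi in z) / (n - 1) for b in range(k)]
            for a in range(k)]
    # Power iteration: repeated multiplication converges to the leading eigenvector.
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(k))
                     for a in range(k))
    scores = [sum(zi[j] * v[j] for j in range(k)) for zi in z]
    # The trace of a correlation matrix equals k, so eigenvalue / k is the
    # share of total variance captured by this component.
    return scores, eigenvalue / k

# Hypothetical schools: [has_bathroom, has_piped_water, students_per_class]
schools = [[1, 1, 25], [1, 1, 28], [0, 1, 35], [0, 0, 40], [1, 0, 30], [0, 0, 38]]
scores, share = first_principal_component(schools)
```

The sign of a principal component is arbitrary, so in practice one would check the loadings before reading higher scores as better infrastructure.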
We select the first principal component as our composite index of school infrastructure, computed at the municipality level by averaging across its schools. This first component accounts for approximately 40% of the total variance of the selected variables.

Our mobility data are based on Google reports, computed from mobile-phone GPS information. Google calculates daily mobility information for more than 400 municipalities in São Paulo State. As such, we have mobility data for roughly 78% of municipalities that reopened schools in 2020, and 56% of the ones that did not. Google provides information on different types of mobility, including (1) retail and recreation; (2) grocery and pharmacy; (3) parks; (4) transit stations; and (5) workplaces. We average across these five measures to generate an aggregate mobility index. We then average that daily aggregate measure within each week to generate a weekly municipal-level mobility index. Mobility is expressed in percentage-point deviations from February 15, 2020, when Google started collecting mobility data.

As in several other developing countries, limited testing in Brazil has been linked to substantial under-reporting of COVID-19 cases. Estimates suggest that the actual number of COVID-19 cases could be three to ten times larger than that registered in official statistics [1]. Nevertheless, we do not expect under-notification to affect the findings of this study. The reason is two-fold. First, our estimates are based on a difference-in-differences strategy, which contrasts outcomes across municipalities that authorized schools to reopen and those that did not, before and after in-person activities could resume. As such, under-reporting would only bias our results if it were systematically different between groups and periods. However, there is no reason to expect under-reporting to have been higher in municipalities that reopened schools, or to have increased after in-person activities could resume.
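The logic of this first point can be checked with a small numerical sketch. All figures below are made up, and the outcome is the plain log of cases rather than the log of cases plus 1 used in the main analysis, for which the cancellation is only approximate. A reporting rate that is constant across groups and periods drops out of the difference-in-differences, whereas a rate that changes only for treated municipalities after reopening does not.

```python
import math

def did_log(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences computed on log outcomes."""
    return (math.log(treated_post) - math.log(treated_pre)) - (
        math.log(control_post) - math.log(control_pre))

# Hypothetical true weekly cases per 10,000 inhabitants (illustrative only).
true_cases = dict(treated_pre=12.0, treated_post=15.0,
                  control_pre=10.0, control_post=11.0)

# Case A: every group and period reports the same 20% of true cases.
uniform = {name: 0.2 * value for name, value in true_cases.items()}

# Case B: reporting doubles to 40%, but only for treated units after reopening.
differential = dict(uniform, treated_post=0.4 * true_cases["treated_post"])

did_true = did_log(**true_cases)
did_uniform = did_log(**uniform)            # equals did_true: the 0.2 cancels
did_differential = did_log(**differential)  # shifted by log(0.4 / 0.2)
```

In case B the estimate is off by exactly log(2), which is the group-and-period-specific change in reporting that the argument rules out.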
Second, and most importantly, this study provides first-hand evidence on the relationship between school reopening and COVID-19 deaths. Deaths tend to be much more precisely reported than cases; in effect, official statistics for COVID-19-related deaths tended to closely track the total number of excess deaths in 2020 and 2021, relative to previous years [2]. As we estimate a similar relationship for both cases and deaths, we conclude our results are not driven by under-reporting.

To estimate the effect of school reopening on the outcomes of interest, we rely on a difference-in-differences approach. That is, we compare the trends of the dependent variables for treated and non-treated municipalities. However, recent literature suggests that a straightforward two-way fixed-effects regression is not appropriate in the context of an application with multiple periods and staggered treatment. To circumvent this problem, we implement the Callaway and Sant'Anna procedure [3]. It can be seen as a three-step estimator.

First, we divide the treatment group (the set of municipalities that authorized schools to reopen) into cohorts according to the week in which municipalities authorized schools to reopen. Let $G_{m,g}$ be a dummy variable indicating that municipality $m$ reopened schools at period $g$, and let $C_m$ be a dummy variable indicating that municipality $m$ did not authorize schools to reopen in 2020. Let $N_g$ be the number of municipalities treated at time $g$, and $N$ be the total number of time periods after treatment (=5 in our sample). Next, let $\mathcal{G} = \{0, 5, 6, 8, 9\}$ be the set that identifies the weeks at which different cohorts were treated in our sample. Then, in the first step of the procedure, we estimate:

$$P(G_{m,g} = 1 \mid X_m) = \Phi(X_m' \beta_g)$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function and $X_m$ are municipal time-invariant characteristics. That is, we use a Probit model to predict how the likelihood of being treated at $g \in \mathcal{G}$ relates to municipal characteristics ($X_m'$).
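A minimal sketch of a Probit fit of this kind is shown below, under simplifying assumptions: a single hypothetical covariate plus an intercept, made-up data, and a hand-written Fisher scoring loop. The function name `fit_probit` and the sample are illustrative, not the authors' implementation; in applications of this estimator the sample for cohort $g$ would typically contain only that cohort and the never-treated municipalities.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, pdf is phi

def fit_probit(x, y, iterations=25):
    """Fit P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            # Clamp the probability away from 0 and 1 to keep divisions safe.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(eta)
            resid = d * (yi - p) / (p * (1 - p))  # score contribution
            w = d * d / (p * (1 - p))             # expected-information weight
            s0 += resid
            s1 += resid * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1

# Hypothetical standardized municipal characteristic and treatment dummy.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1]

b0, b1 = fit_probit(x, y)
propensity = [STD_NORMAL.cdf(b0 + b1 * xi) for xi in x]
```

The fitted values in `propensity` are the predicted probabilities of treatment. With several covariates one would normally rely on an established statistical package rather than a hand-rolled solver.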
Then, we calculate the propensity score of being treated in a certain period as the predicted probability from the first stage ($\hat{p}_g(X_m)$). Next, in the second step, we estimate cohort-time treatment effects, as follows:

$$ATT(g,t) = \mathbb{E}\left[ w_{m,g} \left( Y_{m,g+t} - Y_{m,g-1} \right) \right]$$

where $Y_{m,t}$ is the outcome of interest in municipality $m$ at time $t$, and $w_{m,g}$ is a weight computed as follows:

$$w_{m,g} = \frac{G_{m,g}}{\mathbb{E}[G_{m,g}]} - \frac{\hat{p}_g(X_m)\, C_m / \left(1 - \hat{p}_g(X_m)\right)}{\mathbb{E}\left[\hat{p}_g(X_m)\, C_m / \left(1 - \hat{p}_g(X_m)\right)\right]}$$

In words, the second step calculates difference-in-differences estimates by comparing units that were originally treated at time $g$, $t$ periods later, to units in the control group, attributing higher weight to control observations that have larger probabilities of being treated based on their characteristics.

Lastly, once we estimate cohort-time treatment effects, we can aggregate them in multiple ways in the third step. In particular, it is possible to compute dynamic treatment effects as the average, across cohorts $g \in \mathcal{G}$, of cohort-time treatment effects evaluated at $T$ periods relative to treatment onset; $T$ can even be negative (prior to the treatment onset $g$, in the case of falsification tests). Alternatively, we could compute cohort-specific treatment effects as the average, for each cohort $g$, of its cohort-time treatment effects over the post-treatment periods.

We compute standard errors using Callaway and Sant'Anna's standard procedure. This is done by block-bootstrapping standard errors at the municipality level. The block-bootstrap procedure allows arbitrary correlations between regression errors within the same municipality over time.

The fundamental identification assumption for the difference-in-differences strategy is that of parallel trends for the outcomes of interest. That is, we assume that, in the absence of school reopening, the log of each outcome would have followed the same time trend across both groups. We point out in the main text that there are significant initial differences between the two groups at the time of reopening. This does not invalidate the identification assumption, since difference-in-differences nets out any difference in outcomes across groups at the baseline period. Some factors contribute to the plausibility of this assumption.
First, the authorization to reopen schools is defined at the health region level, but the actual decision to reopen schools or not is made at the municipal level. As such, there was considerable variation in reopening decisions even within each health region in the State. Moreover, the authorization to reopen schools was subject to the endorsement of local legislators, introducing randomness in the timing of the actual return of in-person school activities even conditional on municipal reopening decisions. We present falsification tests in Figure E3 whereby we test and fail to reject the presence of parallel trends before school reopening.

The Callaway and Sant'Anna estimator additionally requires that units, once treated, cannot later become untreated. For this reason, we drop from the analyses 2 municipalities that authorized schools to reopen but reversed those decisions shortly after.

As a robustness check, we also implement nearest-neighbor propensity score matching. We implement the procedure sequentially, starting with a sample consisting of the first cohort and the control group (never-treated municipalities). We estimate a Probit model, as before. Then, we calculate the propensity score for this sample as the predicted value from that estimate. We include the following variables as controls: per capita income, population, number of students, school infrastructure, and the number of baseline cases and deaths. For each municipality in the treated cohort, we find the municipality in the control group with the closest propensity score, without replacement. After we match all municipalities in the first cohort, we build a new sample, including the second treatment cohort and the control group without the municipalities matched in the first step of the process. Then, we re-estimate the Probit model for this alternative sample and match all municipalities in the second cohort.
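The sequential matching steps described so far can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the propensity scores are taken as given (one set per cohort, standing in for the re-estimated Probit), the identifiers and scores are made up, the function name `sequential_match` is hypothetical, and treated municipalities within a cohort are matched greedily in the order listed.

```python
def sequential_match(cohorts):
    """Greedy 1:1 nearest-neighbor matching without replacement, cohort by cohort.

    cohorts: list of (treated_scores, control_scores) pairs in treatment order.
    Each element maps municipality id -> propensity score from that cohort's model.
    Returns a dict mapping each treated id to its matched control id.
    """
    matched = {}
    used_controls = set()
    for treated_scores, control_scores in cohorts:
        # Controls matched to an earlier cohort are no longer available.
        available = {c: s for c, s in control_scores.items()
                     if c not in used_controls}
        for treated_id, score in treated_scores.items():
            # Raises ValueError if the pool of controls is exhausted.
            best = min(available, key=lambda c: abs(available[c] - score))
            matched[treated_id] = best
            used_controls.add(best)
            del available[best]
    return matched

# Hypothetical scores; control scores differ by cohort because the Probit
# model is re-estimated on each cohort's sample.
cohorts = [
    ({"A": 0.60, "B": 0.30}, {"c1": 0.58, "c2": 0.35, "c3": 0.10, "c4": 0.65}),
    ({"C": 0.55}, {"c1": 0.50, "c2": 0.20, "c3": 0.52, "c4": 0.57}),
]
pairs = sequential_match(cohorts)
```

Here `pairs` comes out as `{"A": "c1", "B": "c2", "C": "c4"}`: municipality C cannot reuse c1 or c2, so it takes the closest control still available.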
We repeat this procedure until we find a match for all treatment cohorts. We implement this sequential algorithm for two samples: first, the full sample of the study, comprising 643 São Paulo municipalities; second, the sub-sample of municipalities with available mobility data.

eFigure 1 starts by documenting the proportion of municipalities that reopened schools at different points in time, and the number of schools authorized to reopen as a result. The figure highlights that reopening decisions were staggered, with in-person school activities picking up only from November onwards; and even then, in-person school activities were far from universal in the State.

eFigure 1: Timeline of school reopening

Next, eTable 3 estimates the relationship between school reopening and disease activity separately for below- and above-median (a) per capita income, (b) average quality of school infrastructure, (c) senior population share, and (d) baseline COVID-19 deaths. We find no statistically significant association even for the highest-risk or most vulnerable sub-samples.

eTable 4 estimates whether school reopening affected school-aged children using a triple-differences strategy, whereby we not only compare municipalities that authorized schools to reopen to those that did not, before and after in-person activities could resume, but also compare school-aged children (ages 7-18) to young adults (ages 19-22) within each group and period. This strategy provides evidence on the direct effects of school reopening on COVID-19 incidence, since the latter age group should not be directly affected by school reopening. The analysis relies on additional data from COVID-19 testing (publicly available from SUS), which allow patients' ages to be identified. The limitation is that age-specific data are only available for cases, and only for specific weeks.
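The arithmetic behind a triple-differences comparison can be sketched with group means. The numbers below are invented purely to show the calculation and do not correspond to any estimate in eTable 4; the function and argument names are hypothetical.

```python
def did(pre_treated, post_treated, pre_control, post_control):
    """Difference-in-differences of group means."""
    return (post_treated - pre_treated) - (post_control - pre_control)

def triple_difference(children, young_adults):
    """DiD for school-aged children minus DiD for young adults.

    The young-adult DiD absorbs municipality-level changes that affect both
    age groups, leaving the part specific to school-aged children.
    """
    return did(**children) - did(**young_adults)

# Hypothetical mean log cases by age group, municipality group, and period.
children = dict(pre_treated=2.00, post_treated=2.65,
                pre_control=1.80, post_control=2.30)
young_adults = dict(pre_treated=3.00, post_treated=3.50,
                    pre_control=2.90, post_control=3.30)

ddd = triple_difference(children, young_adults)  # 0.15 - 0.10
```

In this made-up example the children's difference-in-differences is 0.15 and the young adults' is 0.10, so the triple difference is about 0.05.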
Next, eTable 7 estimates the relationship between school reopening and COVID-19 cases and deaths more flexibly, allowing for the possibility that the decision to reopen schools locally might also affect disease activity in neighboring municipalities. To do that, we implement a fixed-effects estimator with a continuous treatment variable, defined as the negative of the log distance to the nearest municipality that reopened schools at each week (=0 if the municipality itself has authorized schools to reopen by then). This specification also does not detect a systematic relationship between proximity to schools authorized to reopen and local disease activity.

1. Análise da subnotificação de COVID-19 no Brasil
2. Estimating Excess Deaths due to Covid-19 in Brazil using the
3. Difference-in-differences with multiple time periods

Last, eFigure 2 presents robustness checks for the log transformation performed in the outcome variables. In the main analysis, we show results for the log of COVID-19 cases and deaths adding 1 to all observations, to prevent municipality-week pairs with zero cases or deaths from being dropped from the analyses. Here, we consider two alternative specifications. First, we estimate dynamic treatment effects for the log of new cases and deaths, dropping observations for which new cases or deaths equal zero. Second, we estimate dynamic treatment effects for the outcome variables in levels, that is, without taking the log transformation, and using all observations. Our conclusions are robust to these alternative specifications.

eFigure 2: Dynamic treatment effects with alternative outcomes
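The three outcome definitions compared in this robustness check differ only in how zeros are handled, which a short sketch makes concrete. The weekly counts and function names below are hypothetical.

```python
import math

def log_plus_one(values):
    """Main specification: log(1 + y), which keeps zero-count observations."""
    return [math.log(v + 1) for v in values]

def log_drop_zeros(values):
    """Alternative 1: log(y), dropping observations where y equals zero."""
    return [math.log(v) for v in values if v > 0]

def levels(values):
    """Alternative 2: untransformed outcome, using all observations."""
    return list(values)

# Hypothetical weekly new cases for one municipality.
weekly_new_cases = [0, 3, 12, 0, 7]

main_spec = log_plus_one(weekly_new_cases)   # 5 observations, zeros map to 0.0
alt_log = log_drop_zeros(weekly_new_cases)   # 3 observations remain
alt_levels = levels(weekly_new_cases)        # 5 observations, original scale
```

The second specification changes the sample as well as the scale, which is why it is worth checking separately from the levels specification.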