key: cord-0938950-vun6p2fc authors: Saavedra, Pedro; Santana, Angelo; Bello, Luis; Pacheco, José-Miguel; Sanjuán, Esther title: A Bayesian spatio-temporal analysis of mortality rates in Spain: application to the COVID-19 2020 outbreak date: 2021-05-31 journal: Popul Health Metr DOI: 10.1186/s12963-021-00259-y sha: e4b32ae498887c8c995b51d9d7ca111c78e20598 doc_id: 938950 cord_uid: vun6p2fc BACKGROUND: The number of deaths attributable to COVID-19 in Spain has been highly controversial since it is problematic to tell apart deaths having COVID as the main cause from those provoked by the aggravation by the viral infection of other underlying health problems. In addition, overburdening of health system led to an increase in mortality due to the scarcity of adequate medical care, at the same time confinement measures could have contributed to the decrease in mortality from certain causes. Our aim is to compare the number of deaths observed in 2020 with the projection for the same period obtained from a sequence of previous years. Thus, this computed mortality excess could be considered as the real impact of the COVID-19 on the mortality rates. METHODS: The population was split into four age groups, namely: (< 50; 50–64; 65–74; 75 and over). For each one, a projection of the death numbers for the year 2020, based on the interval 2008–2020, was estimated using a Bayesian spatio-temporal model. In each one, spatial, sex, and year effects were included. In addition, a specific effect of the year 2020 was added ("outbreak"). Finally, the excess deaths in year 2020 were estimated as the count of observed deaths minus those projected. RESULTS: The projected death number for 2020 was 426,970 people, the actual count being 499,104; thus, the total excess of deaths was 72,134. However, this increase was very unequally distributed over the Spanish regions. CONCLUSION: Bayesian spatio-temporal models have proved to be a useful tool for estimating the impact of COVID-19 on mortality in Spain in 2020, making it possible to assess how the disease has affected different age groups accounting for effects of sex, spatial variation between regions and time trend over the last few years. Data on mortality attributed to coronavirus disease 2019 in Spain-one of the most affected western European countries-have been very controversial. Disagreements between data-providing sources arise from the difficulty of telling apart patients who died "from COVID-19" as the main cause from those suffering other underlying health problems that together with the virus caused death, i.e., "having contracted COVID- 19 ." In addition to that, the highly decentralized Spanish Health System has been a serious handicap for a sound data collection, thus leading Spanish health authorities to reconsider the published data: even today, criteria adopted for this count are being discussed. Uncertainties in the approach to the correct diagnosis about excess mortality produced by COVID- 19 have also been suggested in other places such as England and Wales [1] . The real impact of the epidemic goes far beyond determining how many people died from COVID-19 or suffered COVID-19. In fact, as indicated in [2] , there are people who have died indirectly from COVID (for example, because they had not received adequate medical attention for other problems) and should be considered as side effects of the epidemic. There are also people who actually died from COVID-19, especially in the case of the elderly or seriously ill, but perhaps the disease only accelerated the moment of death, which would have happened anyway. In addition, the lockdown measures that were taken in the country during the first wave of the pandemic also had the effect of reducing mortality from other causes: there were fewer deaths in road accidents; it is even possible that isolation facilitated the reduction of infections due to influenza viruses, also reducing mortality from this cause. Therefore, any assessment of the impact of COVID on mortality in Spain during 2020 will include a balance between all these effects since it is not possible to estimate them separately. An attractive way to assess the impact of the epidemic on mortality is to compare the deaths observed in 2020 with those projected from a sequence of previous years. This has been understood by the authors of [3, 4] , who have evaluated the total excess of deaths during periods of the epidemic in Italy, by geographical areas, sex, and age, with respect to what was expected from 2015 to 2019 [3] , or from previous months [4] . These estimates have sometimes focused on identifying or predicting the trend in mortality rates during a given period of the pandemic ( [5] during lockdown in Spain [6] ; during a fortnight in India [7] ; for a quarter in Italy, Spain, and France [8] ; for a 4-month period in Iran). Other authors have focused on the investigation of the relationship of spatial patterns with say ecological factors in the USA [9] . The application of spatio-temporal study methods has proven to be ideal to optimize the observation of epidemics behavior, using an information exchange based on the influence of location proximity and time closeness [10] . These methods have been deemed essential to describe the spread and to make decisions about the mitigation of the pandemic [11, 12] . The purpose of this study is to compare the mortality rates observed in the year 2020 with the trend projected for this year based on the sequence of mortality rates in years 2008-2020 according to age group, sex, and autonomous communities (administratively, Spain is divided into seventeen autonomous communities). Obviously, mortality rates over time cannot be assumed to be generated by a stationary process, which would imply that rates obtained from averages of death counts in past years (e.g., 2015-2019, using a 5-year period as is usually done in mortality studies) do not estimate any characteristic parameters of the process. Therefore, this assumption does not offer an optimal way to make a projection for the year 2020. Since mortality patterns showed differences between age groups, a spatio-temporal model for the death counts was estimated for each group. In each one the effects of sex, autonomous community and year (sequence 2008-2020) were included. In addition, a specific effect for year 2020 ("outbreak") was added. Models were estimated in the Bayesian framework via the integrated nested Laplace approximation (INLA). Once the projected deaths for 2020 were estimated, the excess mortality was obtained as the difference between observed and projected deaths. All data have been obtained from the Spanish official Institute of National Statistics (INE). Population data can be found in [13] , mortality data by autonomous community, sex, and age until 2019 have been downloaded from [14] , and mortality data for 2020 are located in [15] . For each one of the age cohorts (< 50, 50-64, 65-74, and 75 over), the dataset has the form: where Pop s, a, t and Death s, a, t denote the population size and the number of deaths during 2020, respectively in the cohorts determined by sex (s), autonomous community (a), and year (t). Values of (a) correspond to the INE codes of communities. Observed mortality rates by sex, community, and time The mortality rate (deaths by 100,000 persons-year) in cohort {s, a, t} was estimated as R s;a;t ¼ 10 5 Â Death s;a;t Pop s;a;t Expected number of deaths assuming uniformity of the rates by autonomous communities, years, and sex Under the hypothesis of uniformity of the rates, the expected number of deaths is defined as where r = ∑ s, a, t Death s, a, t /∑ s, a, t Pop s, a, t (total proportion of deaths) Spatio-temporal model to project mortality from 2008-2019 to 2020 In each of the four age groups, we assume that the number of deaths in the {s, a, t} cohort is a random variable with negative binomial probability distribution with mean λ s, a, t where: Here, α is the intercept, β s, a denotes the interaction sex-autonomous community (for all a = 1, …, 17, β female,a = β 0 and β male,a = β a ), u a the spatially structured community effect modeling spatial adjacency, v a the community-specific (unstructured) effect, γ t the temporally structured effect and d a a random slope that modulates the trend in each autonomous community. In addition, f a · I t (2020) models the specific effect of COVID-19 on the excess mortality in 2020 (outbreak), being I t (2020) = 1 for t = 2020 and zero for t < 2020, and f a , the different effects of the epidemic for the autonomous community a. Model parameters were estimated using the Bayesian framework. Briefly, Bayesian methods assume a certain a priori knowledge of the parameters, which is formulated as a probability distribution (prior distribution). This distribution, together with the available data, leads to the socalled posterior distribution, whose characteristic values constitute the parameter estimates. So, the 2.5th and 97.5th percentiles of the posterior distributions yield a credibility interval (CI) for the corresponding parameter. Therefore, the estimation of each parameter of the model is presented via the mean, median, 2.5th and 97.5th percentiles of its posterior probability distribution. For the four spatio-temporal models, one for each age cohort, we use the following prior distributions, namely: for α, we use a centered Gaussian distribution with variance 10 4 (α~N(0; 10 4 )), whereas for the sex-autonomous community interactions β s, a we assume priors centered, independent and normally distributed random variables IIDN(0; 10 4 ). The spatially structured components u = {u a : a = 1, …, 17} were modeled via the corresponding adjacency matrix of the autonomous communities (h i, j = 1 or 0 according on whether the communities i and j are neighboring or not) by conditional autoregressive priors as was first proposed in [16, 17] . More concretely, we assume that: Here, u −a = u − {u a } and N a is the number of neighboring communities to {a}. For the precisions 1=σ 2 v ; a prior logGamma(1; 10 5 ) was considered. The unstructured v a was modeled as a centered, independent, and normally distributed random variable; i.e., IIDNð0; σ 2 v Þ , being 1=σ 2 v $ logGammað1; 10 5 Þ . The time effect was modeled dynamically as a random walk of order 2, i.e., γ t jγ t−1 ; γ t−2 $ Nð2γ t−1 þ γ t−2 ; σ 2 γ Þ [18] . For 1=σ 2 γ , we consider a prior logGamma(1; 10 5 ). Finally, for the random effects d a and f a , we modeled the priors as IIDN(0, 10 5 ). The probability distribution of the deaths count was based on the deviance information criterion [19] . It is a generalization of the Akaike information criterion (AIC), developed for the comparison of two models in the Bayesian framework. According to this criterion, a value of DIC is associated to each model. Then, given two models, the one with the lower DIC will be preferred. From this model, the projection of deaths for 2020 in the cohort {s, a}, based on the sequence 2008-2019 is And the rate-ratio between the real rate and project for the (a)-community is All models were estimated using the procedure known as integrated nested Laplace approximation (INLA), through the R-interface [20, 21] . Data were analyzed using the R language and environment, version 3.6.1 [22] . Figure 1 shows the annual evolution of the mortality rates observed from 2008 to 2020 for entire Spain and per Autonomous Community, according to age group and sex. Possible probability distributions for the death counts are the Poisson and negative binomial distributions. According to the deviance information criterion (see Table 1 ), the negative binomial distribution led to better fits than the Poisson distribution for the models corresponding to the age groups 50-64 years and 75 and over. For the other age groups, both distributions led to similar fits. Figure 2 displays the fitted versus observed death counts for each of the four models, when using the negative binomial distribution. As can be seen, all models present a very good fit. Table 2 shows the observed and projected counts of deaths according to age group. The total number of deaths in the 17 Spanish autonomous communities during 2020 was 499,104, while the number projected by the spatio-temporal models for the same period was 426,970, the difference being 72,134 deaths. However, this increase was very unequally distributed among the communities. Table 3 shows the fitted mortality rates for 2020 (projected and actual) and the mortality rates ratios (real/ projected) are also displayed in Fig. 3 . In the group of age less than 50 years, there was no significant increase in mortality rates. In the other age groups, Madrid was the community that experienced the highest growth rates (24.8%, 42.8%, and 39.4%) in the groups of 50-64 years, 65-64 years, and 75 and over, respectively). Increases in mortality rates were also high in the two communities adjacent to Madrid (Castilla-La Mancha and Castilla y León) and decreased in all directions but to the northeast. Catalonia also showed growth rates greater than 20% in groups aged 50 and over. In the island communities (the Canary and the Balearic Islands), the mortality rates estimated for 2020 did not show significant increases in relation to those projected in any of the age groups. The relationship between the mortality rates between the sex groups also showed spatial differences. For example, in the Basque Country and in the group aged 75 and over, the mortality rate was 31% higher in men than in women, while in Murcia it was only 20.9% higher. Figure 4 displays the evolution of the mortality rates for Canary Islands and Madrid for the age group 75 and over. These autonomous communities exhibited the greatest differences in the evolution pattern of mortality rates. The increase in the mortality rate in 2020 was not statistically significant in the Canary Islands (RR a = 1.01; 95% CI = 0.89; 1.13), while Madrid had the highest in Spain (RR a = 1.39; 95% CI = 1.23; 1.56). This study aims to assess the spatial distribution of excess mortality in Spain in 2020 in relation to the projection of mortality for that year based on the 2008-2020 sequence. For this purpose, four spatio-temporal models for the evolution of mortality rates through the 2008-2020 period, including the effects of autonomous community and sex-and the appropriate interactionswere estimated. The impact of COVID-19 in 2020 as the joint result of all effects converging on mortality-as cited in [2] -was evaluated by introducing a specific effect for 2020 (community-year interaction). The modeling of the temporal effect in two components must be highlighted, namely: a general component (γ t ) dynamically modeled by means of a random walk of order two plus a specific linear trend for each community (d a ). Through these slopes, a significant decreasing trend was detected in mortality rates for the Canary Islands (d Canary = − 0.0077; 95% CI = − 0.0113; − 0.0042) and an increasing trend for Castilla La Mancha (d La Mancha = 0.0078; 95% CI = 0.0045; 0.0112). It should be noted that a closer study of the model revealed that the effects of sex were not uniform for all the autonomous communities, and hence a sex-community interaction was introduced through the parameters β s, a . Therefore, it is reasonable to think that the model is an adequate tool for estimating the spatial effects of an epidemic during a given period. Our spatio-temporal model is estimated in the Bayesian framework by using the INLA procedure. INLA does not require iterative computations to obtain an approximation to posterior distributions, thus making it computationally efficient. Alternatively, Markov chain Monte Carlo (MCMC) methods (based on simulations and so more time and computing resources consuming) could be used, but many studies show that both perform similarly in a wide range of situations [21, 23] . The difference between the observed deaths and those projected for the year 2020 found by our models was 72, [25] . The INE has published a press release reporting that during the five first months of 2020 (the first and deadliest wave of the epidemic) the death toll attributable to COVID-19 (directly or because this disease acted as comorbidity) was 45,684 (51.5% men, 48.9% women, 87.1% older than 70 years) [26] , but the assessment for the second half of the year has yet to be made. All these organizations focus on estimating the direct mortality due to COVID-19, but do not intend in any way to evaluate the global impact considering either indirect deaths or the effects of lockdown period. Fig. 3 Ratio rates (RR) between observed and projected mortality rates for 2020 in each autonomous community. Projected mortality rates were obtained from the mortality data corresponding to years 2008-2019. Autonomous communities have been numbered as in Table 3 Fig . 4 Evolution of mortality rates for the age group of 75 year or over in Canary Islands and Madrid: observed and fitted rates according sex. The temporally effect was modeled dynamically by means of a random walk of order 2. Bands corresponding to the 95% credibility intervals All evaluations carried out by the organizations above cited agree on the fact that the effect on mortality has been greater in men and in the older age group. It has also been observed in European countries that the effect of sex on the number of deceased was higher in men that in women (relative risk = 1.60; CI 1.53-1.68) [27] . In another study with data from the first quarter 2020 in Wuhan [28] , the authors observed a significantly higher mortality in men compared to women (odd ratio = 3.4; 95% CI 1.2-9.1). The excess mortality estimates provided by our model, which includes the balance between the different components of excess mortality during the full year 2020, are consistent, albeit different, with those calculated by INE, ISCIII, and MoMo. While INE and ISCIII only show the number of deaths directly attributable to COVID19 (the INE only during the period covering the first wave of the pandemic), MoMo calculates deaths from all causes, but only considering the estimated periods of excess mortality. Always taking as reference the projected mortality rates for 2020, Madrid was the Autonomous Community that showed the highest increase in the mortality rate, followed by the neighboring community of Castilla-La Mancha. This spread seems to be related to mobility between them in the period prior to lockdown. Similarly, Castilla y León, also adjacent to Madrid, experienced one of the largest increases in mortality. Madrid, Castilla y León and Castilla La Mancha form the central plateau of the Iberian Peninsula, and a very high percentage of Madrid's population has their roots and strong familiar relationships in those adjacent communities. Madrid was also the European city where the highest excess mortality was recorded during the pandemic, comparing deaths from all causes in Europe in the first half of the year according to the UK Office for National Statistics [29] . In a study on the impact of COVID-19 in metropolitan counties in the USA, it was found that larger metropolitan areas (measured in terms of population) presented higher mortality rates [30] . However, in the present study the effect of territorial contiguity was more influential than the size of the population, since in large cities such as Valencia (Valencian Community) or Seville (Andalusia), the impact of mortality was lower than in cities and towns of Castilla la Mancha adjacent to Madrid. Catalonia also showed an increase in mortality rates in all age groups, which could be partly attributed to the high occupation rates in the railway and air corridors between Madrid and Barcelona. Catalonia presented the highest daily percentage increase in mortality in Spain (up to 33.96% of daily deaths) in the first 50 days after lockdown [5] . However, in those communities away from Madrid (Galicia, Murcia, Andalusia, and the Balearic, and Canary Islands), mortality rates did not show statistically significant variations in any of the age groups. This could be explained by the fact that when the lockdown began, the number of cases that had reached these communities, especially the island communities, was not large enough to cause an alteration in mortality rates. Other authors reached this same conclusion, based on the low mortality rates observed in the two Spanish autonomous cities in North Africa during lockdown [5] . The lockdown itself could act as a protective factor of mortality for other infectious diseases and thus compensate for the deaths caused by COVID-19. Something similar happened in Italy, where the impact of COVID-19 was much lower in the islands of Sardinia and Sicily [3] . Also, the fact that the mortality rate in Denmark did not increase during the COVID-19 pandemic compared to the mortality rates in the same period during 2015-2019 has been attributed to the protective effect of lockdown measures [31] . The excess of deaths in Spain in 2020 in relation to projected deaths for that period using the mortality data of the 2008-2019 sequence was 72,146 people, but their spatial distribution was very uneven. In central Spain, the greatest increase in mortality was detected in the adjacent communities of Madrid and Castilla La Mancha. On the contrary, the smallest increments were detected in autonomous communities more distant from Madrid, including the islands, with the only exception of Catalonia. All four models showed good fits (see Fig. 3 ) and have proved to be a useful tool for estimating the impact of COVID-19 on mortality in Spain in 2020, making it possible to assess how it affected different age groups while accounting for effects of sex, spatial variation between regions and time trend over the last few years. Mancha 725 (695; 757) 1653 (1588; 1722) 1020 (975; 1067) 796 (766; 826) 1804 (1738; 1869) 882 (853; 913) Excess mortality during the covid-19 pandemic: Early evidence from England and Wales Excess mortality: the gold standard in measuring the impact of COVID-19 worldwide? How large was the mortality increase directly and indirectly caused by the covid-19 epidemic? An analysis on all-causes mortality data in Italy Temporal dynamics in total excess mortality and covid-19 deaths in italian cities The effect of lockdown on the outcomes of COVID-19 in Spain: an ecological study Prediction of the number of deaths in India due to SARS-COV-2 at 5-6 weeks. Diabetes and Metabolic Syndrome Estimation of COVID-19 prevalence in Italy Spatial modeling, risk mapping, change detection, and outbreak trend analysis of coronavirus (COVID-19) in Iran (days between february 19 and june 14, 2020) Spatial disparities in coronavirus incidence and mortality in the United States: An ecological analysis as of may 2020 Space-time airborne disease mapping applied to detect specific behaviour of varicella in Valencia Franch-Pardo I, Napoletano BM, Rosete-Verges F, Billa L. Spatial analysis and gis in the study of COVID-19. A review Población residente por fecha, sexo y edad. INE (Instituto Nacional de Estadística) ?c=Estadistica_C&cid=1254736177008&idp=1254 735573002&menu=resultados#!tabs-1254736195450. Accessed 15 Estimación del número de defunciones semanales durante el brote de COVID-19. INE (Instituto Nacional de Estadística) Bayesian image restoration, with two applications in spatial statistics On conditional and intrinsic autoregression Bayesian modelling of inseparable space-time variation in disease risk Bayesian measures of model complexity and fit Approximate bayesian inference for latent gaussian models by using integrated nested laplace approximations Bayesian Spatial Modelling with R-INLA R: A language and environment for statistical computing. R Foundation for Statistical Computing Geospatial and temporal variation of prostate cancer incidence Situación y Evolución de la pandemia de COVID-19 en España Spain 2020 Coronavirus disease 2019 and gender-related mortality in european countries: a meta-analysis COVID-19 clinical characteristics, and sex-specific risk of mortality: systematic review and meta-analysis Comparisons of all-cause mortality between European countries and regions Longitudinal analyses of the relationship between development density and the covid-19 morbidity and mortality rates: Early evidence from 1,165 metropolitan counties in the united states. Health Place National all-cause mortality during the COVID-19 pandemic: a danish registry-based study Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations Authors would like to thank the reviewers for their invaluable comments and suggestions which have been of great help to enrich and improve the quality of the manuscript. All authors contributed to the study conception and design. Data collection and statistical data analysis were performed by P. Saavedra and A. Santana. The "Introduction" and "Discussion" sections were first written by E. Sanjuán, L. Bello, and JM Pacheco. The first drafts of the "Material and methods" and "Results" sections of the manuscript were written by P. Saavedra. All authors contributed, commented, and reviewed the successive versions of the manuscript. All authors read and approved the final manuscript. Not applicable.Availability of data and materials Data for this paper are publically available at INE (National Statistics Institute of Spain), in https://www.ine.es/jaxiT3/Tabla.htm?t=10262&L=0, https://www. ine.es/dyngs/INEbase/es/operacion.htm?c=Estadistica_C&cid=12547361 77008&idp=1254735573002&menu=resultados#!tabs-1254736195450 and https://www.ine.es/experimental/defunciones/experimental_defunciones.htm. Data and code for fitting the models can also be found at the github site of one of the authors https://github.com/angeloSdP/COVID-19-outburst-Spain. Ethics approval and consent to participate Not applicable. Not applicable. The authors declare that they have no competing interests.Author details