key: cord-0765249-7a69g6la
authors: Loché Fernández-Ahúja, José María; Fernández Martínez, Juan Luis
title: Effects of climate variables on the COVID-19 outbreak in Spain
date: 2021-02-27
journal: Int J Hyg Environ Health
DOI: 10.1016/j.ijheh.2021.113723
sha: 748f581a4ee53e7f807c72c777841e78c7a13105
doc_id: 765249
cord_uid: 7a69g6la

An outbreak of the novel COVID-19 virus occurred from February 2020 onwards in almost all European countries, including Spain. This study covers the correlation found between weather variables (maximum temperature, minimum temperature, mean temperature, atmospheric pressure, daily rainfall, daily sun hours) and the coronavirus propagation in Spain. A strong relationship is found when correlating the virus spread with the mean temperature, minimum temperature, and atmospheric pressure in different Spanish provinces. In this analysis we have used the ratio of PCR COVID-19 positives to the population size. A linear regression model using the mean temperature is implemented. Moreover, an analysis of variance is used to confirm the influence of mean temperature on the spread of the virus. As a second measurement of the COVID-19 outbreak we have used the results of the antibody tests carried out in Spain, which provide an estimate of the herd immunity achieved. Based on this analysis, an estimation of the asymptomatic population is performed. All these results exhibit significant correlation with weather variables. The most affected provinces were Soria, Segovia and Ciudad Real, which are the coldest. On the opposite side, places such as Southern Spain, the Baleares, and the Canary Islands showed a lower rate of spread. This might be related to the warmer climate and the insularity of these islands. Besides, the coastal influence and the daily sun hours might also influence the lower rates in the east and west regions of Spain.
This analysis provides a deeper insight into the influence of weather variables on the COVID-19 spread in Spain.

This paper aims to understand the influence of weather variables on the COVID-19 spread in Spain. Many studies have focused only on the importance of population density in major cities with high worldwide connectivity. Nevertheless, there are other factors that could influence the transmission. For instance, in South Korea a single COVID-19 infected person, known as "patient 31", transmitted the virus to 4482 persons in the first stage of the pandemic (Shim et al., 2020). Therefore, the potential spread of COVID-19 might not be related only to the number of infected people, and other factors might have a great influence on the strength of the transmission. Several weather variables have been mentioned among the potential factors that might have played a crucial role in the spread of this virus. One of the first studies regarding the effect of temperature showed that the COVID-19 virus survived better with mean temperatures in the range between −10 and 15 °C (Bannister-Tyrrell, M et al., 2020). Moreover, a robust negative association between COVID-19 transmission and both high temperature and high humidity was found for Chinese and US cities (Wang et al., 2020). Following these statements, neither extreme cold nor hot conditions would favour the virus transmission. Regarding humidity, for Iranian cities it seems to have a negative correlation with COVID-19 propagation (Ahmadi, M et al., 2020). However, the authors of that study state that the correlations are not significant enough. Another study, focused on the main Turkish cities, concluded that the climate factors that best correlate with COVID-19 spread are the mean temperature on the day the positives were measured and the wind speed 14 days before the positive detection (Şahin, M., 2020). Nevertheless, this result should be interpreted with caution.
Another study, focused on Brazil, pointed out that high mean temperatures (around 27.5 °C) favoured COVID-19 propagation (Auler, A.C. et al., 2020). The same authors pointed out that other demographic and socio-economic factors, such as health conditions, might be crucial. Besides, Brazil is located in the tropical zone, with temperatures that are approximately steady throughout the year. In addition, the distances between large metropolitan areas are considerable. Another study concluded that mean and minimum temperatures correlated well with the COVID-19 pandemic in New York (Farhan Bashir et al., 2020). Since the most affected cities in Spain are not heavily air-polluted, the role of air quality has not been a focus of this study. In a highly populated country such as India, air temperature was reported as a non-significant factor for the virus spread, and the lockdowns were said to have no correlation with the reduction of the virus spread (Gupta, A et al., 2020). However, the virus spread was reduced by 45% in Italy during the lockdown (Gatto, M et al., 2020).

This paper analyses the influence of the main meteorological variables on the COVID-19 spread, treating these variables as spatial-temporal processes and analyzing their correlations through time and space in the Iberian Peninsula. Although the methodology that we used is well known in applied mathematics and statistics, it serves to clarify and shed light on the importance of meteorological variables in the spread of the pandemic in all the Spanish provinces.

Climate variable values were obtained from the OpenData platform of the Spanish Agency of Meteorology (AEMET OpenData). Appendix I shows the list of meteorological stations used. The range of dates covers from 31st January to 20th June. The percentiles for each climate variable were computed for the meteorological station representative of each Spanish province. COVID-19 positives data were collected from the Carlos III Institute website (Carlos III, 2020).
Spain had notified 237 096 COVID-19 cases and had conducted 2 536 234 COVID-19 PCR tests by May 28, 2020. At the start of the pandemic, Spain had notified 2968 COVID-19 cases by March 12, 2020. The distribution of PCR positives is not uniform across the country. Finally, the population data are obtained from the Spanish National Institute of Statistics (INE, 2020).

Data about wind speed are available, although wind direction data are not. From our point of view, wind direction data are as important as wind speed data. Hence, since both are not available, we decided that using only one of them is not the right approach. Another reason is that wearing a mask outside is compulsory. Therefore, most of the virus transmission happens in indoor scenarios, in which wind speed is not as important. We also think that in an urban scenario, the wind speed and direction can be modified by buildings or factories.

Fig. 1 shows the Cumulative Distribution Function (CDF) of these climate variables. Table 1 shows the percentiles of these variables. Appendix II also shows the descriptive analysis of the climate variables for all the Spanish provinces.

The cumulative number of COVID-19 tests carried out by each Autonomous Community is the total number of PCR tests done, positive or negative. It is expressed as a "density of tests" (Testd), defined as follows:

\[ \mathrm{Testd} = \frac{\text{cumulative number of PCR tests in the Autonomous Community}}{\text{population of the Autonomous Community}} \]

The cumulative number of COVID-19 positives by province is taken from the data provided by the Spanish Carlos III Institute (Carlos III, 2020), which is responsible for gathering and curating these data. To estimate a "density of positives", the population data from each province is needed. The density of positives by province (PCRd) is defined as follows:

\[ \mathrm{PCRd} = \frac{\text{cumulative number of PCR positives in the province}}{\text{population of the province}} \]

which gives the ratio of the PCR positives to the total population. To estimate a "daily density of positives", the population data from each province is needed.
The daily density of positives by province (Daily PCRd) is defined as follows:

\[ \mathrm{Daily\ PCRd}(t) = \frac{\text{number of PCR positives in the province on day } t}{\text{population of the province}} \]

which gives the ratio of the number of PCR positives in a single day to the total population. To estimate a "daily increment of the density of positives", the population data from each province and the daily density of positives by province (Daily PCRd) for two consecutive days are required. The daily increment of the density of positives by province (ΔDaily PCRd) is defined as follows:

\[ \Delta\mathrm{Daily\ PCRd}(t) = \mathrm{Daily\ PCRd}(t) - \mathrm{Daily\ PCRd}(t-1) \]

which gives the evolution of the daily density of PCR positives.

The Spanish Ministry of Health carried out a prevalence study of COVID-19 IgG antibodies in all the provinces and the autonomous cities (Ceuta and Melilla). The result is an estimate of the percentage of inhabitants of each province who could have been infected. Three rounds were done to obtain the percentage of the population of each province who had been infected (IgGd [%]). Theoretically, the IgGd is computed following equation (2). The IgGd [%] values are given in Appendix III.

By using the data from the COVID-19 density of positives and the results from the IgG antibodies study, an estimate of the percentage of asymptomatic population can be computed. Two assumptions are made. First, it is assumed that all people who were in serious or critical condition were tested with a PCR. Second, it is assumed that all infected people have generated IgG antibodies after recovering from the virus. Therefore, the percentage of asymptomatic population (APr) is the percentage of population with antibodies minus the percentage of positives:

\[ \mathrm{APr}\,[\%] = \mathrm{IgGd}\,[\%] - \mathrm{PCRd}\,[\%] \]

These metrics (density of tests, density of positives, increment of the density of positives, density of antibodies and percentage of asymptomatic population) provide different insights into the impact of COVID-19 in each Spanish province. The aim is to avoid any type of bias due to the number of inhabitants. To reduce the impact of noise in the data, we have computed for each day the five-day moving average of each of these variables.
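The metrics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the moving average is taken as a trailing window, since the text does not say whether the five-day window is trailing or centred.

```python
from typing import List


def density(count: float, population: float) -> float:
    """Ratio of a count (tests or positives) to the population."""
    return count / population


def daily_increment(daily_pcrd: List[float]) -> List[float]:
    """Difference of the daily density between consecutive days."""
    return [today - yesterday
            for yesterday, today in zip(daily_pcrd, daily_pcrd[1:])]


def moving_average(series: List[float], window: int = 5) -> List[float]:
    """Trailing moving average (assumption: trailing, not centred).

    The first window - 1 days have no full window and are dropped.
    """
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def asymptomatic_percentage(iggd_pct: float, pcrd_pct: float) -> float:
    """APr [%] = IgGd [%] - PCRd [%]."""
    return iggd_pct - pcrd_pct


# Hypothetical province with 100 000 inhabitants and made-up daily positives.
population = 100_000
daily_positives = [2, 4, 9, 15, 20, 18, 12]
daily_pcrd = [density(p, population) for p in daily_positives]
smoothed = moving_average(daily_pcrd, window=5)
increments = daily_increment(daily_pcrd)
```

Dividing by the population before smoothing keeps provinces of very different sizes on the same scale, which is the stated purpose of these densities.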
Therefore, the correlation is performed between smoothed versions of these temporal series. This is particularly important for PCRd, since this kind of information is sometimes delayed by the public health authorities. Then, the correlation between both temporal series is computed via Pearson's and Spearman's correlation coefficients. That way it is possible to produce regionalized variables using these coefficients for all the provinces.

This section shows the results obtained in this study. First, the correlation between the COVID-19 density of positives PCRd and the climate variables is found, in subsection 3.1. Second, a linear regression model using the mean temperature as the independent variable and the COVID-19 density of positives PCRd as the dependent variable is presented, in subsection 3.2. Third, an Analysis of Variance (ANOVA) between the mean temperature and the daily density of new COVID-19 positives, Daily PCRd, is computed in subsection 3.3. Fourth, the results from the massive antibodies study IgGd carried out in Spain are correlated with the climate variables in subsection 3.4. An estimate of the percentage of asymptomatic population APr [%] is computed, and the correlation of this result with climate variables is also tested, in subsection 3.5. Last, a comparison between PCRd and Testd is shown for each Autonomous Community in subsection 3.6.

These variables are considered spatial-temporal processes $X(t, x)$, where $x$ stands for the spatial (or geographic) coordinates that depend on each province and $t$ for the temporal dependency. Fixing a time $t_0$, $X(t_0, x)$ is a regionalized variable. In this case $X(t_0, x)$ is a discrete sample of dimension 52, which is the total number of provinces and autonomous cities in Spain. On the other hand, fixing $x$ provides a temporal series $X(t, x_0)$ that gives the temporal evolution of $X(t)$ in the province $x_0$. Both aspects (spatial and temporal) are important in understanding the COVID-19 spread.
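The two views of a spatial-temporal process can be illustrated with a small sketch. The data layout (a dictionary from province name to a daily series), the function names and all values are assumptions made for illustration, not the authors' data structures.

```python
from typing import Dict, List

# Hypothetical toy values of a daily metric X(t, x): province -> time series.
X: Dict[str, List[float]] = {
    "Soria": [0.0, 1.0, 3.0],
    "Madrid": [0.0, 2.0, 5.0],
    "Las Palmas": [0.0, 0.5, 1.0],
}


def regionalized_variable(data: Dict[str, List[float]], t0: int) -> Dict[str, float]:
    """Fix the day t0: one value per province, X(t0, x)."""
    return {province: series[t0] for province, series in data.items()}


def temporal_series(data: Dict[str, List[float]], x0: str) -> List[float]:
    """Fix the province x0: the evolution over time, X(t, x0)."""
    return list(data[x0])


snapshot = regionalized_variable(X, t0=1)   # spatial sample on day 1
evolution = temporal_series(X, "Madrid")    # time series for one province
```

With the real data, the spatial sample would have 52 entries, one per province or autonomous city.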
Summarizing, we have concluded that the mean and minimum temperatures are correlated with the density of PCR positives for Spanish provinces. Besides, the maximum temperature has a lower correlation than the minimum temperature. Therefore, the areas where the pandemic has been stronger are the cold regions of the Spanish "Meseta" (plateau), which are characterized by a continental climate. This fact has also been observed in New York (Farhan Bashir et al., 2020). Nevertheless, in the New York study there was no evidence that warm weather could suppress COVID-19 transmission. This paper shares this conclusion: climate plays a very important role in COVID-19 transmission, mainly in cold areas, reducing or amplifying its spread, but it cannot eliminate the outbreak, as has been observed lately in the second wave, when mobility restrictions were abandoned. Besides, we have also observed that atmospheric pressure has a negative correlation with the density of COVID-19 positives in all the Spanish provinces. Low atmospheric pressure is a quite significant factor and could be one of the factors most related to the virus transmission. In Spain, low pressures are typical of spring and autumn due to the Azorean anticyclone moving intermittently south. In winter, high pressures predominate because of marine polar anticyclones, the Scandinavian anticyclone and its connection with that of the Azores. Finally, in summer the weather is controlled by the Azorean anticyclone again, resulting in warm, dry and stable weather with low surface pressures that reverse with altitude and generate summer storms. Therefore, the climate in Spain is very different from the rest of Europe, and is dominated by low pressure throughout the year, which would favour the spread of the COVID-19 virus.
We have also analyzed the correlation between PCR positives and the antibody tests, showing a very important correlation between the PCR tests, which originally were only performed on people with hospital symptoms, and the number of people with antibodies, which also takes into account those who are asymptomatic. Basically, the count is simple: the number of infected is approximately six times the number of PCR positive tests. This would partly explain the higher mortality rates observed in Spain relative to the PCR tests, compared with other countries that initially made more massive use of rapid tests. This situation would also explain the higher number of positives observed in the second wave because of the greater mobility and increased testing of the population. In conclusion, the climate variables can have a great influence on the propagation of the COVID-19 virus and could be taken into account in public health decision making to define the areas with a higher potential risk of COVID-19 transmission.

A correlation analysis was performed between the different climate variables and the density of COVID-19 positives (PCRd). Two correlation coefficients were used: Pearson's and Spearman's correlation coefficients. Pearson's coefficient measures the degree of linear correlation between variables, whereas Spearman's coefficient is a measure of the rank correlation. It serves to assess how well the relationship between both temporal series can be described using a monotonic function (linear or not). For two series $X_1$, $X_2$ of size $n$, both coefficients are defined as follows:

\[ \rho_P(X_1, X_2) = \frac{\operatorname{cov}(X_1, X_2)}{\sigma_{X_1}\,\sigma_{X_2}}, \qquad \rho_S(X_1, X_2) = \rho_P\big(r(X_1), r(X_2)\big) \]

The Pearson coefficient $\rho_P$ is related to the slope of the regression line between these variables $(X_1, X_2)$. Spearman's correlation coefficient is the Pearson coefficient between the corresponding rank variables. Ranking is the data transformation in which the numerical values of $X_1$ and $X_2$ are replaced by their ranks $(r(X_1), r(X_2))$ when the data are sorted.
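Both coefficients can be sketched in plain Python as follows. The function names are illustrative, and the use of average ranks for tied values is an assumption, since the text does not say how ties are handled.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson coefficient: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson coefficient of the rank variables."""
    return pearson(ranks(x), ranks(y))


# Made-up example: a monotonic but non-linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
rho_p = pearson(x, y)
rho_s = spearman(x, y)
```

In this toy example the Spearman coefficient reaches 1 because the relationship is perfectly monotonic, while the Pearson coefficient stays below 1 because it is not linear.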
• Modelling as regionalized variables: spatial variability

Table 2 shows the relationship between the climate variables and the density of COVID-19 positives on 1st June, when most of the pandemic had already taken place. The correlations with the different temperatures, the atmospheric pressure and the daily sun hours are negative. The correlations with mean temperature, minimum temperature and atmospheric pressure are the most important. The correlation with the daily sun hours is very low. The analysis shows no correlation between the density of COVID-19 positives and daily rainfall. Daily sun hours and daily rainfall seem to be non-significant factors in the COVID-19 spread. Fig. 2 shows the evolution of the correlation of the PCRd values with the climate variables, starting from 12th March. In this case, for each date, the correlation between the corresponding spatial variables is computed. The minimum temperature (Fig. 2b) has a stronger correlation with the density of COVID-19 positives PCRd than the maximum temperature (Fig. 2a). Mean temperature (Fig. 2c) is the temperature that best correlates with PCRd. The correlation is always negative and decreases with time from -0.66 to -0.78. Fig. 2d shows the correlation with the mean range atmospheric pressure, while Fig. 2e and f show the evolution for the daily sun hours and rainfall. The conclusions are similar to those previously discussed for 1st June, that is, the analysis is consistent throughout the outbreak. In particular, in the case of the atmospheric pressure, both correlation coefficients show similar and very close trajectories.

• Modelling as temporal series

The virus propagation is also a temporal process $X(t, x_0)$ when the spatial variable (province) is fixed. To give a detailed view of the geography of Spain, Fig. 3 shows a map created using QGIS and Natural Earth Data. The map features latitude and longitude coordinates, a scale bar, and a north arrow.
Then, the correlation between the climate variables and the density of daily positives can be computed for each province. For that purpose, the right time span to analyze the outbreak must first be selected. The analysis was performed from 11th March to 20th May, when most of the infections occurred. Fig. 4 shows the absolute value of the correlation between PCRd and the maximum, minimum and mean temperatures, using maps created with the online tool Mapchart (Mapchart). The daily increment of the density of positives has a negative correlation with the maximum, minimum and mean temperatures for all provinces. Pearson's and Spearman's correlation coefficients show the same tendency, with a few changes in the value of the correlation in some cases. Catalonia, La Rioja, two provinces of Comunidad Valenciana (Valencia and Castellón) and in some cases Madrid are strongly negatively correlated, showing an absolute correlation coefficient close to 0.6. Therefore, in the places that show strongly correlated results it is easier to predict the daily number of cases based on their mean temperature.

The same analysis was performed to study the correlation with the temporal evolution of the atmospheric pressure. The atmospheric pressure depends strongly on the terrain elevation. Given that the terrain elevation is constant over time, the results of the temporal analysis for all provinces are close to 0, some of them positive and others negative. In other words, atmospheric pressure does not influence the daily evolution of the pandemic in a province but can affect the differences in the number of positives observed among different provinces.

The main reason for measuring the virus outbreak with the daily increment of positives rather than the daily number of positives is that the former carries more information. The number of positives only shows the virus impact during a day, but the daily increment also shows whether the virus spread is increasing or decreasing.
Mean temperature is the variable most correlated with the density of positives, according to Spearman's correlation coefficient (Table 2). Thus, a linear regression model using the density of positives metric PCRd as the dependent variable can be constructed:

\[ \mathrm{PCRd} = a\,T_{\mathrm{mean}} + b \]

where $a$ is the slope and $b$ the intercept. Fig. 5 shows the correlation between the COVID-19 density of positives metric PCRd and the mean temperature $T_{\mathrm{mean}}$ with a time step of two weeks. In this plot the atmospheric pressure variable is only used to colour the points, not for the linear regression. The correlation is plotted for March 15th, April 1st, April 15th, May 1st, and May 15th. The negative correlation between the density of positives and the mean temperature is found in all the plots. Table 3 shows the linear equation and the RMSE for each regression. Hence, a coefficient that enables us to establish a relationship between temperature and the COVID-19 spread was found. The slope is always negative and is closer to zero at the beginning of the outbreak. From April 15th, the slope value stabilizes between -0.0012 and -0.0014, that is, a decrease of between 1.2 and 1.4 new PCR positives per 1000 inhabitants for every degree of increase in the average temperature. The RMSE is also very low in all the cases.

ANOVA serves to demonstrate that the mean temperature is correlated with the COVID-19 propagation. The virus expands in a similar way in two different cities, provided the daily mean temperature values are the same. As the reader can expect, there is always an error margin. In this analysis, we have taken all the daily data of mean temperatures and density of COVID-19 positives between 12th March and 20th June. To reduce the measurement error, a 10-day moving average is computed for both the mean temperature and the daily density of new COVID-19 positives. Fig. 6 shows a plot of the daily density of positives against the moving average of the mean temperature.
To produce this plot, the mean temperature varies from 11.74 °C (P10) to 18.22 °C (P90). P10 and P90 were taken from Table 1. Within this range, with a step of 0.1 °C, the Daily PCRd was found for all the days between 12th March and 20th June. For each temperature we scan the days where this temperature occurred and take the mean of the corresponding Daily PCRd values. Finally, these values are averaged over all provinces. Before doing these calculations, a 10-day moving average of the daily mean temperatures and PCRd is computed to smooth the observed data and bring out the dynamics of the virus propagation. This figure shows a negative correlation between mean temperature and COVID-19 density of positives (independently of the provinces). The fitted linear regression implies a decrease of around 8.3 cases per million for every degree of increase in the mean temperature. For example, a Daily PCRd of 0.00005 ($5 \times 10^{-5}$) means 50 new daily positive cases per million inhabitants. Therefore, a decrease of 10 °C in the mean temperature during the winter implies around 83 additional new daily positive cases per million inhabitants. This might be one of the factors behind the increase of cases in January 2021 during the impact of the Filomena storm in the Spanish peninsula. Obviously, mobility and social networks are also very important in the dynamics of the COVID-19 spread.

For the ANOVA analysis, the null hypothesis $H_0$ is that the mean temperature has no effect on the COVID-19 propagation. The result of the ANOVA test is 11.60, with 51 (number of provinces − 1) and 1661 (number of data − number of provinces) as degrees of freedom. The zero values in the data (temperatures for which no COVID-19 positives were notified) are removed. The critical value is 1.32-1.39 for a significance level of 0.05. Therefore, the null hypothesis $H_0$ is rejected and the conclusion is that the mean temperature has a relationship with the COVID-19 propagation.
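The F statistic of a standard one-way ANOVA, with the same form of degrees of freedom as quoted above (number of groups − 1, and number of data − number of groups), can be sketched as follows. The function name and the toy groups are hypothetical; the text does not detail how the observations were grouped, so this is not a reproduction of the reported value of 11.60.

```python
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up toy data: three groups of three observations each.
toy_groups: List[List[float]] = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(toy_groups)
```

The null hypothesis is rejected when the F statistic exceeds the critical value of the F distribution for the chosen significance level and these degrees of freedom, which can be read from an F table.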
\[ F(51, 1661) = 11.60 > 1.39 \qquad (11) \]

The percentage of population with antibodies, IgGd [%], has also been correlated with the climate variables. The first round was carried out from 27th April to 11th May, the second round from 18th May to 1st June, and the third round from 8th June to 22nd June. Correlations with the climate variables are shown for each round: First Round (Table 4), Second Round (Table 5), Third Round (Table 6). As could be expected, the correlation coefficients are quite similar to those obtained for the COVID-19 density of positives metric PCRd. The third-round antibodies results IgGd [%] correlate better with the climate variables than those of the second and first rounds. Mean and minimum temperatures are more significant factors than the maximum temperature. Daily sun hours and rainfall do not seem to be very significant factors, showing almost no correlation. Atmospheric pressure is the climate variable that correlates with IgGd [%] showing a bigger difference with respect to the temperature. These differences were smaller for the PCRd correlation analysis. Atmospheric pressure seems to have more impact on the COVID-19 transmission, whereas low temperatures relate more to the new cases detected by PCR, which are more related to the severity of the infection, since PCR detection does not include the asymptomatic. Nevertheless, the impact of daily sun hours and rainfall must be implicit in the mean temperature (more daily sun hours, higher temperature) and in the atmospheric pressure (storms in Spain are usually associated with the entry of low pressure fronts from the North and West of the Spanish peninsula).

Correlation between the temporal series for each province and the maximum, minimum and mean temperatures.

Therefore, the virus could propagate throughout the whole year, more in winter than in summer, and the infections in summer would be milder than in winter.
These results suggest that on clear-sky days there would be a lower possibility of an increase in positive cases; furthermore, a rainy winter could be very favourable weather for the spread of this virus in Spain.

The linear regression between PCRd and IgGd indicates that the percentage of herd immunity is between 5 and 6 times PCRd. Therefore, an estimate of the asymptomatic people can be computed based on the known number of PCR positives. To produce this result, we have used the IgG data from the third round. Fig. 7 shows a plot of both data and the linear regression. The percentage of asymptomatic people APr [%] is the difference between the percentage of population with antibodies IgGd [%] and the percentage of positives for each province PCRd [%]. Therefore, we have:

\[ \mathrm{APr}\,[\%] = \mathrm{IgGd}\,[\%] - \mathrm{PCRd}\,[\%] \]

According to this result, for each positive case there are more than 4 people who have stayed in mild condition or even asymptomatic. This result does not depend on the population. These data are quite important to evaluate possible approaches that could be adopted in case of a second wave next autumn. They also help to estimate the spread of the COVID-19 virus in a territory. Table 7 shows the correlation between the climate variables and this estimate of the percentage of asymptomatic population, based on the PCRd [%] from 3rd June, which was used to find the APr [%] for each province. Spearman's correlation coefficient shows that the mean temperature is the variable most strongly correlated with the APr [%], followed by the minimum temperature and the atmospheric pressure. This result was also expected based on the previous analysis and the linear relationships among variables.

We also show the relationship between PCRd and Testd, which is the quotient between the number of PCR tests carried out and the population of each Autonomous Community. Appendix IV shows the number of PCR tests carried out in each Spanish Autonomous Community from the start of the pandemic until May 28th.
These data were released by the Spanish Ministry of Health. These data were not provided for each province individually. Fig. 8 shows the correlation between PCRd and Testd. Pearson's

Climate variables provide an explanation for the COVID-19 propagation. Spain is a suitable country for this analysis since it exhibits four different types of climate: oceanic, Mediterranean, sub-tropical and continental. Temperature is an important factor to explain the COVID-19 propagation behaviour. The mean temperature is the one that best correlates with the density of positives PCRd. The correlation between the mean temperature and the density of COVID-19 positives PCRd increased from March to April, and then stabilized. Moreover, the minimum temperature correlates better than the maximum temperature with the density of positives PCRd. In colder places, the virus propagation can be extremely fast. Atmospheric pressure has a negative correlation with the density of COVID-19 positives PCRd in Spanish provinces. Atmospheric pressure is a quite significant factor and could be one of the factors most related to the virus transmission. Daily sun hours and rainfall seem to be non-significant factors, although their contribution might be implicit in the mean temperature and also in the atmospheric pressure. Both variables are considered not important to explain the COVID-19 spread behaviour. The ANOVA test has also confirmed the existing relationship between the mean temperature and the COVID-19 propagation. We also provide a plot of the expected Daily PCR density depending on the mean temperature. According to our results there would be a lower possibility of an increase in positive cases under good weather conditions, whilst cold weather is more favourable for the COVID-19 spread. Besides, the mild coastal weather conditions show an important influence in the Eastern and Western parts of Spain.
Finally, we have estimated the asymptomatic population based on the results of the antibodies analysis, showing that for each PCR positive case there are more than 4 people who have stayed in mild condition or even asymptomatic. Besides, we have shown a positive correlation between the density of PCR positives and the number of tests that have been performed in each Autonomous Community in Spain. In conclusion, climate variables explain some important aspects of the COVID-19 transmission and its strength, but cannot be used to justify the disappearance of the outbreak without taking into consideration epidemiological measures of social distancing.

Relationship between the positives PCRd and tests Testd

Site for the distribution of Meteorological Spanish data
Climate data of Spain during the COVID-19 lockdown
Investigation of effective climatology parameters on COVID-19 outbreak in Iran
Evidence that high temperatures and intermediate relative humidity might favor the spread of COVID-19 in tropical climate: a case study for the most affected Brazilian cities
Preliminary evidence that higher temperatures are associated with lower incidence of COVID-19, for cases reported globally up to 29th
Institute COVID-19. Daily update data of the COVID-19 pandemic in Spain
Correlation between climate indicators and COVID-19 pandemic in New York
Spread and dynamics of the COVID-19 epidemic in Italy: effects of emergency containment measures
Assessment of temporal trend of COVID-19 outbreak in India
Significance of geographical factors to the COVID-19 outbreak in India. Modelling Earth Systems and Environment 1-9
Estimating the impact of daily weather on the temporal pattern of COVID-19 outbreak in India. Earth Systems and Environment 1-12
Data of the population of the provinces of Spain
Website that allows the user to create a custom map
Impact of weather on COVID-19 pandemic in Turkey
Transmission potential and severity of COVID-19 in South Korea
High temperature and high humidity reduce the transmission of COVID-19

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ijheh.2021.113723.