key: cord-0860245-3yzxljjf authors: Awasthi, R.; Nagori, A.; Singh, P.; Pal, R.; Joshi, V.; Sethi, T. title: Temperature and Humidity Do Not Influence Global COVID-19 Incidence as Inferred from Causal Models date: 2020-06-29 journal: nan DOI: 10.1101/2020.06.29.20142307 sha: 0fc7f411cc86fb0f1dad5d54358f7bc8997bea37 doc_id: 860245 cord_uid: 3yzxljjf The relationship between meteorological factors such as temperature and humidity with COVID-19 incidence is still unclear after 6 months of the beginning of the pandemic. Some literature confirms the association of temperature with disease transmission while some oppose the same. This work intends to determine whether there is a causal association between temperature, humidity and Covid-19 cases. Three different causal models were used to capture stochastic, chaotic and symbolic natured time-series data and to provide a robust & unbiased analysis by constructing networks of causal relationships between the variables. Granger-Causality method, Transfer Entropy method & Convergent Cross-Mapping (CCM) was done on data from regions with different temperatures and cases greater than 50,000 as of 13th May 2020. From the Granger-Causality test we found that in only Canada, the United Kingdom, temperature and daily new infections are causally linked. The same results were obtained from Convergent Cross Mapping for India. Again using Granger-Causality test, we found that in Russia only, relative humidity is causally linked to daily new cases. Thus, a Generalized Additive Model with a smoothing spline function was fitted for these countries to understand the directionality. Using the combined results of the said models, we were able to conclude that there is no evidence of a causal association between temperature, humidity and Covid-19 cases. Ever since the outbreak of coronavirus at the global stage and WHO declaring it as pandemic on 11th February 2020, multiple speculations started surfacing for controlling this outbreak. Prediction for active cases turned out to be the most highlighted research, as flattening the curve was a primary goal everyone was aiming for. Various Mathematical and Artificial Intelligence based models [6, 7, 8, 9] were used for this task to provide government, health advisories and policy makers substantial information that would help them make decisions. No study has been done so far that incorporates the effect of environmental factors in these models. These factors might not only help us improve the general fit of the predictive model but also provide us with a deeper insight into ways of curbing the spread. Despite the above, there is no conclusive evidence in the past literature which states whether temperature or humidity play a role in affecting the infection rate. The research community had diverse opinions on these two factors. In this study we tried to address this issue and examined the causal association of temperature and humidity with the spread of COVID-19 pandemic. Jingui et. al stated that coronavirus is positively affected with temperature [1] . The Generalized Additive Model (GAM) was used to show association between temperature and rate of infection in this paper [2] . Another significant study was done to portray the effects of temperature and humidity in death rates for COVID patients in Wuhan, China stating that humidity was negatively correlated with COVID [2] . Even in this study the GAM model was used to depict the association of humidity with the infection rate. Ye Yao et. al stated in his research that there is no significant association of COVID-19 with temperature and UV radiations in Chinese cities; exploring the association through various regression methods [3] . Investigations done by Jingyuan et. al suggested that high temperature and high humidity reduces the transmission of COVID-19 based on cross sectional regression analysis. [4] . Extending ideas from the above, our study investigates various causal effects of temperature and humidity on active COVID-19 cases for selected countries around the world. Causal Inference can state how one factor can transmit its effect on other factors, and also make connection of an effect with its cause. We have used three different causality tests for finding these critical insights and state conclusively whether temperature and humidity have an effect in the rate of infections or not. The environmental data was downloaded from https://power.larc.nasa.gov/data-access-viewer/ and the Covid-19 data was taken from https://github.com/CSSEGISandData/COVID-19. Using the same, we extracted the relative humidity and average temperature in 2M and country wise count of daily new cases for our analysis. Location Identification for the Analysis: From every 7 regions defined by WHO we selected only those countries which have more than 50,000 cases till 13th May and show considerable variations in their average temperature. In order to get robust and unbiased results we performed multiple causal tests on our data. Granger causality, Transfer Entropy (suitable for stochastic systems) and Convergent Cross-Mapping (suitable for Dynamical systems) were used for calculating the causal relationships. Further, for countries where we had a significant amount of causal relationship we used Generative Additive Model for calculating the directionality of the same. The first test of causal association was performed using the Autoregressive based Granger-Causality method. The Granger-Causality method is used to find if one time-series can be the cause for another or vice-versa. in-order to optimize lag p (i.e. past p values) of the time series we fit an AR (p) (where p varies from 1 to 15) model for every time series and then identify lag with minimum BIC (Bayesian information criteria) score) [6] . The Granger-Causality method is mainly useful for analyzing numerical time series data produced by processes having patterns which are analyzed statistically. But a key limitation of this technique is that it is restricted to stationary time-series data. We performed our analysis with R Package lmtest [13] . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 29, 2020. . Transfer-Entropy is used to measure the amount of information that gets transferred from one variable to another. Transfer-Entropy helps us formalize the amount of uncertainty reduced in the future values of a time series by knowing the past values of another. In this analysis, we calculated the Shannon entropy [17] for this task by assuming the time-series as a sequence of discrete symbols and then running the model for 300 bootstraps. Significant causality between variables based on p values (<0.05) was assumed and then the lag optimization was performed same as mentioned above. R software RtransferEntropy was used for the analysis [10]. The Transfer-Entropy and the Granger-Causality techniques outlined above are suitable only for the systems having probability distribution. But one may also encounter numerical data that may have originated from a Deterministic Dynamical System. Coupled differential equations can be used to model such systems. However, there are many cases where only a few variables which are relevant to the system are available for the researchers. To tackle this, Taken's Theorem [11] known as Phase-State Reconstruction was used, which states that the properties of dynamical system's attractor can be recovered using a single one of its variables. Using this theorem, Sugihara et al. (2012) developed the new mapping technique named Convergent-Cross Mapping (CCM) [12] , this enabled the recovery of direction of causality between any two time sequences that are generated by the same deterministic dynamical system. To identify causal association, we embedded our time series in 3 dimensions series. And from the embedded time series we created a library and randomly sampled 100 samples from each library. Pearson correlation is used to score the cross-mapping performance, where the correlation was assumed to be greater than 0.80 for significant causal association. rEDM R package used to perform this analysis [14] . We finally plotted and estimated the effect size of temperature and humidity on the countries where the variables were found to be causally related. A smoothing spline function is fitted between lagged weather conditions and log scale daily new cases. The spline curves thus obtained were plotted. Further an adjusted model with previous day's counts data and temperature and relative humidity was built for only countries that have causal relationships as calculated from the above methods. First using BIC score we calculated the minimum lag of Temperature , Humidity and New Cases time series (Supplementary figure 1) we obtained that in the lag range of 1-15 time series have almost similar BIC Score, so we picked the lag value 1 for our analysis . Then to check stationarity in time series we performed the Augmented Dickey Fuller test (ADF Test) and we found Temperature, Humidity and new cases time series as stationary in every country (Supplementary Table 1 ). . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 29, 2020. . We performed an F-test to verify the Null Hypothesis i.e. Temperature and Humidity Granger Causes the new infected cases of covid-19 infected cases. For Canada and the United Kingdom temperature was found Granger causal with daily new cases with P-values 0.002 (<0.005) and 0.003(<0.005) respectively and humidity was found Granger causal with daily new cases in Russia with P-values (<0.005). For other Countries we have not found a significant P-value to explain that temperature and humidity Granger Causes daily infection spread. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 29, 2020. . https://doi.org/10.1101/2020.06.29.20142307 doi: medRxiv preprint Transfer entropy: We interpreted our results using Shannon's Entropy to depict how temperature and humidity are causally associated with daily count of cases. We found that in each country the value of p is greater than 0.05 which ensures that temperature does not have any causal effect on infection. Similarly in case of humidity the same trend can be seen. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 29, 2020. . https://doi.org/10.1101/2020.06.29.20142307 doi: medRxiv preprint We embed the time series in 3 dimensions and randomly sampled 100 samples from each embedded time series and calculated the cross mapping. To interpret the results, we aggregated the cross map performance at different library sizes (varies from 10-80, with the difference of 10), which computes a mean value at each unique. Since average cross map skill found to be less than 0, meaning prediction skills are next to nothing. (Observations should not be anticorrelated with predictions), so we set negative values to 0 when plotting. Among all countries only for India we found the causal association of temperature and daily new cases and humidity was found not causally associated for any states. . is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 29, 2020. . From the causality model, it's found that three countries Canada, India and the United Kingdom have causal associations between average temperature and covid19 new cases. And Russia has the causal associations between relative humidity and covid19 new cases. To reassert these relationships we looked at generalized additive models based spline fitting. For Canada, India and the UK, new cases were found to be increasing with rising temperature. For Russia, Initially new cases were found to be decreasing with increasing relative humidity but later the trend reversed i.e. increasing relative humidity resulted into increasing new cases. Infection of coronavirus is exponentially expanding all around the world, which is a threat for human life. India being a populous and dense country faces a higher risk of spread. However policy interventions such as lockdown at the very beginning kept the spread linear in India. While medical facilities, policies interventions have been implemented, it has been a general question since the very beginning that can temperature and humidity be of any help in reducing the spread. This has been observed in other pandemics such as Influenza spread in 2009 [15] that, with increase in the temperature and humidity the spread reduced. Also for SARS-COV1 spread it has been reported that dry and cold air can lead to the spread. For Sars-cov2 as well, many studies have happened which assert the association between weather conditions and the spread of the virus. However, in our knowledge, none of these efforts involve study of the causal effect of weather on the spread, mere associations can mislead the policy makers and can cost millions of lives. In this paper, we performed causal study between weather conditions with temperature and relative humidity with daily count of new infection. To involve large variability of the weather conditions, we have involved different geographical and political locations of the countries as shown in figure1. We employed a comprehensive statistical analysis to estimate the causal relationship between the time-series. It turned out the majority of the countries does not hold the significant causal relationships as computed using Granger Causality, transfer entropy and crossconvergent mapping, however a few were found to have causal relationships between weather conditions and new cases of covid19. Analysis was carried out separately for each of these in order to account for latent variability hidden within geographical and political diversity. Finally, we fitted a smoothing spline function using generalized additive models using Poisson regression to see direction of relationships. In general, most of the countries were found to have no-causal relationships. We looked at the different Geographies according to the division done by WHO [19] for different counties, we found that Canada, India and the United Kingdom were having an increased number of cases with increase in the temperature. The directionality found here is opposite to what is reported in literature for Influenza. Therefore, it would not be recommendable to take policy decisions on the basis of temperature and relative humidity as of now. Our study has few limitations, it's difficult to compute the exposure when it comes to weather conditions as many people are inside their houses due to lockdown and would avoid outdoor exposure. We also think that we have a very short period of observational data and with more data, findings may fluctuate to other sides. Overall we tried to put forth an exhaustive search for causal relationships between weather conditions and new cases of covid19 based on the available literature on causal time-series analysis and these findings asserts that we didn't have causal association between temperature and humidity with new cases of covid19. Association between ambient temperature and COVID-19 infection in 122 cities from China Effects of temperature variation and humidity on the death of COVID-19 in Wuhan, China No Association of COVID-19 transmission with temperature or UV radiation in Chinese cities High temperature and high humidity reduce the transmission of COVID-19 Wiener-Granger causality: a well established methodology Comparing dynamic causal models using AIC, BIC and free energy Epidemiological Characteristics of COVID-19 Ongoing Epidemic in Iraq Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy The Takens embedding theorem Spatial convergent cross mapping to detect causal relationships from short time series Package 'lmtest' Wiener-Granger causality: a well established methodology Divergence measures based on the Shannon entropy Symbolic transfer entropy WHO regions This work was partially supported by the Wellcome Trust/DBT India Alliance Fellowship IA/CPHE/14/1/501504 awarded to Dr. Tavpritesh Sethi and the Center for Artificial Intelligence at IIIT-Delhi. Tavpritesh Sethi and Pradeep Singh also acknowledge the support received from the Department of Science and Technology vide project DST/INT/ISR/P-21/2017. Authors also acknowledge Prof. Rakesh Lodha from All India Institute of Medical Sciences, New Delhi, for his valuable inputs.We also thank CSIR for supporting Mr. Aditya Nagori and UGC to support Mr. Raghav Awasthi with PhD fellowships. Finally below