key: cord-0688700-xq2k3ygc authors: Iloanusi, Ogechukwu N.; Ross, Arun title: Leveraging Weather Data for Forecasting Cases-to-Mortality Rates Due to COVID-19 date: 2021-08-18 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2021.111340 sha: 03425f39270b230912bec1f122c261b2a13bb853 doc_id: 688700 cord_uid: xq2k3ygc

Several recent publications criticize the failure of COVID-19 forecasting models, which have swung between overpredictions and underpredictions and thereby made decision and policy making difficult. Observing the failures of several COVID-19 forecasting models and the alarming spread of the virus, we seek to use a more stable response for forecasting COVID-19, viz., ratios of COVID-19 cases to mortalities, rather than COVID-19 cases or fatalities. A trend of low COVID-19 cases-to-mortality ratios calls for urgent attention: the need for vaccines, for instance. Studies have shown that weather parameters influence COVID-19, and that COVID-19 may have come to stay and could manifest a seasonal outbreak profile similar to other infectious respiratory diseases. In this paper, the influences of some weather, geographical, economic and demographic covariates on COVID-19 response were evaluated based on a series of Granger-causality tests. The effect of four weather parameters, viz., temperature, rainfall, solar irradiation and relative humidity, on daily COVID-19 cases-to-mortality ratios of 36 countries from 5 continents of the world was determined through regression analysis. Regression studies show that these four weather factors impact ratios of COVID-19 cases-to-mortality differently. The most impactful factor is temperature, which is positively correlated with COVID-19 cases-to-mortality responses in 24 out of 36 countries. Temperature minimally affects COVID-19 cases-to-mortality ratios in the tropical countries.
The most influential weather factor, temperature, was incorporated in training random forest and deep learning models for forecasting the cases-to-mortality rate of COVID-19 in clusters of countries in the world with similar weather conditions. Evaluation of trained forecasting models incorporating temperature features shows better performance compared to a similar set of models trained without temperature features. This implies that COVID-19 forecasting models will predict more accurately if temperature features are factored in, especially for temperate countries.

COVID-19 is a viral disease that initiated a challenging pandemic throughout the world in 2020. The impact of COVID-19 is known to be devastating and can lead to several other illnesses. Though COVID-19 is believed to have originated in China and by the first week of February 2020 was mainly confined to Wuhan, China, it led to the death of over 380,000 individuals, globally, by the end of May 2020 [1] . Even though cities like Milan in Italy initiated schemes for total lockdown by the third week of February 2020, most countries had to institute a nation-wide lockdown to contain this virus by March 2020. The global confirmed cases and deaths were 1880 and 46, respectively, by March 1, 2020; however, it is alarming

From the review done in this work and the close observation of COVID-19 data for sixteen (16) months, it is evident that the effects of weather parameters (temperature or humidity) on COVID-19 cases could differ from their effects on mortalities due to COVID-19. The studies in [37, 38] found a positive correlation coefficient between temperature or humidity and COVID-19 cases. In some instances, this positive correlation is attributed to the absence of lockdowns, frequent movements in the summer, etc.
However, studies relating COVID-19 mortalities to temperature or humidity in [32, 35, 39] have mostly found a negative correlation, probably due to the devastating nature of the virus in the cold season.

Data used for the experiment include COVID-19 cases and deaths data for 36 countries acquired from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) data repository provided in [40] . Weather data used for the experiments include temperature, relative humidity, rainfall and solar irradiance. Data on world demographics and COVID-19 global reports were acquired from [1] . Average daily temperature (°C), rainfall data (mm), solar irradiance (kWh/m²/day) and relative humidity data (%) were acquired for 36 countries from the National Aeronautics and Space Administration's (NASA) solar and meteorological data sets viewer [41] . The four weather variables from [41] comprise daily readings taken from January 1, 2020 to June 8, 2021. COVID-19 cases and deaths data for 36 countries contain daily readings from January 22, 2020 to June 15, 2021.

One of the goals of this paper is to train models that can forecast the rate of the COVID-19 pandemic in clusters of countries in the world with similar weather and pandemic conditions. Many countries experienced an increase in COVID-19 cases with the lifting of lockdowns in the summer of 2020. Plots of COVID-19 daily cases and daily mortalities with time (for the period from January 22, 2020 till June 15, 2021) for the five (5) major continents of the world, based on COVID-19 data available from [41] , are presented in Fig. 1. A 15-term moving averager was convolved with each time-series data set to obtain the smooth curves in Fig. 1 (a) and (b). Let y be the number of days in the COVID-19 time-series set. Averaged (convolved) responses for each country were obtained by convolving each country's set of cases or mortalities data, respectively, with a 15-term, stride-1 moving averager.
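The smoothing step described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name `moving_average` is hypothetical, and the boundary handling (keeping only windows that fully overlap the series) is an assumption, since the text does not say how the edges of each series were treated.

```python
def moving_average(series, terms=15):
    """Stride-1 moving average over `terms` consecutive samples.

    Assumption: only fully overlapping windows are kept, so the output
    has len(series) - terms + 1 samples. The paper does not state its
    edge handling.
    """
    if terms <= 0:
        raise ValueError("terms must be positive")
    if len(series) < terms:
        return []
    # Running sum avoids re-adding the whole window at every step.
    window_sum = sum(series[:terms])
    smoothed = [window_sum / terms]
    for i in range(terms, len(series)):
        window_sum += series[i] - series[i - terms]
        smoothed.append(window_sum / terms)
    return smoothed


# Toy daily counts (made-up numbers, not data from the paper).
daily_cases = [1, 2, 3, 4, 5]
print(moving_average(daily_cases, terms=3))
```

With this edge convention a series of y days yields y - 14 smoothed values for a 15-term window; a different convention would change the output length by a few samples.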
The convolved response, $\bar{x}[n]$, can be expressed as:

$$\bar{x}[n] = \frac{1}{15}\sum_{k=0}^{14} x[n-k] \qquad (1)$$

Here, $x[n]$ is the daily series being smoothed and k indexes the delay terms. Responses of the form $\{\bar{x}[n]\}$ were obtained for the cases and mortality series, defined as $\{c[n]\}$ and $\{m[n]\}$, respectively.

Plots of daily cases with time in the five (5) continents (Africa, Asia, Europe, North America and South America) are shown in Fig. 1 (a) . There are pronounced peaks for Asia, Europe and North America in the months November 2020 to January 2021 and March to April 2021. Though some countries had enforced a second lockdown, the cases peaked for three continents in the last quarter of 2020. In 2021, cases peaked for Asia within April to May 2021, as a result of the Delta variant of COVID-19, which is believed to have originated in India. In Fig. 1 (b) , however, there are pronounced peaks within April to May 2020 and November 2020 to January 2021 for North America. Europe peaked within April to May 2020 and November 2020 to April 2021. It is noteworthy that the daily mortality curves dropped in the summer with the lifting of lockdowns in 2020 in Europe and North America, though there were rises in the daily cases. Africa peaked within July to August 2020, and in January 2021. South America peaked from July to September 2020, and within April to May 2021. Evidently, the mortality curves are not proportionate to the cases curves for the five continents as the year progresses, meaning that there are significant factors affecting mortality due to COVID-19. Mortalities followed a bathtub curve, especially for North America and Europe, while daily cases kept rising with time from January to December 2020. This makes it difficult to use cases or mortalities as responses for COVID-19 forecasting. Series of box plots for the daily cases and mortalities per country are shown for 36 countries in Fig. 1 (c) and (d), respectively, for the period from January 22, 2020 through June 15, 2021.
The box plots for the daily cases and mortalities show that the magnitude, range and average of the daily cases and mortalities vary per country. Using the number of COVID-19 daily cases or daily mortalities as responses for prediction for a cluster of countries with varying populations poses a problem, as mortality is a function of the cases, which are in turn a function of a country's population, thereby making them unstable responses. Also, as the mortalities depend on the number of cases, it is ideal to take both the cases and mortalities into consideration; we do this by dividing the cases by the mortalities. This quantity is termed the cases-to-mortality (C:M) ratio. It serves as a common metric for comparing the impact of COVID-19 amongst nations with varying demographics. We define the COVID-19 cases-to-mortality (C:M) response as the ratio of the daily number of COVID-19 cases c to the number of daily mortalities m delayed by 7 days. C:M is similar to the case fatality rate used in epidemiology and has been studied in [42, 43] .

In determining the C:M ratios, the mortalities series were offset by 7 days from the cases series, to factor in a delayed effect of cases on mortalities. A 7-day delay was used because mortality peaks around 7 days later than the cases, as observed from the acquired data. However, we evaluated the best delayed effect in this Section by computing cases-to-mortality ratios with delays of (a) 7, (b) 14, (c) 21 and (d) 28 days. We demonstrate this by showing how these four different delays manifest in C:M plots for two major continents, Europe and North America. The results are shown in Fig. 2 (a) for Europe and in Fig. 2 (b) for North America. A glance at the curves in Fig. 2 (a) and (b) shows that the peaks and troughs increase in magnitude as the delay decreases from 28 to 7 days. Therefore, the curve with the 7-day delay shows the most pronounced response in both plots, and this delay is chosen for the rest of the experiments.
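The delayed ratio defined above can be illustrated with a short sketch. The function name `cm_ratio` and the treatment of days with zero delayed mortalities (returned as `None`) are assumptions made for illustration; the text does not say how such days were handled.

```python
def cm_ratio(cases, deaths, delay=7):
    """Cases-to-mortality ratio with mortalities delayed by `delay` days.

    The cases on day t are paired with the mortalities on day t + delay.
    Assumption: days whose delayed mortality count is zero yield None.
    """
    if len(cases) != len(deaths):
        raise ValueError("cases and deaths must have the same length")
    ratios = []
    for t in range(len(cases) - delay):
        delayed_deaths = deaths[t + delay]
        ratios.append(cases[t] / delayed_deaths if delayed_deaths > 0 else None)
    return ratios


# Toy series (made-up numbers); compare the delays studied in the text.
cases = list(range(100, 140))
deaths = list(range(2, 42))
for delay in (7, 14, 21, 28):
    print(delay, cm_ratio(cases, deaths, delay)[:3])
```

In practice the inputs would be the smoothed cases and mortality series rather than raw counts, which also makes zero denominators less frequent.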
It was also reported in [42] that mortality peaks a week after COVID-19 cases peak; therefore this pre-analysis conforms with the results in [42] . It can be observed that the rise and fall patterns in most of the curves in Fig. 1, Fig. 2 and Fig. 3 appear to be seasonal. Furthermore, the C:M ratio plots in Fig. 2, Fig. 3 (a) and (d) tend to peak after about a month's delayed effect during the hottest periods in summer and dip after about a month's delayed effect during the coldest periods in the winter. Series of box plots for the daily C:M ratios for 36 countries are shown in Fig. 3 and Portugal; thereby making this metric a stable response for predicting the rate of COVID-19 spread in some clusters of countries with similar C:M ratios.

The goals of this paper are to evaluate whether COVID-19 cases or mortalities are influenced by some demographic, geographical, economic and climatic factors in each country and to consider the most influential factor in forecasting new COVID-19 cases-to-mortality (C:M) rates per country. Factors that could affect the COVID-19 cases or mortality in a country include but are not limited to:
• Climatic and geographical parameters, viz., temperature, rainfall, wind speed, solar irradiation, relative humidity, approximate geographical location of a country.
• Economic and demographic factors, viz., country's population and population density, median age per country, average age of deaths per country, total tests, number of tests per 1 million inhabitants, effectiveness of lockdown, health amenities available, country's GDP per capita.
• Social factors, viz., COVID-19 awareness and cooperation of inhabitants, health sector response.

It must be noted that this study does not take into account vaccination schedules for countries, since the study was done when vaccines were not yet widely available and the delta strain was not in play.
Existing works have studied the effects of the weather parameters, viz., humidity, temperature, solar irradiation, rainfall, and wind speed, on the COVID-19 pandemic. A key factor is temperature, whose effect on COVID-19 has been considerably researched though not adequately utilized in time-series forecasting of COVID-19. Few publications have measured the impact of economic and social factors on the COVID-19 pandemic. In [44] [45] [46] , the authors studied the adequate or inadequate health sector response to COVID-19 in specific countries or worldwide. In [47, 48] , the authors studied the successes, challenges and results of the effectiveness of lockdown. These factors are rarely quantified in existing works. Also, there are recent publications highlighting the failure of COVID-19 forecasting models, with drastic underpredictions and overpredictions, which have been misleading in policy making.

The objectives of the paper are to:
1) Determine if and how some covariates arising from the COVID-19 pandemic affect COVID-19 cases, mortality and cases-to-mortality (C:M) response by conducting a series of Granger-causality tests. Other covariates or factors not studied in this paper are possible areas for future work.
2) Evaluate the impact of the four weather parameters, viz., temperature, rainfall, solar irradiation and relative humidity, on COVID-19 cases, mortality and cases-to-mortality (C:M) response by conducting a series of Granger-causality tests on data from 36 countries with varying weather conditions.
3) Establish the relationship between the four weather parameters (predictors) and COVID-19 cases-to-mortality (C:M) ratios (response) in 36 countries via regression analysis.
4) Factor in the most impactful parameter(s) in the training of models for forecasting COVID-19 C:M response. Temperature data were used along with the C:M as input to some time-series based forecasting models.
5) Group countries according to their similar COVID-19 impact and climatic conditions in order to train a number of models, corresponding to the number of country groups, for the forecasting of COVID-19 C:M response.
6) Train models for COVID-19 forecasting using two approaches, random forest and deep learning. Test models for COVID-19 forecasting using new data for each country.

The influence of some demographic, geographical, economic, and climatic covariates or factors on COVID-19 cases or mortality in each country or across 36 countries was studied in this paper. The effects of these covariates on each country's COVID-19 response data and across 36 countries were examined using Granger-causality tests for defined null hypotheses. The demographic covariates are a country's population, population density and median age; geographical factors include each country's average longitude and latitude information; economic covariates are GDP per capita, total COVID-19 tests carried out and tests per 1 million population. The four (4) weather factors are temperature, rainfall, solar irradiation and relative humidity. The null hypotheses for the demographic, geographical, economic, and weather factors are stated as follows:
• Demographic: A series of demographic data of 36 countries does not affect countries' yearly response of COVID-19 cases, mortalities, and cases-to-mortalities response.
• Geographical: A series of geographical locations of 36 countries does not affect countries' yearly response of COVID-19 cases, mortalities, and cases-to-mortalities response.
• Economic: A series of economic covariates of 36 countries does not affect countries' yearly response of COVID-19 cases, mortalities, and cases-to-mortalities response.
• Weather: A country's weather data over time does not affect the country's daily response of COVID-19 cases, mortalities, and cases-to-mortalities (C:M) response with time.

Each null hypothesis was tested using the Granger-causality test [49] .
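To make the idea behind such a test concrete, the sketch below computes a lag-1 Granger F statistic in plain Python. It is only an illustration under stated assumptions: the lag order of 1, the function names and the synthetic data are all hypothetical, and the paper does not report the lag orders it used. The statistic compares a restricted model (the response predicted from its own previous value) with an unrestricted model that also includes the previous value of the candidate cause.

```python
import random


def _residuals(x, y):
    """Residuals of the least-squares fit y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def granger_f_lag1(cause, effect):
    """F statistic for 'cause does not Granger-cause effect' at lag 1."""
    y_now, y_lag, x_lag = effect[1:], effect[:-1], cause[:-1]
    e = _residuals(y_lag, y_now)  # restricted model residuals
    v = _residuals(y_lag, x_lag)  # part of x_lag not explained by y_lag
    rss_restricted = sum(r * r for r in e)
    svv = sum(r * r for r in v)
    # Drop in residual sum of squares from adding x_lag (partial regression).
    gain = (sum(a * b for a, b in zip(e, v)) ** 2 / svv) if svv > 0 else 0.0
    rss_unrestricted = rss_restricted - gain
    dof = len(y_now) - 3  # three fitted parameters in the unrestricted model
    return gain / (rss_unrestricted / dof)


# Synthetic example (not data from the paper): the response follows the
# previous day's "temperature" plus a small amount of noise.
rng = random.Random(0)
temperature = [rng.gauss(0, 1) for _ in range(200)]
response = [0.0] + [0.8 * temperature[t - 1] + 0.1 * rng.gauss(0, 1)
                    for t in range(1, 200)]
f_stat = granger_f_lag1(temperature, response)
print(f_stat)
```

A large F statistic corresponds to a small p-value under the F distribution with 1 and `dof` degrees of freedom, which is the quantity compared against the significance level; a statistics library would normally be used to obtain that p-value and to test several lags.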
The p-value, at the 5% significance level (0.05), is used to accept or reject the null hypothesis. When the p-value < 0.05, the null hypothesis is rejected.

Predictors for regression analysis comprise temperature, rainfall, solar irradiance and relative humidity. A total of 144 sets of regression analyses of C:M responses on measurements of rainfall, temperature, solar irradiation and relative humidity from 36 countries were carried out. Regression analysis was adopted in order to quantify the impact of, and relationship (if any) between, the predictors and responses. Previous works focused mostly on correlation analysis, which determines to what extent weather conditions impact COVID-19 cases; however, it is desirable to additionally establish the form of the relationship between the weather conditions and COVID-19 cases or mortalities. This is most effectively done using regression analysis.

Two methods were explored for training models for forecasting in this paper: (1) growing random forests and (2) deep learning. The architectures of the two models used in this paper are outlined in Table 1 . Each random forest model was grown with 200 regression trees. Random forest employs bootstrap aggregation in combining learning outputs from all the trees. Long Short-Term Memory (LSTM) networks were used as building blocks in the deep learning architecture since the COVID-19 data is time-series data. The LSTM networks have a sigmoid gate activation function. LSTM networks were used in forecasting of COVID-19 in [50, 51] . The input layer has an input size of 1 or 2 depending on the feature size.

A cause-and-effect approach was followed in determining the impact of these four weather factors on C:M ratios. Hence, in this analysis, the following were factored in:
1) Most countries' first confirmed case of COVID-19 (between the last week of January and early February 2020).
2) The initial stage of actively containing the virus in various countries (in February and March 2020).
3) The inability to determine or confirm how many persons contracted the virus in the early stages of the COVID-19 infection, resulting in erroneous recording of COVID-19 cases and mortalities [52, 53] , which varies by country (in February and March 2020).
4) The fact that if weather conditions, such as temperature, impact and further the spread of COVID-19, the infection will actually take root in a person after 2 to 14 days (2 weeks) of exposure to the virus, i.e., the incubation period [54] [55] [56] .
5) The fact that some patients in critical condition die between 1 and 4 weeks after contracting the virus [57] [58] [59] .

Therefore, forecasting was based on a cause and delayed-effect approach that took all the factors above into consideration. More specifically, the weather parameters were offset by 4 weeks with respect to the response. Further, COVID-19 data between January 22 and March 31, 2020 were not used in this analysis.

The three major experiments in this work are: a series of Granger-causality tests on covariates; regression analysis for determining the impact of weather data on the COVID-19 responses for 36 countries; and training of models for forecasting the COVID-19 cases-to-mortality ratio in clusters of countries with similar impact from weather conditions, determined through regression analysis. This work was carried out in a MATLAB 2020a environment on a Macintosh operating system with 16GB of RAM.

Granger-causality tests were carried out to test the demographic, geographical, economic and weather factors as described in Section 6.1. In the case of the demographic, geographical and economic tests, the yearly average cases, mortalities and C:M responses were used for each country, since the covariate data of each country is a single value.
In the weather tests, all daily cases, mortality and C:M time-series data from January 22, 2020 through June 15, 2021 were used, since the weather data comprises daily average measurements. Null hypotheses were rejected for all p-values < 0.05.

C:M responses from 36 countries were regressed on the temperature, solar irradiation, rainfall and relative humidity data for each country, with emphasis on these parameters: the regression equation, the sums of squares due to regression and error, and the statistical p-value of each regression line's slope. The regression analysis was based on fortnightly grouping of the weather predictors and COVID-19 response data.

Training and testing data comprise COVID-19 and temperature data from February 1 to November 30, 2020, and December 7, 2020 to June 15, 2021, respectively. Averaged responses were also calculated from the temperature data of the 36 countries. The COVID-19 data is time-series data and, hence, the input and target data differ only by a time-step. The most impactful weather condition from the regression analysis was factored in; the 2D input features comprise the COVID-19 and temperature data. The output data comprises the COVID-19 data delayed by a time-step. All input data for each country were standardized to have a mean of zero and a standard deviation of 1.

Countries were grouped according to (1) similar trend of C:M ratios and (2) similar weather conditions, in this order of priority. We determined the similarity in the trend of C:M ratios by normalizing all C:M ratio series per country to values between 0 and 10 and computing an L2-norm similarity score from a difference matrix of all countries' C:M series. Given the minimum difference score for a country (say c) when compared with the other 35 countries, a search for countries with similarity scores below a threshold selects the countries similar to c. It is remarkable that the majority of the countries with similar weather conditions appear to have similar trends in C:M ratios.
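The similarity search described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function names are hypothetical, the series are assumed to have equal length, and the `threshold` argument is a free parameter because the threshold rule is not recoverable from the text.

```python
import math


def rescale_0_10(series):
    """Min-max rescale a series to the range [0, 10]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)  # constant series carries no trend
    return [10.0 * (v - lo) / (hi - lo) for v in series]


def l2_distance(a, b):
    """Euclidean (L2-norm) distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_countries(cm_by_country, target, threshold):
    """Countries whose rescaled C:M series lies within `threshold` of target.

    Assumption: `threshold` is supplied by the caller.
    """
    scaled = {name: rescale_0_10(s) for name, s in cm_by_country.items()}
    scores = {name: l2_distance(scaled[target], s)
              for name, s in scaled.items() if name != target}
    return sorted(name for name, score in scores.items() if score < threshold)


# Toy C:M series for three hypothetical countries (made-up numbers).
toy = {"A": [1, 2, 3], "B": [10, 20, 30], "C": [3, 2, 1]}
print(similar_countries(toy, "A", threshold=1.0))
```

Rescaling before comparing means that countries A and B in the toy example match despite a tenfold difference in magnitude, which is the point of comparing trends rather than raw ratios.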
Hence, countries were grouped according to their similarity scores. Countries with noisy data were avoided in the forecasting experiments. The selected groups are as follows:
• Group 12 = Germany, Japan

In order to mark the improvements in trained forecast models due to incorporating a weather parameter, we trained models with and without temperature data features, using the same architecture, data and training parameters. The only difference is in the input layer size, which is 2 in the case of COVID-19 and temperature data input, and 1 in the case of COVID-19 data input only. Each set of random forest and deep learning models had twelve (12) groups of models trained (a) with COVID-19 and temperature data and another twelve (12) groups of models trained (b) without temperature data. In any group comprising data from multiple countries, data from the countries were not mixed together, in order to maintain the sequence or pattern of data for each country. Hence, a model is first trained with the set of data for a country, then retrained with the next set of data from the following country, and so on, within each group.

All forecasting models were trained with the same parameters. The bootstrap aggregation method was applied in the random forest training. The adaptive moment estimation (ADAM) [60] optimization method was used for the gradient descent in deep learning. The initial learning rate, learn rate drop factor, L2 regularization, minibatch size and maximum epochs were set to 0.01, 0.1, 0.0001, 15 and 250, respectively, in the deep learning approach. Data was never shuffled as that would disrupt the pattern of the training data.

Results of Granger-causality tests of the eight (8) covariates on cases, mortalities and C:M responses across 36 countries, for demographic, economic and geographical factors, are shown in Table 2 .
Granger-causality tests of the null hypothesis for population, total tests per 1M and GDP per capita of 36 countries on C:M responses yielded p-values < 0.05, in the last row of Table 2 , and hence the null hypothesis is rejected for these cases for C:M. It appears that population, total tests per 1M and GDP per capita affect C:M. An explanation for the rejection of the null hypothesis in the three cases is explored by plotting the distribution of covariates versus their respective yearly average C:M responses in Fig. 4 . The plots in Fig. 4 focus on the distribution across the 36 countries, rather than on each country. The stem plots show weak but significant relationships between the three covariates (population, tests per 1M population, GDP per capita) and C:M response. Other than for a few countries with a low population, it appears that the yearly average C:M response increases as population increases. Yearly average C:M responses also increase as tests per 1M population increase. Other than for some countries with low GDP per capita, probably in Africa where solar irradiation and high temperatures give some added advantage, yearly average C:M responses increase as GDP per capita increases. These might contribute to some further explanations in this paper.

Results of Granger-causality tests of the four (4) weather variables on cases, mortalities and C:M responses within each country are shown for 36 countries in Table 3. The null hypothesis is rejected for all cases where the p-value < 0.05. The last row of Table 3 shows the total number of rejected null hypotheses for each weather parameter for the three responses: cases, mortalities and C:M. The parameters solar irradiation and temperature have a very strong impact on mortalities, as seen from the high number of rejected null hypotheses. The temperature parameter appears to show the strongest relationship for a number of countries for the three responses. The null hypotheses for 26 countries and 16 countries were rejected for temperature effects on mortality and C:M responses, respectively. 26 and 16 are the two highest total numbers in the last row of Table 3 .
Other than in Portugal and the Netherlands, temperature impacts COVID-19 mortalities in all temperate countries, as well as in many tropical countries. C:M responses are affected by temperature in ten (10) temperate countries. This shows that temperature is indeed a significant parameter in forecasting the COVID-19 pandemic. It also shows that the response, C:M, is reliable.

The results for the regression analysis of C:M responses on the four predictors from 36 countries are shown in Table 4 through Table 7 . These tables provide the linear regression line equation, with its intercept C and slope, for each country. A fixed intercept, Cf, is also provided with the corresponding slope, Sc. This helps compare the magnitude of the slopes across all 36 countries and across all four (4) predictors: temperature, solar irradiation, rainfall and relative humidity. The rest of the results give the p-value of the slope, Pv; the regression sum of squares, SSR; the total sum of squares, SST; the error sum of squares, SSE; whether SSR > SSE, which is indicated by a "yes" or "no"; the average and minimum or maximum predictor values (for example, temperatures); and the minimum and maximum C:M ratios, for each country. The major results of the regression analysis are the SSR and SSE. The relationship is assumed to be due to regression when SSR > SSE. When SSR < SSE, the regression line is mostly due to random noise, which could affect the slope. In the results, any country with an extraordinarily large slope, where SSR > SSE, normally has a steep regression line due to a large range in C:M responses and/or a small range between the minimum and maximum predictor values.

The annual minimum and average temperatures are presented in Table 4 . In Table 4 , the p-values of the gradients are less than 0.05 in 20 countries, indicating that there is a significant relationship between temperature and the spread of COVID-19 in those countries.
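The sums of squares used in this comparison can be illustrated with a minimal ordinary least-squares sketch. The function name and the toy numbers are hypothetical; the code only shows how SSR, SSE and SST relate for a fitted line with a free intercept and is not the authors' MATLAB analysis.

```python
def linear_regression_sums(x, y):
    """Fit y = intercept + slope * x by least squares and return the
    slope, intercept and the sums of squares SSR, SSE and SST."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    ssr = sum((f - mean_y) ** 2 for f in fitted)         # explained by the line
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # left in the residuals
    sst = sum((yi - mean_y) ** 2 for yi in y)             # total variation
    return {"slope": slope, "intercept": intercept,
            "ssr": ssr, "sse": sse, "sst": sst}


# Toy fortnightly averages (made-up numbers, not values from Table 4).
temperature = [5.0, 8.0, 12.0, 17.0, 21.0, 24.0]
cm = [20.0, 35.0, 41.0, 66.0, 70.0, 88.0]
fit = linear_regression_sums(temperature, cm)
print(fit["slope"], fit["ssr"] > fit["sse"])
```

For a least-squares line with a free intercept, SST = SSR + SSE, so the condition SSR > SSE is equivalent to the line explaining more than half of the total variation in the response.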
The regression lines for 24 out of 36 countries in Table 4 have a positive gradient, showing that temperature is directly proportional to C:M ratios in these countries. Other than New Zealand, the countries with an impactful negative slope (< -1.2) at a fixed intercept of 100 are tropical-weather countries, as indicated by their average temperatures. In most of the temperate countries, temperature is positively correlated with C:M responses. This shows that the C:M ratio increases as temperature increases, with the exception of Russia and the USA, where tests per 1M, GDP per capita or other covariates that are not tested in this paper might be influencing the results. The tests per 1M from source [1] are high for both the USA and Russia. Additionally, the USA has one of the highest GDPs per capita [1] . It is not surprising that New Zealand has a negative slope, because New Zealand has the lowest daily cases and mortalities amongst all 36 countries, as seen in the box plots in Fig. 1 (c) and (d) . It is, therefore, much less affected by COVID-19. In tropical countries, there is little variation in temperature across all seasons and, hence, temperature does not impact COVID-19 in tropical countries as much as in temperate countries, resulting in mostly negative slopes in the tropical countries. For the set of intercepts fixed at a height of , the corresponding slopes are significant as well. SSR > SSE in 18 out of 36 countries in Table 4 , of which 15 are temperate countries. This is remarkable for the temperature predictor.

The results for the regression analysis of C:M responses on measurements of solar irradiation from 36 countries are shown in Table 5 . C:M responses were not repeated in Table 5, Table 6 and Table 7 , since the values are the same as those in Table 4 . The p-values for 18 of the slopes in Table 5 are less than 0.01.
Solar irradiation is positively correlated with C:M in 23 countries: this includes all temperate countries other than Russia, the USA and New Zealand, for the same reasons explained above. SSR > SSE in 13 out of 36 countries in Table 5 , and the magnitude of the slopes when the intercepts are fixed at a height of are high, signifying a high impact of solar irradiation on COVID-19 in the 13 countries. The p-values for the 13 countries in Table 5 are . Solar irradiation impacts COVID-19 in the 13 countries where the minimum solar irradiation is less than 1.90 kWh/m²/day and the difference between the average and minimum solar irradiation is greater than 2.6 kWh/m²/day. Other than Norway, all the 13 countries where solar irradiation has a high SSR are temperate countries, clearly showing that a high differential in solar intensity impacts C:M responses. Norway has few cases and mortalities compared to other temperate countries and has the second highest GDP per capita amongst the 36 countries [1] . Therefore, C:M ratios increase with a significant increase in solar irradiation.

The results for the regression analysis of C:M responses on measurements of rainfall from 36 countries are shown in Table 6 . The minimum rainfall is below 0.05 mm for the majority of the 36 countries. SSR > SSE in only six (6) out of 36 countries with p-values in Table 6 . Other than in Brazil, rainfall positively correlates with C:M responses in the six countries with SSR > SSE. The frequency of rainfall and the high differential in rainfall intensities could have contributed to a high SSR in the six countries. Therefore, rainfall reduces the impact of COVID-19 in these countries.

The results for the regression analysis of C:M responses on measurements of relative humidity from 36 countries are shown in Table 7 . The p-values of the slopes in only 13 out of 36 countries are < 0.05. SSR > SSE in nine (9) countries with p-values . Tropical countries are usually humid compared to temperate countries.
In the nine countries where SSR > SSE, only Nigeria has a significant corresponding slope, of 6.93 when the intercept is fixed at -100, and is positively correlated with C:M responses. The corresponding slopes for the other eight countries have magnitudes below 2.00, showing that relative humidity has no significant impact in those countries.

In summary, the regression analysis studies in this Section show that temperature, solar irradiation, and rainfall are positively correlated with C:M responses. Relative humidity plays a role in humid countries. The high SSR obtained for the temperature predictor for 18 countries, including most of the temperate countries, shows that temperature has the greatest impact on C:M responses. In the majority of the temperate countries, C:M ratios increase with a rise in temperature, and vice versa, as experienced from April 2020 till February 2021.

The most impactful weather condition is, hence, temperature, going by the conclusions of Section 8.2 on regression analysis. Two sets of twenty-four (24) COVID-19 forecasting models were trained using (1) deep learning and (2) random forest approaches. A set of 24 trained models comprises (a) 12 models with temperature data input and (b) 12 models without temperature data input. All models were evaluated in this Section using COVID-19 cases and deaths test data [40] between December 7, 2020 and June 15, 2021 pertaining to twenty (20) countries. Temperature test data comprises average daily readings from November 7, 2020 to May 15, 2021 for the countries within groups 1 to 9 listed in Section 7.3. COVID-19 data comprises a 184-day test set, which eventually gets reduced to a time-series of 169 daily cases and daily mortalities due to the averaging based on Equation (1). Test sets of twenty (20) countries grouped into twelve (12) categories based on Section 7.3
were evaluated on (a) the 12 models trained with temperature input and (b) the other 12 models trained without temperature input, for both deep learning and random forest models. In this evaluation, we pay particular attention to the trend of the predicted outcome, F, versus the observed outcome, O, in a plot for each country. The root mean square error (RMSE) for a set of n+1 samples is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n+1}\sum_{i=0}^{n}\left(F_i - O_i\right)^2}$$

where F_i and O_i are the predicted and observed outcomes for sample i.

The results of the evaluation on random forest (RF) and deep learning (DL) models, with and without temperature (temp), are shown as RMSE in Table 8. The four kinds of models are RF no-temp, DL no-temp, RF temp and DL temp. The lowest RMSE, i.e., the best result per country, is emphasized in bold. It can be observed that the deep learning model with temperature input gives the lowest RMSE for many of the temperate countries in Table 8. The rightmost column shows the total RMSE for each kind of model. The total RMSE is lowest for DL temp, followed by DL no-temp, RF temp and RF no-temp.

We further examine the results of the best two approaches, DL temp and DL no-temp, by plotting the forecast from 12/07/20 through 07/15/21. Results for the models trained with both COVID-19 and temperature input (DL temp) and for those trained with COVID-19 data input only (DL no-temp) are plotted in Fig. 5 and Fig. 6, respectively. Each subplot in Fig. 5 and Fig. 6 shows the observed and forecasted outcomes for one of the 20 countries, denoted O and F, respectively, in the plot legends. The RMSE between the predicted and observed responses is also given per country in the plots. The RMSE values for most of the temperate countries in Fig. 5 (USA, UK, Italy, Argentina, Russia, South Africa, Switzerland, Germany, and Portugal) are markedly lower than those for the same countries in Fig. 6.
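The RMSE comparison between models with and without temperature input can be sketched as follows. This is a minimal illustration, assuming RMSE is the square root of the mean squared difference between the forecasted outcome F and the observed outcome O; the function name `rmse` and the toy series are hypothetical and are not results from the paper.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between two equal-length sequences."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy series standing in for one country's test window.
observed = [50.0, 52.0, 55.0, 53.0]
forecast_temp = [49.0, 53.0, 54.0, 54.0]     # model with temperature input
forecast_no_temp = [47.0, 55.0, 52.0, 56.0]  # model without temperature input

rmse_temp = rmse(forecast_temp, observed)
rmse_no_temp = rmse(forecast_no_temp, observed)

print(f"RMSE with temperature:    {rmse_temp:.2f}")
print(f"RMSE without temperature: {rmse_no_temp:.2f}")
print("temperature input helps:", rmse_temp < rmse_no_temp)
```

Repeating this per country and summing the values would give a total per model kind, analogous to the rightmost column described for Table 8.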
The RMSE for DL temp is also lower for other countries, namely Mexico, Egypt, Botswana and Ethiopia. In total, the RMSE is lower for DL temp in thirteen (13) out of 20 countries in Fig. 5. In the remaining seven (7) countries, the RMSE of DL no-temp in Fig. 6 is slightly, but not significantly, lower than that of DL temp in Fig. 5. Further, the lower RMSE for Brazil, Iran, Nigeria and Uganda shows that temperature input is not essential in forecasting models for tropical countries. Johannesburg has a temperate climate, so it is not surprising that there is a strong impact of temperature on COVID-19 in South Africa. The trend of the forecasted outcomes in most of the countries in Fig. 5 is also seen to follow the trend of the observed outcomes, showing that incorporating temperature features does indeed improve prediction accuracy for the temperate countries. In summary, the accuracy of COVID-19 forecasting models for temperate countries can be improved by incorporating temperature data as input.

Granger-causality tests were used to determine the influences of certain geographical, economic, demographic, and weather covariates on COVID-19 response. Other covariates not considered in this paper constitute areas for future studies. Regression analysis was used to analyze the impact of weather parameters, viz., temperature, solar irradiation, rainfall, and relative humidity, on filtered responses of COVID-19 cases-to-mortality ratios. Results show that few covariates influence COVID-19 spread, and weather parameters have varied effects on the COVID-19 cases-to-mortality ratios (C:M). Out of the four weather parameters, temperature has a significant effect on COVID-19 C:M ratios. Results show a great impact in temperate countries with widely varying temperature profiles, while tropical countries appear to be unaffected by the temperature parameter.
COVID-19 C:M ratios decrease as it gets colder in the temperate countries, and this calls for urgent attention: the need for more COVID-19 vaccines, for instance. Prediction results from COVID-19 forecasting models trained with temperature features are better than those of forecasting models trained without temperature features, showing that the temperature parameter should be factored in for accurate forecasting of COVID-19 responses.

Update (Live): Cases and Deaths from COVID-19 Virus Pandemic - Worldometer
Forecasting for COVID-19 has failed
The Psychology Underlying Biased Forecasts of COVID-19 Cases and Deaths in the United States
A case study in model failure? COVID-19 daily deaths and ICU bed utilisation predictions in New York state
Forecasting efforts from prior epidemics and COVID-19 predictions
Lessons learnt from easing COVID-19 restrictions: an analysis of countries and regions in Asia Pacific and Europe
Universal weekly testing as the UK COVID-19 lockdown exit strategy
Does Density Aggravate the COVID-19 Pandemic?: Early Findings and Lessons for Planners
Facing the COVID-19 outbreak: What should we know and what could we do?
Weather Variability and COVID-19 Transmission: A Review of Recent Research
Artificial intelligence vs COVID-19: limitations, constraints and pitfalls
Dynamics identification and forecasting of COVID-19 by switching Kalman filters
Enhanced Gaussian process regression-based forecasting model for COVID-19 outbreak and significance of IoT for its detection
Early Prediction of the 2019 Novel Coronavirus Outbreak in the Mainland China Based on Simple Mathematical Model
Modeling and forecasting the spread tendency of the COVID-19 in China
Data Modeling with Polynomial Representations and Autoregressive Time-Series Representations, and Their Connections
Forecasting the spread of the COVID-19 pandemic in Saudi Arabia using ARIMA prediction model under current public health interventions
Forecasting of COVID-19 time series for countries in the world based on a hybrid approach combining the fractal dimension and fuzzy logic
Dynamic tracking with model-based forecasting for the spread of the COVID-19 pandemic
Prediction of the COVID-19 spread in African countries and implications for prevention and control: A case study in South Africa
Reconstructing and forecasting the COVID-19 epidemic in the United States using a 5-parameter logistic growth model
Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil
Predicting COVID-19 in China Using Hybrid AI Model
Forecasting spatial, socioeconomic and demographic variation in COVID-19 health care demand in England and Wales
Deep learning methods for forecasting COVID-19 time-Series data: A Comparative study
Neural network powered COVID-19 spread forecasting model
Early forecasting of the potential risk zones of COVID-19 in China's megacities
DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and Their Interactions
Time series forecasting of Covid-19 using deep learning models: India-USA comparative case study
COVID-19 Future Forecasting Using Supervised Machine Learning Models
Scenario-driven forecasting: modeling peaks and paths. Insights from the COVID-19 pandemic in Belgium
The temperature and regional climate effects on communitarian COVID-19 contagion in Mexico throughout phase 1
Effect of weather on COVID-19 spread in the US: A prediction model for India in 2020
Relationship between COVID-19 and weather: Case study in a tropical country
Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics
Eco-epidemiological assessment of the COVID-19 epidemic in China
Temperature and precipitation associate with Covid-19 new daily cases: A correlation study between weather and Covid-19 pandemic in Oslo
Correlation between weather and COVID-19 pandemic in India: An empirical investigation
Short-term effects of specific humidity and temperature on COVID-19 morbidity in select US cities
National Aeronautics and Space Administration's (NASA) solar and meteorological data sets
Estimating the infection-fatality risk of SARS-CoV-2 in New York City during the spring 2020 pandemic wave: a model-based analysis
The determinants of COVID-19 case fatality rate (CFR) in the Italian regions and provinces: An analysis of environmental, demographic, and healthcare factors
Public Health Response to the Initiation and Spread of Pandemic COVID-19 in the United States
Elderly people and responses to COVID-19 in 27 Countries
COVID-19 outbreak, social response, and early economic effects: a global VAR analysis of cross-country interdependencies
Ranking the effectiveness of worldwide COVID-19 government interventions
The Efficacy of Lockdown Against COVID-19: A Cross-Country Panel Analysis, Appl. Health Econ. Health Policy
Granger causality revisited
Predictions for COVID-19 with deep learning models of LSTM
Time series forecasting of COVID-19 transmission in Canada using LSTM networks
Identifying policy challenges of COVID-19 in hardly reliable data and judging the success of lockdown measures
Why is it difficult to accurately predict the COVID-19 epidemic?
Serial interval of novel coronavirus (COVID-19) infections
Rapid asymptomatic transmission of COVID-19 during the incubation period demonstrating strong infectivity in a cluster of youngsters aged 16-23 years outside Wuhan and characteristics of young patients with COVID-19: A prospective contact-tracing study
Estimation of incubation period and serial interval of COVID-19: Analysis of 178 cases and 131 transmission chains in Hubei province, China
Time-to-Death approach in revealing Chronicity and Severity of COVID-19 across the World
Case-fatality risk estimates for COVID-19 calculated by using a lag time for fatality
Estimating the infection-fatality risk of SARS-CoV-2 in New York City during the spring 2020 pandemic wave: a model-based analysis
Adam: A method for stochastic optimization

All owners of the free public data used in this paper are duly acknowledged.