key: cord-0936167-cqxzr8g5
authors: Jin, Weiqiu; Dong, Shuqing; Yu, Chengqing; Luo, Qingquan
title: A data-driven hybrid ensemble AI model for COVID-19 infection forecast using multiple neural networks and reinforced learning
date: 2022-04-27
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2022.105560
sha: 34b28124ad0bff7059f5440e3197ef15bdcc08d7
doc_id: 936167
cord_uid: cqxzr8g5

The COVID-19 outbreak poses a huge challenge to international public health. Reliable forecasts of the number of cases are of great significance for planning health resources and for investigating and evaluating the epidemic situation. Data-driven machine learning models can adapt to complex changes in the epidemic situation without relying on correct physical dynamics modeling, and are sensitive and accurate in predicting the development of the epidemic. In this paper, an ensemble hybrid model based on Temporal Convolutional Networks (TCN), Gated Recurrent Unit (GRU), Deep Belief Networks (DBN), Q-learning, and Support Vector Machine (SVM) models, namely the TCN-GRU-DBN-Q-SVM model, is proposed to forecast COVID-19 infections. Three widely used predictors, TCN, GRU, and DBN, are used as elements of the hybrid model and are ensembled with weights provided by a reinforcement learning method. Furthermore, an error predictor built with SVM is trained on the validation set, and the final prediction result is obtained by combining the TCN-GRU-DBN-Q model with the SVM error predictor. To investigate the forecasting performance of the proposed hybrid model, several comparison models (TCN-GRU-DBN-Q, LSTM, N-BEATS, ANFIS, VMD-BP, WT-RVFL, and ARIMA) are selected.
The experimental results show that: (1) the prediction performance of the TCN-GRU-DBN-Q-SVM model on COVID-19 infection is satisfactory, as verified on national infection data from three countries (the UK, India, and the US), and the proposed model has good generalization ability; (2) in the proposed hybrid model, SVM can efficiently predict the possible error of the predicted series given by the TCN-GRU-DBN-Q components; (3) the integrated weights based on Q-learning can be adaptively adjusted according to the characteristics of the data in forecasting tasks for different countries and multiple situations, which supports the accuracy, robustness and generalization of the proposed model.

With the COVID-19 pandemic, the world's health systems are suffering a huge impact. Effective estimation (or prediction, forecasting) of the number of COVID-19 cases will be of great help for each country to plan its own health policies (including vaccination, quarantine, isolation, lockdown, social distancing, etc.) and to estimate the economic and social losses of the epidemic [1]. Scholars have been committed to solving the problems of COVID-19 incidence prediction and epidemiological modeling, and have proposed epidemiological models (SIR [2], SEIR [3,4], SIRD [5], phenomenological models [6], etc.), time series models (autoregressive models [7,8], exponential models [9], regression models [10,11], the Prophet model [12], etc.), machine learning models (based on regression trees [13], LSTM [14], polynomial neural networks [15], ANFIS [16], SVM [17], etc.) and other types of models [18].

Classical epidemiological studies are mostly deterministic and work with large populations [18]. They rely on correct physical dynamic modeling: SIR-type dynamics models and statistical parameter estimation methods are used to model the epidemiological pathology and transmission process, and then to predict the process characteristics of the disease epidemic. However, the accuracy of such dynamic models depends on a complete and accurate description of the dynamics process and is highly dependent on reliable parameter estimation [18]. Therefore, although SIR dynamics models can give a long-term analysis of transmission characteristics, their reliability is limited by many changing factors, including the immunity status of the population (such as vaccination), public events (such as quarantine, migration, and other policy changes) and so on. For example, in Bhattacharjee et al.'s SAHQD (Susceptible, infected, hospitalized, quarantined, deceased) model, a complex multi-compartment dynamics model, information on social distancing measures and diagnostic testing rates is incorporated to characterize the dynamics of the various compartments, but the degree of social distance restrictions and the mobility within the population were neglected [19].

Data-driven models, including statistical models and machine-learning models, overcome these shortcomings to a great extent. Statistical models such as Autoregressive Integrated Moving Average (ARIMA) and Seasonal Autoregressive Integrated Moving Average (SARIMA) were adopted by previous COVID-19 infection prediction studies. K.E. ...

... compartments, etc., and defining the transition relationships among them through several differential equations.

Since the onset of the COVID-19 epidemic, multi-generational kinetic models have emerged, whose evolution route could be summarized as: SIR → SEIR [26], SLIR [27] (considering close contacts or the latent) → SEIAIR [28] (considering asymptomatic infected persons), SAHQD [19] (considering quarantine policies) → SEIRMH [29], SEPIAHR [30] (considering medical-related factors) → SCUAQIHMRD [31] (considering COVID-19 hierarchical treatment).

2) Time series analysis. The simpler exponential smoothing model arranges the data in chronological order from new to old. The weights are assigned from large to small, and the weight values decrease exponentially.
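The exponentially decreasing weighting described above can be illustrated with a minimal sketch. It assumes the textbook simple exponential smoothing recursion; the function names, the smoothing factor `alpha`, and the sample numbers are illustrative assumptions and are not taken from the reviewed studies.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [float(series[0])]  # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def implied_weights(alpha, n_lags):
    """Weight placed on the observation k steps in the past: alpha * (1 - alpha) ** k."""
    return [alpha * (1.0 - alpha) ** k for k in range(n_lags)]


cases = [10, 20, 30]  # made-up numbers, for illustration only
print(exponential_smoothing(cases, alpha=0.5))
print(implied_weights(0.5, 4))
```

The second function makes the "from large to small" ordering explicit: the newest observation receives the largest weight, and each older one is scaled down by the factor `1 - alpha`.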

In addition to exponential smoothing, which smooths the data based on exponentially decreasing parameters, there is also the common method of fitting an ARIMA [32-34] model, which consists of three parts: the autoregressive process (AR), the differencing part (Integrated) and the moving average process (MA). In addition, there are multivariate time series analyses based on the standard autoregressive (AR) model [35] ...

... have been widely researched in COVID-19 forecasting. Long short-term memory (LSTM), the most representative deep-learning model, is a special kind of RNN, mainly designed to solve the vanishing and exploding gradient problems that arise when training on long sequences. LSTM has been shown to be a powerful deep-learning network for the forecast of COVID-19 infections [40-43].

A comparison of the advantages and disadvantages of the four types of prediction approaches is given in Table 1(B). From the literature review, it can be seen that current research pays less attention to deep-learning based ensemble models and focuses more on broadening the amount of information covered by existing models through multiple data modalities. It is therefore meaningful to investigate how to further optimize the performance of deep learning models on single-input time series through weight optimization and error compensation strategies in the ensemble framework, so as to further improve the accuracy and generalization of COVID-19 forecast models.

Step 1: Preprocess and normalize the time series data.

Step 2: Train the three neural networks, TCN, GRU and DBN, with the time-series data of the training set respectively, and then input the test set into each network to obtain the test result sequences, denoted as x1, x2, and x3.

Step 3: Give a randomized initial weight to each output sequence, ω1, ω2, ω3, and set the output:

$$O = \omega_1 x_1 + \omega_2 x_2 + \omega_3 x_3$$

In order to optimize the weights of the three networks, a reinforcement learning method, Q-learning, is used. The optimization goal of Q-learning is to minimize the output RMSE (namely the loss function, or the evaluation function Q in Q-learning training):

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y'_i\right)^2}$$

where $y_i$ is the actual value, $y'_i$ is the forecast value, and n is the number of samples. The Q-learning is trained with the validation set.

Step 4: Calculate the error between O and the actual sequence (in the validation set) to get the error sequence R. Taking R as the training set, SVM is used to model the error sequence, which compensates the prediction given by the TCN-GRU-DBN-Q model.

Step 5 
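The data flow of the steps above (weighted output O, error sequence R, error compensation) can be sketched as follows. This is only an illustration under stated assumptions: a least-squares AR(1) fit is swapped in as a dependency-free stand-in for the SVM error predictor, the final result is assumed to be the ensemble output plus the predicted error, and all function names and numbers are hypothetical.

```python
def weighted_output(weights, preds):
    """O[t] = w1*x1[t] + w2*x2[t] + w3*x3[t] for every time step t."""
    return [sum(w * p[t] for w, p in zip(weights, preds))
            for t in range(len(preds[0]))]


def error_sequence(actual, output):
    """R[t] = actual[t] - O[t], computed on the validation set."""
    return [a - o for a, o in zip(actual, output)]


def fit_ar1(r):
    """Least-squares fit of r[t] ~ a * r[t-1] + b (stand-in for the SVM, an assumption)."""
    x, y = r[:-1], r[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((u - mx) ** 2 for u in x)
    a = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def compensate(output, last_error, a, b):
    """Add the predicted error to each ensemble output, rolling the AR(1) forward."""
    corrected, err = [], last_error
    for o in output:
        err = a * err + b
        corrected.append(o + err)
    return corrected


# Made-up validation data: every sub-predictor under-forecasts by 10 cases.
actual_val = [100.0, 110.0, 120.0, 130.0]
preds_val = [[v - 10.0 for v in actual_val] for _ in range(3)]
weights = [0.5, 0.3, 0.2]  # hypothetical weights; in the paper they come from Q-learning

R = error_sequence(actual_val, weighted_output(weights, preds_val))
a, b = fit_ar1(R)

preds_test = [[130.0, 140.0] for _ in range(3)]  # true values would be 140, 150
final = compensate(weighted_output(weights, preds_test), R[-1], a, b)
print(final)
```

In this toy case the error sequence is a constant bias, so the compensation step simply adds that bias back to the ensemble output.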

TCN is adopted as a sub-predictor of the proposed ensemble model. In essence, TCN is an integration of fully convolutional networks (FCN), causal convolution, dilated convolution, and residual connections [56]. First, TCN combines the 1D FCN and causal convolutions [57]. In the FCN architecture, each layer is the same length as the input layer, and zero padding is added to keep subsequent layers the same length as previous ones, which ensures that the output produced by the network is of the same length as the input [56]. Second, causal convolutions, where an output at time t is convolved only with elements from time t and earlier in the previous layer, are adopted in TCN to ensure that there is no information leakage from the future. Third, dilated convolutions enable TCN to adapt to forecasting tasks with a longer history, i.e., to expand the receptive field.

For a 1-D input sequence $x \in \mathbb{R}^n$ and a filter $f: \{0, \dots, k-1\} \to \mathbb{R}$, the dilated convolution operation F on element s of the sequence is defined as [57]:

$$F(s) = (x *_d f)(s) = \sum_{i=0}^{k-1} f(i) \cdot x_{s - d \cdot i}$$

where d is the dilation factor, k denotes the size of the filter f, and $s - d \cdot i$ accounts for the direction of the past [57]. Thus, a fixed step is introduced between every two adjacent filter taps, and the larger the dilation is, the wider the input range covered by an output at the top level, which increases the receptive field to a great extent. Fourth, a residual block (shown in Figure 3 (B) and (C)) is added to the model to allow layers to learn modifications to the identity mapping. It contains a branch leading out to a series of transformations $\mathcal{F}$:

$$o = \mathrm{Activation}\left(x + \mathcal{F}(x)\right)$$

where o is the output obtained by adding the outputs of $\mathcal{F}$ to the input x of the residual block.
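To make the causal, dilated convolution and the residual connection more concrete, here is a minimal pure-Python sketch. It assumes zero padding on the left, a single-channel input, and ReLU as the activation of the residual block; the function names and the toy filter are illustrative and do not reproduce the network configuration used in the paper.

```python
def dilated_causal_conv(x, f, d):
    """F(s) = sum_{i=0}^{k-1} f[i] * x[s - d*i], treating x[j] as 0 for j < 0."""
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, tap in enumerate(f):
            idx = s - d * i  # only the current and earlier time steps are read
            if idx >= 0:
                acc += tap * x[idx]
        out.append(acc)
    return out


def residual_block(x, f, d):
    """o = ReLU(x + F(x)), with F a single dilated causal convolution (assumption)."""
    fx = dilated_causal_conv(x, f, d)
    return [max(0.0, a + b) for a, b in zip(x, fx)]


def receptive_field(k, dilations):
    """Receptive field of stacked layers with one convolution of size k per layer."""
    return 1 + (k - 1) * sum(dilations)


x = [1, 2, 3, 4, 5]
print(dilated_causal_conv(x, f=[1, 1], d=2))  # each output is x[s] + x[s-2]
print(receptive_field(k=2, dilations=[1, 2, 4]))
```

Because the output length equals the input length and position s never reads beyond index s, changing a later input cannot alter an earlier output, which is the "no information leakage" property mentioned in the text.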

GRU is one of the predictors of the ensemble model in this study. The classical LSTM solves the problem of long-term dependencies in Recurrent Neural Networks (RNN). However, its complex structure reduces efficiency. Therefore, GRU was proposed in 2014 as a simpler RNN design that maintains accuracy while improving efficiency [58,59]. A GRU network has only two gate structures, the reset gate and the update gate (a combination of the forget gate and the input gate), which significantly reduces the number of parameters and greatly improves model efficiency. The ratio between the transmission and retention of information from the past moment is determined jointly by the update gate and the reset gate, whose mathematical expressions are shown in the following formulas.

The update gate $z_t$ is:

$$z_t = \sigma\left(W_z x_t + U_z h_{t-1}\right)$$

The reset gate $r_t$ is:

$$r_t = \sigma\left(W_r x_t + U_r h_{t-1}\right)$$

Meanwhile, referring to the stored historical data through the reset gate, the candidate hidden state could be calculated as follows:

$$\tilde{h}_t = \tanh\left(W x_t + U \left(r_t \odot h_{t-1}\right)\right)$$

Here $x_t$ is the input at time t, $h_{t-1}$ is the previous hidden state, $\sigma$ is the sigmoid function, and $\odot$ denotes the Hadamard product.
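The gate computations can be traced with a small sketch of a single GRU step. It assumes the standard GRU formulation without bias terms, a single hidden unit and a scalar input so that all weights are plain numbers, and the convention h_t = (1 - z_t) h_{t-1} + z_t h~_t for the final state (some references swap the roles of z_t and 1 - z_t). The parameter names are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x, h_prev, p):
    """One GRU step for a scalar input x and a single hidden unit h_prev."""
    z = sigmoid(p["Wz"] * x + p["Uz"] * h_prev)           # update gate
    r = sigmoid(p["Wr"] * x + p["Ur"] * h_prev)           # reset gate
    h_cand = math.tanh(p["W"] * x + p["U"] * (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand                # assumed mixing convention


def run_gru(sequence, p, h0=0.0):
    """Feed a sequence through the cell and return the hidden state at every step."""
    states, h = [], h0
    for x in sequence:
        h = gru_step(x, h, p)
        states.append(h)
    return states


params = {"Wz": 0.5, "Uz": -0.3, "Wr": 0.8, "Ur": 0.1, "W": 1.2, "U": 0.7}  # arbitrary values
print(run_gru([0.1, 0.4, 0.3, 0.9, 0.5], params))
```

With all weights set to zero, both gates equal 0.5 and the candidate state is 0, so the step simply halves the previous hidden state; this gives a quick sanity check of the arithmetic.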

In these expressions, W and U are the weight matrices. Whether the information content is retained or forgotten is determined by calculating the Hadamard product of ...

... hj), which could be described as (for a binary RBM [61]):

where V and H are the numbers of visible and hidden units, respectively. Thus, supposing the v or h is fixed, 

where p(v|h, θ) is fixed when the parameter set θ is obtained from an RBM, and p(h|θ) could be replaced by a ...

As an online learning approach, reinforcement learning (RL) is different from supervised/unsupervised learning. During the process of interaction with the environment, the model obtains the optimal decision through trial-and-error, and then obtains the optimal result [65]. The Q-learning algorithm, proposed by Watkins et al. in 1989 [64], is a widely used RL algorithm in feature selection, driverless vehicles, route planning, and other fields. Considering its good convergence and strong decision-making ability, the Q-learning method is applied as an ensemble learning method in this study, i.e., it is used to integrate the three deep networks.

The steps of the ensemble method based on reinforcement learning are as follows:

Step 1: Build the state matrix S and the action matrix a, where the state matrix S denotes the weights of the three deep networks in the ensemble model, and the action matrix a is the weight adjustment action:

$$S = [w_1, w_2, w_3], \qquad a = [\Delta w_1, \Delta w_2, \Delta w_3]$$

where w1 is the weight of the TCN network, w2 is the weight of the GRU network, and w3 is the weight of the DBN network. ∆wi (i = 1, 2, 3) in the action matrix a represents the weight change of each deep network.

Step 2: Construct the loss function L, the reward R, and the evaluation function Q. In this study, the optimization goal is to minimize the output RMSE value. Therefore, the evaluation function Q is defined as:

$$Q = RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y'_i\right)^2}$$

where $y_i$ is the actual value, $y'_i$ is the forecast value, and n is the number of samples.

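Step 3: Train the agent (namely the ensemble model) based on the training sets of the three deep networks. According to the current state S, the agent performs an action a. During this process, the action is selected based on the ε-greedy policy [66]:

$$a = \begin{cases} \arg\max_{a} Q(S, a), & \text{with probability } 1 - \varepsilon \\ \text{a random action}, & \text{with probability } \varepsilon \end{cases}$$

where ε is the exploration probability.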

Step 4: Calculate the loss function L, get the reward R, and develop the next-step strategy. Here A(m) is the measured COVID-19 infection data in the training set, and $\hat{A}(m)$ is the forecasted COVID-19 infection data in the training set.

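Step 5: Calculate the evaluation function Q, and update the Q-table:

$$Q(S, a) \leftarrow Q(S, a) + \gamma \left[ R + \lambda \max_{a'} Q(S', a') - Q(S, a) \right]$$

where λ is the discount parameter, γ is the learning rate, and S' is the state reached after performing action a.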

Step 6: Repeat steps 3 to 5 until the iteration stop condition is satisfied. The state matrix S at that point gives the optimal weights of the three deep networks.

Step 7: Input the test set into the three well-trained deep networks to obtain their prediction results. Then, the prediction results of the three deep networks are multiplied by their weights and ensembled together to obtain the final prediction result.
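The steps above can be sketched as a small tabular Q-learning loop over ensemble weights. This is a hedged illustration, not the authors' implementation: the discretisation of weights into units of `step`, the action set (move one unit of weight from one model to another), the reward rule (+1 if the RMSE drops, -1 otherwise), the hyperparameter values, and the toy data are all assumptions. The variable names `lr` and `discount` correspond to γ and λ in the text.

```python
import math
import random


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def combine(weights, preds):
    """Weighted sum of the sub-predictor outputs, evaluated point by point."""
    return [sum(w * p[t] for w, p in zip(weights, preds)) for t in range(len(preds[0]))]


def q_learning_weights(preds, y_val, step=0.05, episodes=200, horizon=30,
                       eps=0.2, lr=0.1, discount=0.9, seed=0):
    rng = random.Random(seed)
    n = len(preds)
    units = round(1 / step)  # a state is a tuple of integer units summing to `units`
    actions = [(i, j) for i in range(n) for j in range(n) if i != j]
    start = tuple(units // n + (1 if i < units % n else 0) for i in range(n))
    q = {}  # Q-table keyed by (state, action index), default value 0

    def loss(state):
        return rmse(y_val, combine([s / units for s in state], preds))

    best_state, best_loss = start, loss(start)
    for _ in range(episodes):
        state, cur = start, loss(start)
        for _ in range(horizon):
            if rng.random() < eps:  # epsilon-greedy: explore
                a = rng.randrange(len(actions))
            else:                   # epsilon-greedy: exploit
                a = max(range(len(actions)), key=lambda k: q.get((state, k), 0.0))
            i, j = actions[a]
            nxt = list(state)
            if nxt[i] > 0:          # weights stay non-negative
                nxt[i] -= 1
                nxt[j] += 1
            nxt = tuple(nxt)
            new = loss(nxt)
            reward = 1.0 if new < cur else -1.0  # assumed reward rule
            future = max(q.get((nxt, k), 0.0) for k in range(len(actions)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + lr * (reward + discount * future - old)
            state, cur = nxt, new
            if cur < best_loss:
                best_state, best_loss = state, cur
    return [s / units for s in best_state], best_loss


# Toy validation data: one model over-forecasts by 10%, one under-forecasts by 10%,
# and one over-forecasts by 30%.
y_val = [100.0, 120.0, 150.0, 190.0]
preds = [[v * 1.1 for v in y_val], [v * 0.9 for v in y_val], [v * 1.3 for v in y_val]]
weights, best = q_learning_weights(preds, y_val)
print(weights, best)
```

The sketch returns the best state visited rather than the last one, so the reported RMSE can never be worse than that of the near-equal starting weights.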

In this study, SVM is applied as a tool for error prediction, which further improves the generalization ability and the accuracy of the final ensemble hybrid model. As a classical soft computing learning algorithm, SVM is widely adopted in regression analysis, classification, pattern recognition and forecasting [68]. Based on the theory proposed by Vapnik [68,69], suppose a data series

Here, $\lVert k \rVert^2/2$ is the regularization term, and l is the quantity of features in the training dataset. In order to control the trade-off between the empirical error and the regularization term, the error penalty factor C is introduced into the objective function. ε represents the loss function determined by the approximation precision of the training set. ...

To ensure the accuracy and reliability of the data, this process is being constantly refined. Every day between 6:00 and 10:00 CET, a team of epidemiologists screens up to 500 relevant sources to collect the latest figures. The data screening is followed by ECDC's standard epidemic intelligence process, in which every single data entry is validated and documented in an ECDC database. An extract of this database, complete with up-to-date figures and data visualizations, is then shared on the ECDC website, ensuring a maximum level of transparency [70].

This study used 300 daily records of national cumulative infection numbers from India, the United Kingdom (UK), and the United States (US) from February 19, 2020 to December 14, 2020 (the data were accessed at 19:00 CST, January 19, 2021), and divided the data into three sets at a ratio of 3:1:1: the training set (2020/2/19~2020/8/16, 180 days), the validation set (2020/8/17~2020/10/15, 60 days), and the test set (2020/10/16~2020/12/14, 60 days). The specific conditions of each data set are shown in Table 2.

For the validation of the ensemble model, we adopted Day Forward-chaining, a nested cross-validation method that is suitable for time-series data [71,72]. The day forward-chaining method is essential to keep the order of the time-series data and to prevent the information leakage that would be caused by k-fold cross validation.

The concrete split of data is shown in Table 2 
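The 3:1:1 chronological split and the idea behind forward chaining can be sketched as follows. The split ratios and the 180/60/60-day sizes come from the text; the fold layout of the forward-chaining helper (an expanding training window followed by a fixed-size test block) is a generic assumption, since the actual allocation is defined in the paper's table and is not reproduced here.

```python
def chronological_split(series, ratios=(3, 1, 1)):
    """Split a series into train/validation/test parts without shuffling."""
    total = sum(ratios)
    n = len(series)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


def forward_chaining_folds(n, block):
    """Return (train_indices, test_indices) pairs with an expanding training window."""
    folds = []
    end = block
    while end + block <= n:
        folds.append((range(0, end), range(end, end + block)))
        end += block
    return folds


days = list(range(300))  # stand-in for 300 daily cumulative case counts
train, val, test = chronological_split(days)
print(len(train), len(val), len(test))

for train_idx, test_idx in forward_chaining_folds(len(days), block=60):
    print(train_idx[-1], test_idx[0], test_idx[-1])
```

In every fold the training indices strictly precede the test indices, which is what prevents the leakage of future information that a shuffled k-fold split would introduce.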

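We adopted four widely used and well-acknowledged indices to comprehensively evaluate the prediction performance of the proposed model. They were the mean absolute error (MAE), the mean absolute percentage error (MAPE%), the root mean square error (RMSE), and the Pearson correlation coefficient (PCC).

For the first three indices (MAE, MAPE%, RMSE), the lower the values were, the better the prediction effect of the model was. As for the PCC, it is a commonly used statistic that reflects the degree of linear correlation between two variables. The value range of PCC is [-1, 1], and the closer the absolute value of PCC is to 1, the stronger the linear correlation between the two variables is. In this study, PCCs were calculated to evaluate the correlation between the predicted number and the actual number. The calculation of RMSE was defined before, and the other three indices could be calculated as follows:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - y'_i\right|$$

$$MAPE\% = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - y'_i}{y_i}\right|$$

$$PCC = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(y'_i - \bar{y}'\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}\sqrt{\sum_{i=1}^{n}\left(y'_i - \bar{y}'\right)^2}}$$

where $y_i$ is the actual value, $y'_i$ is the forecast value, and $\bar{y}$ and $\bar{y}'$ are their respective means.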
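A minimal sketch of the four indices follows, assuming their usual textbook definitions (MAPE expressed in percent, PCC as the sample Pearson correlation). The function names and the sample numbers are illustrative only.

```python
import math


def mae(y, p):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)


def mape(y, p):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)


def rmse(y, p):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))


def pcc(y, p):
    """Pearson correlation coefficient between actual and predicted values."""
    my, mp = sum(y) / len(y), sum(p) / len(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    var_y = sum((a - my) ** 2 for a in y)
    var_p = sum((b - mp) ** 2 for b in p)
    return cov / math.sqrt(var_y * var_p)


actual = [100.0, 200.0, 300.0]      # made-up values
predicted = [110.0, 190.0, 330.0]
print(mae(actual, predicted), mape(actual, predicted),
      rmse(actual, predicted), pcc(actual, predicted))
```

Note that PCC measures only linear association: a forecast that is consistently twice the actual value still has a PCC of 1, which is why it is reported alongside the three error indices rather than instead of them.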

The number of neurons in the input layer needs to be determined experimentally to ensure that it matches the prediction task. We used the data on the number of infected people in India and tested different numbers of input neurons (3, 5, 7, 9) to determine an optimal number. As for the experimental set-up, the essential parameters used in model training are given in Table 3. ...

... investigated models with different input neuron numbers (shown in Table 4). As can be seen from Table 4, as the number of input neurons increases, the MAPE% value of the model first decreases and then increases. According to the four indices, the prediction effect of the model is best when the number of input neurons is 5. Therefore, in subsequent experiments, we set the number of input neurons of each model to 5, that is, the data of the previous 5 days are used to predict the data of the next day.
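The "previous 5 days predict the next day" set-up corresponds to a sliding-window construction of training samples, sketched below. The function name and the toy series are illustrative assumptions; only the window length of 5 comes from the text.

```python
def make_windows(series, n_inputs=5):
    """Turn a series into (inputs, target) pairs: n_inputs past days -> next day."""
    X, y = [], []
    for t in range(n_inputs, len(series)):
        X.append(list(series[t - n_inputs:t]))  # the n_inputs days before day t
        y.append(series[t])                     # the value to be predicted
    return X, y


series = list(range(10))  # stand-in for a normalised case-count series
X, y = make_windows(series, n_inputs=5)
print(X[0], y[0])
print(len(X))
```

A series of length N yields N - 5 samples, so trying other window sizes such as 3, 7 or 9 changes both the input dimension and the number of available training samples.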

The model proposed in this paper is mainly composed of three parts: predictor, optimizer, and error correction. The RMSE is used as the error when the model is trained. ...

By analyzing Figure 6 and Table 6, the following conclusions can be drawn:

(1) It can be seen from the results in the table that the prediction accuracies of TCN, GRU, and DBN are not much different on the same data set, but differences between them remain, and the performance of TCN, GRU, and DBN varies across data sets, which indicates the deficiency of a single model in generalization ability. For example, on the UK data set, TCN (MAE: 2266.439) and GRU (MAPE%: 11.988, RMSE: 3022.743, PCC: 0.707) perform better than DBN, but DBN has the best performance on the Indian data set (MAE: 3485.223, MAPE%: 8.861, RMSE: 4563.702), which shows that the performance of a single predictor is not stable enough and depends on the specific data set.

(2) In general, the proposed hybrid ensemble model obtained by using Q-learning to integrate TCN, GRU and DBN achieved higher prediction accuracy than any single component on the three data sets, which shows that the integration method proposed in this paper is effective, as can be seen from Table 6. By adjusting the weight of each predictor adaptively according to the characteristics of the data set, the model integrates the advantages of each sub-predictor and improves the overall accuracy. This feature also improves the robustness of the model, which enables it to achieve high-precision prediction results on various data sets. Additionally, comparing different data sets, we can see that the accuracy of the integrated model varies less across the three data sets than that of its single components (TCN, GRU, and DBN), which indicates that the integrated model is more stable than a single model.

(3) In order to further improve the accuracy of the integrated model, it is feasible to adopt error compensation. As can be seen from the results in the table, the accuracy of the model is further improved by adding the error correction mechanism based on SVM, and the optimal performance is achieved on each data set.

(4) In the ablation study, the weights of the three sub-models are all set to 1/3, i.e., the effect of Q-learning on weight optimization is removed. In contrast to the TCN-GRU-DBN-Q model, where the weights of the output sequences (X1, X2, X3 given by TCN, GRU, DBN, respectively) are determined by Q-learning, the model output O in the ablation study could be denoted as:

$$O = \frac{1}{3}\left(X_1 + X_2 + X_3\right)$$

It can be seen that, compared with the TCN-GRU-DBN-Q model, the ensemble model constituted by equal weights of TCN, GRU, and DBN has a larger forecast error: the MAE, MAPE%, and RMSE in each task are generally higher, and the PCC is generally lower, than for the model with Q-learning weight optimization. Meanwhile, it can be seen that in some tasks the model in the ablation study performed less well than some sub-models (e.g., UK: ...

... Table 6 as well. The MAE, MAPE%, RMSE, and PCC values were calculated as the average value of three models trained with the split of data shown in Table 2(B). It can be seen that the metrics of the model remain largely stable, which supports the generalization ability of the proposed model.

In summary, the model integration method proposed in this paper is effective. Each component improves the performance of the model, and with their cooperation the integrated model can provide high-precision COVID-19 prediction results.

† The greener the unit is, the better the model performs on the referred index;

# Ablation study was conducted with 0.333*TCN-0.333*GRU-0.333*DBN, namely the effect of Q-learning on weight allocation was ablated;

* Forward-chaining validation was performed with the data allocation shown in 

In order to verify the accuracy of the model, as mentioned in the previous section, this paper uses actual case data from three countries, the United States, India, and the United Kingdom, for experiments. The prediction results of the model are shown in Figure 7 and Table 7. ...

As can be seen from Figure 7 and Table 7, on all data sets, the model proposed in this paper can accurately predict the increase in the number of COVID-19 cases. It achieved the highest prediction accuracy on the US data, where its MAPE% value reached 7.06. This shows that the model proposed in this article has strong practicability and can play a good role in assisting decision-making in the prevention and control 

In order to demonstrate the advancement and superiority of the proposed TCN-GRU-DBN-Q-SVM algorithm, this paper compares it with two classic models (LSTM [73] and ANFIS [74]), three state-of-the-art models (VMD-BP, N-Beats [76], and WT-RVFL [77]), and a time series analysis method (ARIMA). In addition, in order to show that residual prediction can effectively improve the model's overall prediction and data analysis accuracy, the proposed TCN-GRU-DBN-Q-SVM was compared with TCN-GRU-DBN-Q. Figure 8 and Table 8 show the comparison results of the models.

Model 5 is ANFIS, Model 6 is VMD-BP, Model 7 is WT-RVFL, Model 8 is ARIMA.

Based on the experimental results, the following conclusions can be drawn:

(1) The prediction result of TCN-GRU-DBN-Q-SVM is better than that of the TCN-GRU-DBN-Q algorithm, which shows that the residual prediction modeling based on SVM can effectively improve the overall prediction ability of the model. The possible reason is that residual prediction analyzes the deviation information between the predictor and the real data to further correct the prediction results of the model and improve the accuracy comprehensively.

(2) Although the VMD-BP and WT-RVFL algorithms can achieve good prediction results, their prediction performance hardly surpasses that of the classic models (ANFIS, LSTM, ARIMA), the recently proposed model (N-Beats) and the model proposed in this paper. The possible reason is that the decomposition algorithm has a certain boundary effect in the modeling process, which affects the model's ability to analyze and identify the original time series to a certain extent. ...

The TCN-GRU-DBN-Q-SVM model in this paper still has several points worth improving. First, for an ensemble model, the higher the variability of the sub-models, the higher the accuracy of the integrated model in general, so a larger integrated model can be explored in the future. Although it will take more time to train the model and to obtain the forecasting results, this is feasible for COVID-19 morbidity prediction, a task that does not require high real-time performance (forecasts are generally made in days). Second, integration with other modeling approaches (e.g., time series models such as ARIMA and STAR models, and dynamic models such as the SAHQD model [19], the SEIAIR model [28], or even the SCUAQIHMRD model [31]) can also be performed, where reinforcement learning can still provide an integration framework. Third, data of other modalities, such as the Mobility Report released by Google [78], search interest [51], local weather data, human contact data [33], etc., could also be considered as meaningful input into the model [79], although the inaccessibility and incompleteness of some data modalities may limit the power and generalizability of a model with integrated multi-modalities.

Additionally, as a deep learning model, the proposed model has some inherent drawbacks. First, the "features" decoded by the forecast model are abstract information, which is not sufficiently interpretable. Second, as with all data-driven models, the model relies on the accuracy of the data provided for training, and the prediction horizon is not as long as that of dynamics modeling. However, the corresponding advantages are obvious: deep learning is better suited to mining the information contained in the time series data, and does not rely on complicated (often multi-compartment, multi-stage, and multi-parameter) dynamic modeling that requires delicate analysis of the epidemic situation, ranging from policy changes to pathogenicity and crowd immunity, and accordingly requires less parameter estimation and manual feature extraction [80].

It must be noted from the above discussion that, for the task of forecasting the number of COVID-19 infections, there are various forecasting models, each with apparent advantages and disadvantages. Researchers need to make trade-offs and choose the appropriate prediction scheme according to the needs of the prediction. The TCN-GRU-DBN-Q-SVM model in this study can give better case number forecasts than many existing models in scenarios where information such as temperature data, transportation data, and online human behavior is missing and only daily incidence case numbers are available, but its forecast step size is limited.

In a word, forecast models must keep up with a rapidly changing situation [80]. The modelling of a pandemic is, in essence, a difficult trade-off. First, data-driven models are easier to build than dynamic model-based forecasting models. However, although an ensemble model can achieve higher forecast performance than single models, the choice of ensemble strategy can vary when the training time requirements, hardware space, and the amount of training data change. Second, deep-learning models may sacrifice the interpretability of the forecast model since the "features" they learn could be abstract and obscure, but the accuracy of deep learning, if the data quantity permits, is generally higher than that of interpretable methods ranging from time-series models to regression models, where manual feature engineering may be performed to acquire features in the statistical, spectral, and temporal domains of the data. Last but not least, the modality (or the source) of data should be flexibly and smartly determined, since the ease of quantification and the accessibility of data may decrease when the modalities and sources of data become multiple and extensive.

The COVID-19 pandemic is posing a huge challenge to international public health. Accurate and effective prediction of the number of cases is significant for health resource planning and epidemic situation evaluation. Unlike SIR models, the data-driven machine learning model does not rely on accurate physical dynamics modeling, can adapt to complex changes in the epidemic situation (for example, vaccination, quarantine, isolation, lockdown, social distancing, etc.), and can be sensitive and accurate in predicting the development of the epidemic. In this work, a novel TCN-GRU-DBN-Q-SVM ensemble hybrid model is proposed for COVID-19 infection prediction. First, three widely used networks, TCN, GRU, and DBN, are used as single predictors. Second, the three predictors are ensembled with different weights by a reinforcement learning method (Q-learning). Third, an error predictor built with SVM is trained on the validation set, and the final prediction result is obtained by combining the TCN-GRU-DBN-Q model with the SVM error predictor.

The strengths of our model can be summarized as follows. First, we use multiple predictors to work collectively. The integrated weights based on Q-learning can be adaptively adjusted according to the characteristics of the data, which ensures the model's capability for various forecasting work in different countries ...

... proposed deep learning network to ensure the accuracy of the model. Fourth, the proposed model is data-driven, and the amount of data required is easily met.

In the future, we will further consider whether the proposed integrated model can integrate more information modalities. For example, Wensen Huang et al. evaluated the predictive value of online medical behavior data, including online medical consultation (OMC), online medical appointment (OMA) and online medical search (OMS), for regional outbreaks of the 2019 coronavirus disease in Shenzhen, China from January 1, 2020 to March 5, 2020 [81]. If this type of information can be integrated with data-driven predictive models, and natural language processing (NLP) and other algorithms can be used to extract more information and merge it into the model, it will be of some significance for further improving the prediction of COVID-19. The integration with other modeling approaches is also very interesting. Sumit Mohan et al. have proposed a hybrid ARIMA and Prophet model to predict daily confirmed and cumulative confirmed cases in India, which is an inspiring step [82].

However, as we discuss in this paper, the choice of the exact modeling approach, data modality, and integration strategy needs to consider a number of different factors, including data sources, model generalization capabilities, model accuracy, and hardware conditions during model training, which is a difficult trade-off [80]. Researchers must put more effort into making machine learning, time series analysis, dynamical modeling, and other data science models more useful for COVID-19 forecasting, to provide more accurate and reliable information for outbreak prevention and control.

Beware of the second wave of COVID-19

COVID-19 Trends and Forecast in the Eastern Mediterranean Region With a

ch: a platform for short-term forecasting of intensive care unit occupancy during the COVID-19 epidemic in Switzerland

Forecasting COVID-19-Associated Hospitalizations under Different

Extended SEIR Compartmental Model

Analysis and forecast of COVID-19 spreading in China, Italy and France

Real-time forecasts of the COVID-19 epidemic in China from

SutteARIMA: Short-term forecasting method, a case: Covid-19 and stock market in Spain

Naive forecast for COVID-19 in Utah based on the South Korea and Italy models-the fluctuation between two extremes

Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil

A machine learning forecasting model for COVID-19

Online Forecasting of Covid-19 Cases in Nigeria Using

Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: A data-driven analysis

Time series forecasting of COVID-19 transmission in Canada using LSTM networks

Finding an accurate early forecasting model from small dataset: A Case of 2019-nCoV novel coronavirus outbreak

Optimization method for forecasting confirmed cases of COVID-19 in China

A python based support vector regression model for prediction of COVID19 cases in India

A review on COVID-19 forecasting models

Inference on the dynamics of COVID-19 in the United

Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA)

Short-range forecasting of coronavirus disease 2019 (COVID-19) during early onset at county, health district, and state geographic levels: Comparative forecasting approach using seven forecasting methods (Preprint)

Deep learning-based forecasting model for COVID-19 outbreak in Saudi Arabia

Modeling and forecasting number of confirmed and death caused COVID-19 in IRAN: A Comparison of

Predicting COVID-19 in China Using Hybrid AI Model

Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning

COVID-19: Short term prediction model using daily incidence data

Analysis of COVID-19 using a modified SLIR model with nonlinear incidence

Modeling for COVID-19 with the contacting distance

Estimating the Effects of Public Health Measures by SEIR(MH) Model of

Epidemic in Local Geographic Areas. Front Public Heal

Reconstruction of the full transmission dynamics of COVID-19 in Wuhan

Modeling and Evaluation of the Joint Prevention and Control Mechanism for Curbing COVID-19 in Wuhan

Forecasting Malaysia COVID-19 Incidence based on Movement Control Order using ARIMA and Expert Modeler

Predicting COVID-19 incidence in French hospitals using human contact network analytics

Spatial prediction of COVID-19 epidemic using ARIMA techniques in

Modelling COVID-19 incidence in the African sub-region using smooth transition autoregressive model

Regional forecasting of COVID-19 caseload by non-parametric regression: a VAR epidemiological model

Forecasting the Trend of COVID-19 Considering the Impacts of

Hybrid grey exponential smoothing approach for predicting transmission dynamics of the COVID-19 outbreak in Sri Lanka. Grey Syst

Grey forecasting models based on internal optimization for Novel Corona virus

Predicting COVID-19 incidence through analysis of Google trends data in Iran: Data mining and deep learning pilot study

A novel deep interval type-2 fuzzy LSTM (DIT2FLSTM) model applied to COVID-19 pandemic time-series prediction

A spatiotemporal machine learning approach to forecasting COVID-19 incidence at the county level in the USA

Predicting increases in COVID-19 incidence to identify locations for targeted testing in West Virginia: A machine learning enhanced approach

An adaptive, interacting, cluster-based model for predicting the transmission dynamics of COVID-19

An integrated framework for building trustworthy data-driven epidemiological models: Application to the COVID-19 outbreak

Bayesian framework for multi-wave covid-19 epidemic analysis using empirical vaccination data

Transmission Dynamics of Large Coronavirus Disease Outbreak in Homeless Shelter

Spatial-temporal relationship between population mobility and COVID-19 outbreaks in South Carolina: Time series forecasting analysis

COVID-19 in 215 countries and territories using machine learning: Model development and validation

Artificial neural network modeling of novel coronavirus (COVID-19) incidence rates across the continental United States

Trends and Prediction in Daily New Cases and Deaths of COVID-19 in the United States: An Internet Search-Interest Based Model

A machine learning model for nowcasting epidemic incidence

Mining Techniques: A case study of Pakistan

Influence of social determinants of health and county vaccination rates on machine learning models to predict COVID-19

Predicting the incidence of COVID-19 using data mining

Fully Convolutional Networks for Semantic Segmentation

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Learning Phrase Representations using RNN Encoder-

Rate-coded restricted Boltzmann machines for face recognition

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13

Deep architecture for traffic flow prediction: Deep belief networks with multitask learning

Training Products of Experts by Minimizing Contrastive Divergence

Deep Belief Networks using discriminative features for phone recognition

Learning From Delayed Rewards

Optimal path planning approach based on Q-learning algorithm for mobile robots

A novel axle temperature forecasting method based on decomposition, reinforcement learning optimization and neural network

Neural network Reinforcement Learning for visual control of robot manipulators

Machine learning in medicine: a practical introduction

Support-Vector Networks

European Centre for Disease Prevention and Control

On the use of cross-validation for time series predictor evaluation

Out-of-sample tests of forecasting accuracy: An analysis and review

Transfer Learning for COVID-19 cases and deaths forecast using LSTM network

A COVID-19 forecasting system using adaptive neuro-fuzzy inference

Forecasting Brazilian and American COVID-19 cases based on artificial intelligence coupled with climatic exogenous variables

N-BEATS: Neural basis expansion analysis for interpretable time series forecasting

Modelling and forecasting of COVID-19 spread using wavelet-coupled random vector functional link networks

Community movement and covid-19: A global study using google's community mobility reports

Dajun Zeng D. Data science approaches to confronting the COVID-19 pandemic: A narrative review

Better modelling of infectious diseases: Lessons from covid-19 in China

Turn to the Internet First? Using Online Medical Behavioral Data to Forecast COVID-19 Epidemic Trend. Inf Process Manag

Predicting the impact of the third wave of COVID-19 in India using hybrid statistical machine learning models: A time series forecasting and sentiment analysis approach

A Data-driven Hybrid Ensemble AI Model for COVID-19 Infection Forecast Using Multiple Neural Networks and Reinforced Learning

Weiqiu JIN 1,2,* , Shuqing DONG 3,* , Chengqing YU 3

*These authors contributed equally to this work and should be considered co-first authors.

# Correspondence to: LUO Qingquan, E-mail: luoqingquan@hotmail.com


The authors would like to thank the European Centre for Disease Prevention and Control for the public database provided at https://data.europa.eu/euodp/en/data/dataset/covid-19-coronavirus-data-daily-up-to-14-december-2020.

The authors have no actual or potential conflicts of interest related to this manuscript.

Declare none.