key: cord-0930599-lna8qyfd
authors: Liao, Zhifang; Lan, Peng; Fan, Xiaoping; Kelly, Benjamin; Innes, Aidan; Liao, Zhining
title: SIRVD-DL: A COVID-19 deep learning prediction model based on time-dependent SIRVD
date: 2021-09-13
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2021.104868
sha: 941c8578cb5def8a6a60cb196b8263dcdd45f2a9
doc_id: 930599
cord_uid: lna8qyfd

COVID-19 is one of the biggest challenges that human beings have faced recently. Many researchers have proposed prediction methods that establish a virus transmission model and predict the trend of COVID-19. Among them, methods based on artificial intelligence are currently the most interesting and widely used. However, artificial intelligence methods alone cannot capture the temporal pattern of infectious disease transmission. To solve this problem, this paper proposes a COVID-19 prediction model based on time-dependent SIRVD by using deep learning. The model combines deep learning with a mathematical model of infectious diseases, and forecasts the parameters of that mathematical model with deep learning models such as LSTM and other time series prediction methods. In the current situation of mass vaccination, we analyzed COVID-19 data from January 15, 2021, to May 27, 2021 in nine countries: India, Argentina, Brazil, South Korea, Russia, the United Kingdom, France, Germany, and Italy. The experimental results show that the prediction model not only achieves a 50% improvement in single-day predictions compared to pure deep learning methods, but can also be adapted to short- and medium-term predictions, which makes the overall prediction more interpretable and robust.

methods, and the error is less than 2%. From the above research, deep learning can provide reliable single-day prediction results. However, methods of this kind have two problems.
The first is that these methods rely on pure numerical fitting and therefore cannot correctly capture the trend of an epidemic as it spreads [13]. The other is that the temporal pattern of the number of infected people is very simple, so it is difficult for these methods to find the right patterns of long-term change, and their predictions are valid only for a short period. A model that can provide forecasts for longer periods is undoubtedly more meaningful for policymaking and strategic planning. Epidemic mathematical models encode background knowledge of infectious diseases, and deep learning has strong predictive capabilities, so combining the two may improve the interpretability of deep learning methods and generate more robust predictions [15].

This paper proposes a COVID-19 prediction model, SIRVD-DL (Susceptible, Infected, Recovered, Vaccinated, and Deceased - Deep Learning), which applies deep learning to a time-dependent SIRVD model so that the combined model is more explanatory and provides effective predictions. First, the model modifies the existing SIRVD model with vaccination status so that its parameters can be measured dynamically. Then the model parameters are smoothed with the moving average method to deal with data noise and anomalies, and are predicted by the deep learning method. Finally, the predicted parameters are substituted into the SIRVD model to obtain the final predicted number of infections. We have conducted numerical experiments to address the following three research questions.

• RQ1: Is the time-dependent SIRVD-DL model reasonable and effective?
• RQ2: How is the prediction performance of the proposed SIRVD-DL model? What is the improvement compared to single deep learning methods? How does the model perform in short- and medium-term forecasts?
• RQ3: Is the moving average method proposed in this paper effective for data smoothing?
What is the difference between using this method and not using it?

The rest of the paper is organized as follows. The second section proposes the SIRVD-DL prediction model. The third section reports numerical experiments and analyzes the results to illustrate the effectiveness of our model. Section 4 gives some discussion and suggestions. The last section summarizes the paper.

The SIRVD-DL prediction model is divided into three parts: data preprocessing, construction of the SIRVD model, and the deep learning model. The overall workflow of the model is shown in Figure 1.

Step 1: Data preprocessing. First, we obtain COVID-19 data sets including the number of confirmed cases, recovered cases, deceased cases, and vaccinations. We then preprocess the data and transform it into the format required by the SIRVD-DL model.

Step 2: Constructing the SIRVD model. In this step, an existing SIRVD model is modified to adapt to dynamic changes over time. We then build the curve of parameter changes over time by measuring the model parameters.

Step 3: Deep learning model. The model parameters measured in the previous step are smoothed with the moving average method. A time series vector is then constructed as the input. Model training and evaluation are carried out with deep learning methods such as LSTM and its variants to select the best model parameters. With these model parameters, the SIRVD-DL model is built to predict the number of COVID-19 infections.

SIRVD-DL aims to evaluate the changes in the parameters of the epidemic in order to build the best model for predicting the development trend of the epidemic. In the rest of this section, we describe each part in detail.
Since the outbreak of COVID-19, many governments have released public data sources and incorporated real-time observation data for up-to-date analysis and prediction. This article collects research data from two sources. The first is the time-series data collected by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, which is widely used by researchers and includes global cumulative confirmed cases, cumulative recovered cases, and cumulative deaths. The data set provides variables such as country, province, longitude, latitude, and the number of cases for each date. Table 1 shows some sample data of global confirmed cases.

The second data source is the Our World in Data website. It is updated daily and includes data on confirmed cases, deaths, hospitalizations, testing, and vaccinations, as well as other variables of potential interest. The data used in this article is the COVID-19 vaccination data [16] collected from official reports by the Our World in Data team. Table 2 shows some sample vaccination data. In the table, location is the name of the country (or region within a country), iso_code is a three-letter country code, date is the date of the observation, total_vaccinations is the total number of doses administered, people_vaccinated is the total number of people who received at least one vaccine dose, and people_fully_vaccinated is the total number of people who received all doses prescribed by the vaccination protocol. After obtaining these two data sets, we need to transform the data into the format required by the SIRVD model.
The number of infected cases $I(t)$ can be obtained by the formula:

$$I(t) = C(t) - R(t) - D(t)$$

where $C(t)$ is the cumulative number of confirmed cases. To get the number of susceptible people $S(t)$, we can use the following formula:

$$S(t) = N - I(t) - R(t) - V(t) - D(t)$$

where $S(t)$, $I(t)$, $R(t)$, $V(t)$, and $D(t)$ are functions of the population status over time, respectively representing the number of susceptible, infected, recovered, vaccinated, and deceased individuals at time $t$ in a population of total size $N$. $I(t)$, $R(t)$, $V(t)$, and $D(t)$ can be obtained from the two data sets described above. The population $N$ can be obtained from the data provided by Our World in Data, which is based on the latest revision of the United Nations World Population Prospects.

After preprocessing the data, the next step is to construct a time-dependent SIRVD model. The basic SIRVD model [17] is based on the SIR epidemic model. It describes the interaction of the virus with the host during transmission and divides the population into five types: susceptible, infected, recovered, vaccinated, and deceased. SIRVD adds two new states to the SIR model, vaccinated and deceased. Vaccinated individuals are those who have been vaccinated against the disease and are assumed to be protected from infection, while deceased individuals are those who died of the disease [18]. The SIRVD model can be represented by the following ordinary differential equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N} - \alpha S + \sigma R \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I - \mu I \\
\frac{dR}{dt} &= \gamma I - \sigma R \\
\frac{dV}{dt} &= \alpha S \\
\frac{dD}{dt} &= \mu I
\end{aligned}
$$

$$S + I + R + V + D = N$$

Here $\beta$ is the infection rate, the transition from susceptible to infected; $\gamma$ is the recovery rate, the transition from infected to recovered; $\mu$ is the death rate, the transition from infected to deceased; $\alpha$ is the vaccination rate, the transition from susceptible to vaccinated; finally, $\sigma$ is the susceptibility rate, the transition from recovered back to susceptible.
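To make the compartment flows concrete, the following is a minimal sketch of the basic constant-parameter SIRVD system advanced with explicit Euler steps of one day. It assumes the infection term has the usual mass-action form (infection rate × S × I / N); the function names and all numeric values are illustrative and are not taken from the paper or its code.

```python
def sirvd_step(state, rates, dt=1.0):
    """Advance (S, I, R, V, D) by one explicit Euler step of the SIRVD equations."""
    S, I, R, V, D = state
    beta, gamma, mu, alpha, sigma = rates  # infection, recovery, death, vaccination, susceptibility
    N = S + I + R + V + D
    new_infections = beta * S * I / N      # assumed mass-action infection term
    dS = -new_infections - alpha * S + sigma * R
    dI = new_infections - gamma * I - mu * I
    dR = gamma * I - sigma * R
    dV = alpha * S
    dD = mu * I
    return (S + dt * dS, I + dt * dI, R + dt * dR, V + dt * dV, D + dt * dD)


def simulate(state, rates, days):
    """Return the list of daily states, starting with the initial state."""
    trajectory = [state]
    for _ in range(days):
        state = sirvd_step(state, rates)
        trajectory.append(state)
    return trajectory


# Illustrative values only: one million people, 1% initially infected.
initial = (990_000.0, 10_000.0, 0.0, 0.0, 0.0)
rates = (0.30, 0.10, 0.01, 0.005, 0.0)
trajectory = simulate(initial, rates, days=30)
```

Because every outflow from one compartment is an inflow to another, the five compartments in each state of `trajectory` always sum to the initial population, which is a quick sanity check on any implementation of these equations.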
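It should be noted that the transmission process of the virus is described by the nonlinear term $\beta S I / N$. In the basic SIRVD model, the five parameters (infection rate $\beta$, recovery rate $\gamma$, death rate $\mu$, vaccination rate $\alpha$, and susceptibility rate $\sigma$) are assumed to be constant. This ignores their time-dependent character, because they change constantly during virus transmission. In order to predict the development trend of the disease accurately and effectively, we propose a time-dependent SIRVD model in which the five parameters are all functions of time $t$. This time-dependent SIRVD model can better track the spread of the disease and support control and the prediction of future trends.

The time-dependent SIRVD model regards the infection rate, the recovery rate, the death rate, the vaccination rate, and the susceptibility rate as functions that change with time $t$: $\beta(t)$, $\gamma(t)$, $\mu(t)$, $\alpha(t)$, and $\sigma(t)$. The rewritten differential equations are as follows:

$$
\begin{aligned}
\frac{dS(t)}{dt} &= -\frac{\beta(t) S(t) I(t)}{N} - \alpha(t) S(t) + \sigma(t) R(t) \\
\frac{dI(t)}{dt} &= \frac{\beta(t) S(t) I(t)}{N} - \gamma(t) I(t) - \mu(t) I(t) \\
\frac{dR(t)}{dt} &= \gamma(t) I(t) - \sigma(t) R(t) \\
\frac{dV(t)}{dt} &= \alpha(t) S(t) \\
\frac{dD(t)}{dt} &= \mu(t) I(t)
\end{aligned}
$$

The five variables $S(t)$, $I(t)$, $R(t)$, $V(t)$, and $D(t)$ still satisfy Eq. (8). If we assume that the total population is constant, then the increases and decreases of the five states sum to 0. Since COVID-19 data is updated daily, we can change Eq. (9-13) into difference equations:

$$
\begin{aligned}
S(t+1) - S(t) &= -\frac{\beta(t) S(t) I(t)}{N} - \alpha(t) S(t) + \sigma(t) R(t) \\
I(t+1) - I(t) &= \frac{\beta(t) S(t) I(t)}{N} - \gamma(t) I(t) - \mu(t) I(t) \\
R(t+1) - R(t) &= \gamma(t) I(t) - \sigma(t) R(t) \\
V(t+1) - V(t) &= \alpha(t) S(t) \\
D(t+1) - D(t) &= \mu(t) I(t)
\end{aligned}
$$

Similarly, the five variables still satisfy Eq. (8). During the spread of COVID-19, the reinfection rate $\sigma(t)$ can be assumed to be approximately zero because the human body produces antibodies against the virus [19]. Therefore, Eq. (15), the equation for the recovered compartment, can be rewritten as:

$$R(t+1) - R(t) = \gamma(t) I(t)$$

That is, the expression for $\gamma(t)$ can be obtained:

$$\gamma(t) = \frac{R(t+1) - R(t)}{I(t)}$$

In the same way, the other two time-dependent parameters $\alpha(t)$ and $\mu(t)$ can be derived from the difference equations Eq. (16) and Eq. (17):

$$\alpha(t) = \frac{V(t+1) - V(t)}{S(t)}, \qquad \mu(t) = \frac{D(t+1) - D(t)}{I(t)}$$

After obtaining the recovery rate and the death rate, we substitute them into Eq. (14), the equation for the infected compartment, to get the time-dependent parameter $\beta(t)$, which is expressed by Eq. (24):

$$\beta(t) = \frac{N \left[ \big(I(t+1) - I(t)\big) + \big(R(t+1) - R(t)\big) + \big(D(t+1) - D(t)\big) \right]}{S(t)\, I(t)}$$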
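The parameter measurement described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes a zero reinfection rate and a mass-action infection term (infection rate × S × I / N), and the function name and the synthetic data are hypothetical.

```python
def estimate_parameters(S, I, R, V, D, N):
    """Estimate daily infection, recovery, death and vaccination rates.

    Each argument except N is a list of daily compartment counts. The rates
    for day t are obtained from the change between day t and day t + 1, so
    the result has one fewer entry than the input. Days with I[t] == 0 or
    S[t] == 0 are not handled in this sketch.
    """
    beta, gamma, mu, alpha = [], [], [], []
    for t in range(len(I) - 1):
        d_infected = I[t + 1] - I[t]
        d_recovered = R[t + 1] - R[t]
        d_deceased = D[t + 1] - D[t]
        d_vaccinated = V[t + 1] - V[t]
        gamma.append(d_recovered / I[t])
        mu.append(d_deceased / I[t])
        alpha.append(d_vaccinated / S[t])
        # New infections equal the net change in I plus the outflows to R and D.
        beta.append(N * (d_infected + d_recovered + d_deceased) / (S[t] * I[t]))
    return beta, gamma, mu, alpha


# Synthetic check: generate a few days with known constant rates, then recover them.
N = 1_000_000.0
S, I, R, V, D = [990_000.0], [10_000.0], [0.0], [0.0], [0.0]
true_beta, true_gamma, true_mu, true_alpha = 0.30, 0.10, 0.01, 0.005
for t in range(3):
    new_infections = true_beta * S[t] * I[t] / N
    S.append(S[t] - new_infections - true_alpha * S[t])
    I.append(I[t] + new_infections - true_gamma * I[t] - true_mu * I[t])
    R.append(R[t] + true_gamma * I[t])
    V.append(V[t] + true_alpha * S[t])
    D.append(D[t] + true_mu * I[t])

beta, gamma, mu, alpha = estimate_parameters(S, I, R, V, D, N)
```

On real data the resulting series are noisy, since reporting delays and corrections show up directly in the day-to-day differences.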
Through the time-dependent SIRVD model, the measured values of the parameters can be obtained and arranged as time series. One problem is that the time series of the number of infected people does not show a clear sequential pattern [13]. To solve this problem, our model first predicts the estimated parameters to discover the development trend of the epidemic, and then builds the time-dependent SIRVD model from the predicted parameter values. Deep learning is an effective time series prediction method [20]. In the next section, we introduce some specific deep learning time series prediction methods, the prediction process of SIRVD-DL, and the evaluation metrics of the model.

Deep learning (DL) performs well in time series analysis and prediction, and can automatically learn the temporal correlation and structure of data, such as seasonality, periodicity, and trend [14]. The recurrent neural network (RNN) is a kind of neural network designed for sequence-structured data. It maintains hidden states across time steps, which allows it to store information about the past. RNNs are commonly used in predictive applications because of their ability to process variable-length sequence data [21]. However, the RNN cannot overcome vanishing or exploding gradients, and because the hidden-layer activation involves only the previous time step, it retains only short-term memory and cannot pass on long-term historical information [22]. To solve this problem, Hochreiter et al. proposed the long short-term memory (LSTM) neural network.
The model includes a memory cell and three gating units that control the flow of data. The gates decide which values to store, forget, or erase, which addresses the vanishing and exploding gradient problem that recurrent neural networks face with long-term dependencies. There are many variants of LSTM. According to their structure they are divided into Vanilla LSTM, Stacked LSTM, Bidirectional LSTM, and GRU, which are described in detail below.

Vanilla LSTM is the most basic and most widely used LSTM [22]. Figure 2 shows its architecture. The rounded rectangle represents a cell state, the green circle represents the input of the time series, and the purple circle represents the output of the hidden layer. The output of the cell state at time $t$ and the output of the hidden layer are passed to the next cell state, so the data flows through the cell state. The cell state contains a memory block of hidden units, which control the flow of information from the input to the output. The first sigmoid function is the forget gate, which decides how much of the previous cell state is discarded. The next sigmoid and the first tanh function form the input gate, which determines what new information is stored in the cell state. The last sigmoid function is the output gate, which determines the information passed to the next hidden state. Eq. (25)-(29) are the specific mathematical form of LSTM, written here in its standard form with input $x_t$, hidden-layer output $h_t$, and element-wise product $\odot$:

$$
\begin{aligned}
f_t &= \operatorname{sigmoid}(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \operatorname{sigmoid}(W_i [h_{t-1}, x_t] + b_i) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c) \\
o_t &= \operatorname{sigmoid}(W_o [h_{t-1}, x_t] + b_o) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Here $W$ represents a weight matrix and $b$ a bias; each of the forget gate $f_t$, input gate $i_t$, cell state $c_t$, and output gate $o_t$ has its own.

Figure 2. Vanilla LSTM structure

Stacked LSTM [23], also known as deep LSTM, is an extension of the vanilla LSTM described above. In Stacked LSTM, there are multiple hidden layers, each containing multiple memory cells. Stacking layers increases the depth of the neural network.
Each layer processes part of the information and passes it to the next layer, forming a model with a higher-level and deeper representation. The stacked LSTM structure is shown in Figure 3. For each time step, a layer provides a separate output instead of a single output for all time steps.

The standard RNN processes the input in one direction only and therefore cannot use information from future time steps. This problem is solved by the bidirectional topology of LSTM. Bidirectional LSTM [24] extracts complete temporal information at time $t$ by considering both past and future information. This method divides the hidden neurons of the standard RNN into a forward state and a backward state, and the neurons in the forward state are not connected to the neurons in the backward state. The basic structure of a bidirectional LSTM unrolled over three time steps is shown in Figure 4. Without the backward state, this structure reduces to a standard one-way RNN. With this structure, there is no need to include additional delays as in standard RNNs.

Figure 4. Bidirectional LSTM structure

GRU

GRU (gated recurrent unit) can be described as a variant of LSTM [25] and is similar to it. GRU is mainly used to solve the vanishing gradient problem in typical RNNs, thereby improving the learning of long-term dependencies in the network. Figure 5 shows the structure of GRU. The GRU block also uses the tanh and sigmoid functions to calculate the necessary values. Unlike the LSTM block, however, the GRU block does not have a separate memory cell. It also has no separate forget gate; the update gate is responsible for controlling the flow of information. Because of these two structural differences, GRU has fewer parameters and a simpler design, which makes it more computationally efficient and easier to train.
In addition to the update gate, the GRU block has a reset gate. In a GRU block, four values are calculated: update gate, reset gate, candidate activation, and output activation. Each gate and the candidate activation has its own weight and bias, and the current block input and the previous activation value are used as inputs for calculating these values. In the first step, the sigmoid function is used to calculate the values of the gates. Eq. (30-33) are the specific mathematical form of GRU, written here in its standard form with update gate $z_t$, reset gate $r_t$, candidate activation $\tilde{h}_t$, and output activation $h_t$:

$$
\begin{aligned}
z_t &= \operatorname{sigmoid}(W_z [h_{t-1}, x_t] + b_z) \\
r_t &= \operatorname{sigmoid}(W_r [h_{t-1}, x_t] + b_r) \\
\tilde{h}_t &= \tanh(W_h [r_t \odot h_{t-1}, x_t] + b_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
$$

Table 3 shows the number of layers, the number of units, and the total number of parameters for each method.

We use the estimated model parameters $\beta(t)$, $\gamma(t)$, and $\mu(t)$ as the prediction targets of the deep learning method. Since the model parameters change constantly during the development of the epidemic, we construct a time window of size $w$ as one dimension of the time-series input to the deep learning model. Let the model parameter to be predicted be $x$, where $x$ is one of the items in the set $\{\beta(t), \gamma(t), \mu(t)\}$; over an observation period of length $T$, the time series can then be obtained as $(x_0, x_1, x_2, \dots, x_{T-2})$.

Generally, the model parameters evaluated by SIRVD contain noise, which affects the subsequent extraction of temporal features by deep learning methods. To deal with this problem, a moving average with a small sliding window is used to smooth the model parameters. It divides the model parameter curve into two parts: baseline and residuals. Specifically, let the length of this small sliding window be $k$ and the step length be 1. For each point $x_i$, the corresponding point on the baseline is $x_i^*$, and the value of $x_i^*$ is the average of $(x_{i-k+1}, x_{i-k+2}, \dots, x_i)$. The difference between $x_i$ and $x_i^*$ is called the residual $r_i$. Baseline $B = \{x_i^*\}$ and residuals $R = \{r_i\}$ can be calculated by the following formulas:

$$x_i^* = \frac{1}{k} \sum_{j=1}^{k} x_{i-k+j}, \qquad r_i = x_i - x_i^*$$

With the moving average method, most of the noise is eliminated while the baseline retains its underlying shape.
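The baseline and residual split can be illustrated with a short Python sketch. The function name and the sample values are hypothetical, and the handling of the first few points, where fewer values than the window length are available, is an assumption, since the text does not specify it.

```python
def moving_average_split(values, window):
    """Split a series into a trailing moving-average baseline and residuals.

    baseline[i] is the mean of the `window` values ending at index i.
    For the first window - 1 points the mean of the available prefix is
    used instead (an assumption made for this sketch).
    """
    baseline, residuals = [], []
    for i, value in enumerate(values):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed = sum(chunk) / len(chunk)
        baseline.append(smoothed)
        residuals.append(value - smoothed)
    return baseline, residuals


# Illustrative noisy parameter series (not real data).
noisy = [0.10, 0.14, 0.09, 0.13, 0.10, 0.15, 0.11]
baseline, residuals = moving_average_split(noisy, window=3)
```

By construction `baseline[i] + residuals[i]` reproduces the original value, so nothing is lost by the split; only the baseline is passed on as model input.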
The residual contains random noise and is not used as model input. Thus, a single piece of input data has the shape $w \times 1$, where $w$ is the size of the time window. For a time range of length $T$, let the output dimension of the model be $m$; the input data that can be constructed then has dimension $(T - w - m) \times w$. Finally, the optimal hyperparameters are obtained by training and evaluating the model, and the parameters $\hat{\beta}(t)$, $\hat{\gamma}(t)$, and $\hat{\mu}(t)$ are then predicted. The predicted number of infections can be obtained by using Eq. (16). The prediction algorithm is shown in Algorithm 1.

By comparing actual and predicted values, the accuracy of the model can be evaluated. This study uses four evaluation metrics to make fair and effective comparisons: root mean square error (RMSE), normalized root mean square error (NRMSE), coefficient of determination ($R^2$), and mean absolute percentage error (MAPE). The calculation methods are given in Eq. (37-40).

3 Numerical results

The experiments in this paper are conducted with open source libraries such as Numpy [26], Pandas [27], Tensorflow (Google) [28], and Keras [29]. Python [30] is a high-level general-purpose programming language that can interact with deep learning libraries, and the deep learning models used in this paper are built with it.

Hyperparameters are values that define the architecture of the deep learning model, and choosing them well is the key to achieving high-quality models. To determine the best combination of hyperparameters, this paper uses a grid search algorithm that takes a set of possible values for each tuned hyperparameter. After enumerating every possible hyperparameter combination [31], each combination is used to train the deep learning model. To reduce the effect of the random initialization of the weights, each set of hyperparameters is trained three times, and each trained model is then evaluated. The hyperparameter combinations are shown in Table 4.
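The final prediction step of the method, in which predicted parameters are substituted back into the SIRVD difference equations, can be sketched as follows. This is an illustration under stated assumptions, not the paper's Algorithm 1: the infection term is taken to be rate × S × I / N, the reinfection rate is zero, and the vaccination rate used to update the susceptible population is a constant argument (the paper predicts only the infection, recovery, and death rates). The function name and all numbers are hypothetical.

```python
def forecast_infected(S, I, N, beta_hat, gamma_hat, mu_hat, alpha=0.0):
    """Roll the infected count forward one day per predicted parameter triple.

    S and I are the last observed susceptible and infected counts.
    `alpha` is an assumed constant vaccination rate for updating S.
    """
    infected = []
    for beta, gamma, mu in zip(beta_hat, gamma_hat, mu_hat):
        new_infections = beta * S * I / N
        S = S - new_infections - alpha * S
        I = I + new_infections - gamma * I - mu * I
        infected.append(I)
    return infected


# Illustrative three-day forecast with made-up predicted rates.
path = forecast_infected(
    S=990_000.0, I=10_000.0, N=1_000_000.0,
    beta_hat=[0.30, 0.28, 0.26],
    gamma_hat=[0.10, 0.10, 0.11],
    mu_hat=[0.01, 0.01, 0.01],
)
```

In a rollout of this kind each day's forecast feeds the next, so errors in the predicted rates compound as the horizon grows.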
As shown in Figure 6, the data set for India includes the cumulative number of confirmed cases and the numbers of susceptible, infected, recovered, vaccinated, and deceased people. We applied the time-dependent SIRVD model to this data set and measured the changes in the model parameters from January 15, 2021, to May 26, 2021. The results, shown in Figure 7, include the infection rate, recovery rate, vaccination rate, and death rate. The susceptibility rate $\sigma(t)$ is not included: since the human body produces antibodies that prevent future re-infection with the COVID-19 virus [19], $\sigma(t)$ is assumed to be zero.

It can be seen from Figure 7(a) that the infection rate reaches its maximum at Day = 90. This coincides with the middle of the rising period in Figure 6(c), where the number of infections rises fastest, which shows that the measured model parameters are valid. In Figure 7(b), the recovery rate has a trough between Day = 60 and Day = 90. This is because a large number of infections leads to insufficient medical resources, which greatly reduces the success rate of recovery. In the same interval, the vaccination rate in Figure 7(c) reaches a peak. The death rate in Figure 7(d) gradually increases after the peak of infection, indicating that the number of deaths is about to reach its peak.

In general, the model parameters evaluated with the time-dependent SIRVD model are a good measure of real-time changes in the development of the epidemic. Most importantly, Figure 7 shows that each measured model parameter fluctuates periodically over time, which is the basis for the subsequent parameter prediction.
To verify the effectiveness of the proposed SIRVD-DL model, we compared it with the deep learning prediction methods of existing studies. Stacked LSTM and Bidirectional LSTM follow the methods of Arora et al. [32]; GRU and Vanilla LSTM follow the methods of Nabi et al. [13]. Table 5 compares SIRVD-DL with the four prediction models Vanilla LSTM, Stacked LSTM, Bidirectional LSTM, and GRU on single-day prediction. The test data set is India's data from April 18, 2021, to May 27, 2021. Table 5 shows that SIRVD-DL is the best on every evaluation metric in single-day forecasting, with an RMSE of 385128.55, an NRMSE of 0.012373, an R2 of 0.995, and a MAPE of 0.92%. The MAPE is within one percent, an improvement of 51% over the best existing single deep learning prediction method. Figure 8 shows the difference between the number of infections predicted by the SIRVD-DL model and the actual number of infections.

To test the model in short-term and medium-term prediction, we also ran 3-day, 7-day, 14-day, 21-day, and 28-day experiments. The results are shown in Table 6. SIRVD-DL achieved a MAPE of 2.51% for the 3-day forecast, 5.07% for the 7-day forecast, 10.93% for the 14-day forecast, 17.89% for the 21-day forecast, and 26.57% for the 28-day forecast. Compared with the other methods, SIRVD-DL performs better in short-term and medium-term prediction. As the number of forecast days increases, its coefficient of determination R2 stays at a relatively high level, which shows that the prediction of the model is effective. It is worth noting that as the number of prediction days increases, R2 becomes negative for the other methods. R2 was computed with the sklearn.metrics.r2_score function in Sklearn, which can return negative values.
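To make the four metrics and the negative R2 values concrete, the sketch below implements them in plain Python rather than with the Sklearn function used in the experiments. The R2 formula, one minus the ratio of residual to total sum of squares, is the usual definition; normalising NRMSE by the mean of the actual values is an assumption, since the normalisation is not stated in the text. All sample numbers are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def nrmse(actual, predicted):
    """RMSE divided by the mean of the actual values (assumed normalisation)."""
    return rmse(actual, predicted) / (sum(actual) / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 110.0, 120.0, 130.0]
close_forecast = [101.0, 109.0, 121.0, 129.0]   # tracks the trend
wrong_trend = [130.0, 120.0, 110.0, 100.0]      # predicts the opposite trend

good_r2 = r2(actual, close_forecast)
bad_r2 = r2(actual, wrong_trend)
```

Here `bad_r2` is negative because the squared error of the forecast exceeds the squared deviation of the data from its own mean.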
A negative R2 indicates that the prediction error of the fitted model is greater than that of simply predicting the mean, which means that the prediction performance of the model is poor. In general, the SIRVD-DL model performs well in single-day forecasts and in short-term and medium-term predictions, which shows that the proposed model is effective. Moreover, short-term and medium-term predictions are important guidance for helping governments balance the load on medical infrastructure and decide on control measures such as imposing or lifting lockdowns.

To predict the number of infections, we first used the moving average method to smooth the parameters evaluated by the SIRVD model, as shown in Figure 9. The baseline extracted by our algorithm eliminates most of the anomalies and noise while preserving the underlying shape. This processing makes the changes of the model parameters more periodic and more suitable for time series prediction. Figure 10 shows the residuals left after the baseline is extracted from the model parameters. The residuals contain random noise and are therefore not predicted. After the data is smoothed, we standardize it again to get a standardized baseline, which serves as the input of our deep learning model.

We carried out comparative experiments with and without the moving average for data smoothing. In Figure 11, the x-axis shows the three predicted model parameters $\hat{\beta}(t)$, $\hat{\gamma}(t)$, and $\hat{\mu}(t)$. The blue bars are the results of the two methods on the training set, and the green bars are the results on the test set. After the moving average is applied, the MAPE of the prediction drops significantly, which shows that the moving average method clearly improves the prediction of the model parameters.
Figure 12 shows the actual parameter prediction results.

Figure 11. The performance of the model on the training set and test set with and without moving average for data smoothing

Figure 12. Model parameters ($\beta^*(t)$, $\gamma^*(t)$, and $\mu^*(t)$) after data smoothing and predicted model parameters ($\hat{\beta}(t)$, $\hat{\gamma}(t)$, and $\hat{\mu}(t)$)

COVID-19 has changed the daily lives of most people around the world. Researchers across the world have done much work on the transmission mode of COVID-19, which plays an important guiding role for medical workers and government decision-makers. A large number of studies have proposed different COVID-19 prediction models, among which artificial intelligence-based methods are currently the most interesting to the global scientific community [20]. Although artificial intelligence methods achieve high prediction accuracy, it is difficult for deep learning alone to discover the temporal pattern of the number of infected people [13]. To solve this problem, we proposed the SIRVD-DL model, which combines deep learning methods with mathematical models of infectious diseases to predict the parameters of the infectious disease model. The advantage of the SIRVD-DL prediction model is that introducing an infectious disease mathematical model makes the overall prediction more interpretable and robust. Our experimental results show that, compared with other methods [32], SIRVD-DL not only improves the single-day forecast by 51% but is also suitable for short-term and medium-term forecasting. This is more practical for policy formulation and strategic planning.

To demonstrate the generality of our method, we also used data from eight further countries as verification: Argentina, Brazil, South Korea, Russia, the United Kingdom, France, Germany, and Italy. The end date of the COVID-19 data is the same for all eight countries, May 27, 2021.
The start date of the COVID-19 data is December 29, 2020 for Argentina, January 16, 2021 for Brazil, February 25, 2021 for South Korea, December 15, 2020 for Russia, May 26, 2021 for the United Kingdom, and December 27, 2020 for France, Germany, and Italy. The prediction results of SIRVD-DL for these eight countries are shown in Table 7. The table shows that SIRVD-DL performs well on the data sets of other countries: the average MAPE is within 1.5% and the coefficient of determination is 0.995. This shows that our model generalizes and can be used for COVID-19 prediction in other countries and regions.

It should be noted that our model still has some limitations. One limitation is the accuracy of the data. The data we use are official statistics, which differ to some degree from the actual situation in the real world. In addition, the model does not consider asymptomatic infections or survivors being infected again. The reinfection rate after vaccination and the rate of vaccine efficacy are also not considered in this paper because they are difficult to obtain and may be inaccurate, although they do affect the model. Another limitation of our research is that the deep learning method we use for parameter prediction may not be optimal, and there may be better ways to solve the model and predict the parameters. We will improve our model in future work.

We proposed the SIRVD-DL model to establish a virus transmission model and predict the trend of COVID-19. The model combines a mathematical model of infectious diseases with deep learning to overcome the problem that single deep learning methods cannot capture development trends. It can reflect the epidemic model parameters in real time under the current situation of large-scale vaccination and predict the epidemic trend of infectious diseases.
Numerical experiments show that the SIRVD-DL model can evaluate model parameters such as the infection rate, recovery rate, and death rate. Single-day prediction accuracy improves by 51% compared to the best existing single deep learning prediction method. The model also performs very well in short-term and medium-term predictions: the 3-day prediction error is 2.51%, the 7-day prediction error is 5.07%, the 14-day prediction error is 10.93%, and the 28-day prediction error is within 30%.

Publicly available datasets can be found here: https://github.com/CSSEGISandData/COVID-19. The vaccination dataset is available at: https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations. Our code and experimental data are publicly available at: https://github.com/Rambo55555/SIRVD-DL.

[1] Three months of COVID-19: A systematic review and meta-analysis
[2] Epidemiological characteristics of the COVID-19 outbreak in a secondary hospital in Spain
[3] Containing papers of a mathematical and physical character
[4] TW-SIR: time-window based SIR for COVID-19 forecasts
[5] Generalized SIR (GSIR) epidemic model: An improved framework for the predictive monitoring of COVID-19 pandemic
[6] A SIR model assumption for the spread of COVID-19 in different communities
[7] On the fractional SIRD mathematical model and control for the transmission of COVID-19: the first and the second waves of the disease in Iran and Japan
[8] A novel compartmental model to capture the nonlinear trend of COVID-19
[9] Wrong but useful - what covid-19 epidemiologic models can and cannot tell us
[10] Can mathematical modelling solve the current Covid-19 crisis?
[11] Time series forecasting of COVID-19 transmission in Canada using LSTM networks
[12] Comparative analysis and forecasting of COVID-19 cases in various European countries with ARIMA, NARNN and LSTM approaches
[13] Forecasting COVID-19 cases: A comparative analysis between Recurrent and Convolutional Neural Networks
[14] Forecasting of COVID-19 cases using deep learning models: Is it reliable and practically significant?
[15] Mathematical models and deep learning for predicting the number of individuals reported to be infected with SARS-CoV-2
[16] A global database of COVID-19 vaccinations
[17] A novel adaptive deep learning model of Covid-19 with focus on mortality reduction strategies
[18] A model and predictions for COVID-19 considering population behavior and vaccination
[19] COVID-19 immunity passports and vaccination certificates: scientific, equitable, and legal challenges
[20] Application of Artificial Intelligence-Based Regression Methods in the Problem of COVID-19 Spread Prediction: A Systematic Review
[21] Empirical evaluation of gated recurrent neural networks on sequence modeling
[22] Long short-term memory
[23] Speech recognition with deep recurrent neural networks
[24] Bidirectional recurrent neural networks
[25] Learning phrase representations using RNN encoder-decoder for statistical machine translation
[26] A guide to NumPy
[27] pandas: a foundational Python library for data analysis and statistics, Python for High Performance and Scientific Computing
[28] Tensorflow: A system for large-scale machine learning
[29] Deep learning with Keras
[30] The python language reference manual, Network Theory Ltd
[31] Random search for hyper-parameter optimization
[32] Prediction and analysis of COVID-19 positive cases using deep learning models: A descriptive case study of India

The authors declare no competing interests.

• The time-dependent SIRVD can measure and evaluate the parameters in the epidemic.
• The combination of mathematical model of infectious diseases and deep learning is useful.
• The COVID-19 prediction model of deep learning based on time-dependent SIRVD performs better than a single deep learning method and is suitable for short- and medium-term forecasts.
• The moving average method is effective for COVID-19 data smoothing.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.