key: cord-0119645-s6ttwg0k
authors: Zhao, Yun; Wang, Yuqing; Liu, Junfeng; Xia, Haotian; Xu, Zhenni; Hong, Qinghang; Zhou, Zhiyang; Petzold, Linda
title: Empirical Quantitative Analysis of COVID-19 Forecasting Models
date: 2021-10-01
journal: nan
DOI: nan
sha: 4be3fb98fee2b0352a250b8beba0e045b3cba6d2
doc_id: 119645
cord_uid: s6ttwg0k

COVID-19 has been a public health emergency of international concern since early 2020. Reliable forecasting is critical to diminish the impact of this disease. To date, a large number of different forecasting models have been proposed, mainly including statistical models, compartmental models, and deep learning models. However, due to various uncertain factors across different regions such as economics and government policy, no forecasting model appears to be the best for all scenarios. In this paper, we perform quantitative analysis of COVID-19 forecasting of confirmed cases and deaths across different regions in the United States with different forecasting horizons, and evaluate the relative impacts of the following three dimensions on the predictive performance (improvement and variation) through different evaluation metrics: model selection, hyperparameter tuning, and the length of time series required for training. We find that a dimension that brings about higher performance gains may also lead to harsher performance penalties if it is not well tuned. Furthermore, model selection is the dominant factor in determining the predictive performance. It is responsible for both the largest improvement and the largest variation in performance in all prediction tasks across different regions. While practitioners may perform more complicated time series analysis in practice, they should be able to achieve reasonable results if they have adequate insight into key decisions like model selection.

The COVID-19 pandemic has turned the world upside down. It has affected every aspect of people's lives, posed numerous threats to global health, and overwhelmed the health care systems in a majority of countries around the world. In March 2020, COVID-19 was declared a global pandemic by the World Health Organization. As of 15 May 2021, COVID-19 had resulted in more than 32 million confirmed cases in the United States, and 160 million total reported cases worldwide [1]. Over the same period, the pandemic caused over 585,000 deaths in the United States and 3,369,000 deaths worldwide [1]. The pandemic has triggered devastating social and economic impacts all over the world. Nearly half of the world's 3.3 billion workers are at risk of losing their livelihoods.

*Equal contribution

In fact, the increasing demand for health care has produced large flows of patients, leading to hospital bed shortages and strain situations in hospitals [2]. Therefore, for COVID-19 and future pandemics, it is crucial to construct methods that accurately forecast confirmed cases and deaths, as they can guide medical institutions in allocating their resources effectively. Policymakers can also benefit from reliable forecasts to carry out appropriate social intervention strategies to slow down the spread [3], [4]. Epidemic forecasting has long been considered a challenging task. Forecasting COVID-19 is even harder because various constantly changing factors, such as social and cultural differences, intervention policies, and healthcare facilities, influence the transmission rate and mortality rate to a large extent. For COVID-19 forecasting, a large body of research uses different kinds of epidemic models, which can be broadly categorized into three groups: traditional statistical models (e.g., Auto Regressive Integrated Moving Average (ARIMA, [5], [6]) and Seasonal ARIMA (SARIMA, [7], [8])), deep learning based models (e.g., Long Short-Term Memory (LSTM, [9]), Transformer [10] and convolutional neural networks [11]), and compartmental models such as SIR (Susceptible-Infected-Recovered, [12]), SEIR (Susceptible-Exposed-Infected-Recovered, [13]) and SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased, [14]). Existing COVID-19 forecasting approaches differ substantially in methods, assumptions, forecast horizons and estimated quantities. Furthermore, these forecasting models confront great challenges in predicting varying situations and tasks accurately, since the circumstances in different regions, including economy, government policy and vaccine coverage, differ tremendously from each other.
Different models can make very different projections of COVID-19 cases. This can result in a large amount of criticism, and leave governments and healthcare officials with some very difficult choices for how to carry out appropriate policies [15] , [16] .

To this end, we experiment with three models from different categories: SARIMA, SEIR-HCD [17] and Transformer-based Attention Crossing Time Series (ACTS, [10]), for COVID-19 daily newly confirmed and mortality case prediction across different regions in the United States with forecast horizons of 7 or 28 days. We perform hyperparameter tuning for each model and use different lengths of historical data for training. We evaluate the predictive performance through commonly used evaluation metrics in time series forecasting, including the Accuracy, Mean Absolute Percentage Error (MAPE), Weighted Absolute Percentage Error (WAPE), Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Root Mean Squared Logarithmic Error (RMSLE), all specified in Section III-D. Our goals are three-fold. First, we wish to quantify the relative impacts of three dimensions, namely model selection (i.e., SARIMA, SEIR-HCD, and ACTS), hyperparameter tuning, and training time series length (specified in Table I), on the performance improvement and variation across different regions. Second, we seek to understand the relationship between the predictive performance and the performance variation caused by each dimension. Third, we want to know which dimensions have a larger influence on the performance, so that practitioners can pay more attention to those key factors when performing time series analysis in practice. Our experimental results suggest that model selection is the dominant factor that contributes most to both the performance gains and penalties. Furthermore, there is a positive correlation between predictive performance and performance variation. That is, dimensions that bring about higher performance also run the risk of greater performance loss.

The main contributions of this paper are highlighted as follows:

(1) To the best of our knowledge, this paper is the first effort to conduct a thorough empirical analysis quantifying the predictive performance of time series forecasting of COVID-19.
(2) Our experimental results indicate that model selection brings about the most performance improvement and variation in all forecasting tasks throughout different regions. Furthermore, performance improvement and performance variation of each dimension are clearly positively correlated. In other words, a dimension that brings a larger performance improvement also results in a larger performance variation.
(3) We provide general guidance for practitioners in time series forecasting through quantitative analysis of different dimensions. In particular, our results can guide practitioners regarding which dimension should be prioritized and to be cautious about the risk-benefit trade-off between performance variation and performance improvement from each dimension.

The remainder of this paper is organized as follows. Section II describes related work. The three models and evaluation metrics we use are described in Section III. Section IV presents the datasets and experimental settings. Empirical results are discussed in Section V. Finally, our conclusions are presented in Section VI.

With the emergence and spread of COVID-19, several scientific domains around the world are facing huge research challenges to slow down or arrest the increasing trends of the spread of this disease. Hence, in order to better understand and manage this epidemic, various modeling, estimation, and forecasting methods have been proposed.

There are a large number of research works utilizing statistical methods to forecast COVID-19 cases (confirmed, recovered and deaths). The ARIMA model is one of the most popular statistical models for time series forecasting, aiming to describe the autocorrelation among time series data. In [5], this model was employed to conduct short-term forecasting of cumulative COVID-19 confirmed, death, and recovered cases in the top 15 countries in the world. In [18], the ARIMA model and exponential smoothing methods were jointly applied to analyze the trends of the COVID-19 outbreak in India. As an extension of ARIMA, SARIMA is capable of modeling a wide range of seasonal data. It was used to forecast the cumulative COVID-19 cases in the top 16 countries [19]. It was also used in [7] to predict mortality rates of COVID-19 patients.

Other studies apply mathematical models to simulate the epidemics. Epidemiological models that divide the entire population into different compartments are called compartmental models, which utilize differential equations to simulate the disease transmission process. The most commonly used ones among compartmental models are the SIR and SEIR models, which are used to analyze the spread of COVID-19. In [13] , the SEIR model was introduced to simulate the dynamics of COVID-19. In [20] , the SIR model was applied to predict the daily infected cases in Algeria. Due to special features of COVID-19 such as its relatively long incubation period, and the high dependency of epidemic trends on artificial factors (e.g., medical resources and quarantine measures), many researchers have proposed the extension of the above two models to better adapt to the characteristics of COVID-19. StochSS Live! [14] performed inference with Approximate Bayesian Computation algorithms and simulated the COVID-19 cases in two U.S. counties based on the SEIRD model. In [21] , the authors proposed a variant of the SEIR model by taking into account the untested/unreported cases of COVID-19. The SEIR-HCD model has been proposed to extend the SEIR model according to characteristics of COVID-19 by adding three additional compartments: H (Hospitalized), C (Critical) and D (Dead) [17] . It was employed to analyze the spread of COVID-19 in France.

For deep learning based approaches, an LSTM-based model was used to forecast the COVID-19 transmission in Canada, Italy, and the United States [9] . DeepCovid [22] incorporates deep learning and temporal correlations between consecutive forecasts to perform short-term forecasting. In [23] , a stacked auto-encoder model is proposed to fit the transmission dynamics of the epidemics and applied to real-time forecasting on confirmed cases in China. ACTS [10] applies detrending and leverages inter-series attention mechanisms on embeddings of time-series segments to obtain the predictions from different regions in the United States.

The models mentioned above achieve their best performance in different situations. They also have their own limitations, such as the assumption of a linear pattern in time series data in SARIMA models, a fixed transmission rate in compartmental models, and a lack of interpretability in deep learning approaches. Given these restrictions, this paper aims to present a comprehensive study using one model from each of the three categories, while taking other factors into consideration such as the historical time series data available for training and hyperparameter optimization. Specifically, three models, namely SARIMA, SEIR-HCD, and ACTS, are applied to forecast the time series of the number of newly confirmed and death cases in the United States across different regions.

Performance metrics are essential for evaluating how well the model predictions fit the data. Choosing an appropriate metric for different tasks is crucial for establishing robust and useful models. Commonly used performance measures for time series forecasting include the Accuracy, MAPE, WAPE, MAE, MSE, RMSE, and RMSLE. Each metric has its own strengths and weaknesses in practice. For instance, the benefit of using RMSLE, MAE, MAPE, and WAPE as statistical indicators is that they are more robust to outliers than other metrics. The MSE and RMSE tend to penalize large prediction errors harshly and are influenced a great deal by extreme values. Accuracy gives the most intuitive sense of how close the predicted value is to the actual value. Lower MSE, RMSE, MAE, WAPE, MAPE or RMSLE values, and Accuracy closer to 1, represent more accurate forecasting performance. Because these metrics have different characteristics, it is difficult to use a single metric to determine the quality of a model. Hence, we use all seven metrics to evaluate the predictive performance of the three models mentioned above on COVID-19 forecasting of confirmed and death cases.

ARIMA is one of the most widely used approaches for time series forecasting. Specifically, it makes the prediction by utilizing the lags of time series and lagged forecast errors. An ARIMA model combines the differencing with an autoregression (AR) and a moving average (MA) model. It is characterized by three parameters: p, d and q, where p is the order of AR model and represents the number of time lags; d is the number of nonseasonal differences required to make the data stationary; q is the order of MA model and represents the number of lagged forecast errors. Altogether, using the backward shift operator B, the ARIMA model can be written as:

Here, $y_t$ is the trajectory value at time $t$, $\varepsilon_t$ is normally distributed with zero mean, and the $\phi_i$ ($i = 1, 2, \dots, p$) and $\theta_j$ ($j = 1, 2, \dots, q$) are all unknown scalars.
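For reference, a standard textbook form of the ARIMA($p$, $d$, $q$) model in backward-shift notation, consistent with the symbols defined above, can be sketched as follows. The absence of a constant term and the plus sign on the moving-average coefficients are assumptions, since conventions vary between references.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t ,
\qquad B^k y_t = y_{t-k} .
```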

As an extension of ARIMA, SARIMA admits seasonal components. Taking seasonality into account, SARIMA contains non-seasonal ARIMA parameters p, d, and q and seasonal ones s, P , D, and Q. Specifically, the SARIMA model is defined as:

Here, the $\Phi_i$ ($i = 1, 2, \dots, P$) and $\Theta_j$ ($j = 1, 2, \dots, Q$) are parameters to be estimated; $P$, $D$, and $Q$ are the seasonal counterparts of $p$, $d$, and $q$, respectively; and $s$ is the length of a single seasonal period. In our context, $s = 7$ (days).
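A commonly used multiplicative form of the SARIMA($p$, $d$, $q$)($P$, $D$, $Q$)$_s$ model that matches these definitions is sketched below. As with the ARIMA form, the sign convention on the moving-average polynomials and the omission of a constant are assumptions.

```latex
\begin{aligned}
\phi_p(B)\, \Phi_P(B^s)\, (1 - B)^d (1 - B^s)^D \, y_t
  &= \theta_q(B)\, \Theta_Q(B^s)\, \varepsilon_t , \\
\phi_p(B) = 1 - \sum_{i=1}^{p} \phi_i B^i , \quad
\Phi_P(B^s) &= 1 - \sum_{i=1}^{P} \Phi_i B^{is} , \\
\theta_q(B) = 1 + \sum_{j=1}^{q} \theta_j B^j , \quad
\Theta_Q(B^s) &= 1 + \sum_{j=1}^{Q} \Theta_j B^{js} .
\end{aligned}
```

With $s = 7$, the factor $(1 - B^7)^D$ differences each day against the same weekday of the previous week.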

The statistical measures, including Accuracy, MAPE, WAPE, MAE, MSE, RMSE, and RMSLE, are used to evaluate the predictive performance of the SARIMA model.

Classic epidemiological compartmental models such as SIR and SEIR have been widely applied to simulate the spread of diseases in a population. Since COVID-19 has a relatively long incubation period (5-14 days), during which there may be carriers who do not show any symptoms of the disease, we use an extended SEIR model, namely SEIR-HCD, with three additional compartments, which considers seven population compartments: susceptible (S), exposed (E), infected (I), recovered (R), hospitalized (H), critical care (C), and death (D). The disease transmission flow of the model is sketched in Fig. 1 . The model is comprised of a system of ordinary differential equations (ODE):

$R_0$ is the basic reproduction number for the coronavirus (i.e., the number of secondary infections each infected individual produces), $t_{\mathrm{inf}}$ is the average infectious period of COVID-19, $t_{\mathrm{inc}}$ is the average incubation period of an infected agent, $t_{\mathrm{hosp}}$ is the average hospitalized period (i.e., average length of hospital stay before the patient recovers or becomes critical), and $t_{\mathrm{crt}}$ is the average critical period (i.e., average time for a hospitalized patient to enter into a critical state since initial check-in). $\beta$, $\gamma$, and $\delta$ refer to the ratios of asymptomatic infected individuals, hospitalized patients who switched to a critical state, and critical patients that result in fatalities, respectively. We employ the L-BFGS-B optimization algorithm [24], [25] and fit the model by estimating the above 8 parameters: $R_0$, $t_{\mathrm{inf}}$, $t_{\mathrm{inc}}$, $t_{\mathrm{hosp}}$, $t_{\mathrm{crt}}$, $\beta$, $\gamma$, and $\delta$.

The statistical measures, including Accuracy, MAPE, WAPE, MAE, MSE, RMSE, and RMSLE, are used to evaluate the SEIR-HCD model performance.

The Transformer model has shown great potential for time series forecasting [26], [27]. The ACTS model [10] is a neural forecasting model based on the Transformer that produces forecasts by comparing and utilizing similar patterns across time series from different geographic regions. It consists of three major components: detrending, attention module, and joint training.

a) Detrending: ACTS employs a learnable Holt smoothing model to remove long-term trends from the raw time series, leaving the remaining time series (i.e., the residual). Linear extrapolation is used to generate forecasts based on long-term trends. The residual time series are then fed to the following attention module.
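The detrending step can be illustrated with Holt's linear (double exponential) smoothing. The sketch below is a simplified stand-in and not the ACTS implementation: the smoothing factors `alpha` and `beta` are fixed here, whereas ACTS learns them, and the function name `holt_detrend` is hypothetical.

```python
def holt_detrend(series, alpha=0.5, beta=0.3, horizon=7):
    """Split a series into a smoothed trend and a residual using Holt's method.

    alpha and beta are fixed illustrative values (ACTS learns its smoothing
    parameters). Returns the fitted trend, the residual, and a linear
    extrapolation of the trend over `horizon` future steps.
    """
    level = series[0]
    trend = series[1] - series[0]
    fitted = [level]
    for y in series[1:]:
        previous_level = level
        # Update the level from the new observation and the previous forecast.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # Update the slope from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        fitted.append(level)
    residual = [y - f for y, f in zip(series, fitted)]
    # Linear extrapolation of the long-term trend.
    forecast = [level + h * trend for h in range(1, horizon + 1)]
    return fitted, residual, forecast
```

On a perfectly linear series the residual is zero and the extrapolation continues the line, which is the behaviour the residual-based attention module relies on: only the deviations from the long-term trend are passed on.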

b) Attention module: The attention module is composed of two components: embeddings and inter-series attention. The attention mechanism investigates the relationship among different regions that have been undergoing the pandemic.

Embeddings. The residual time series after detrending is normalized by min-max normalization, which can be considered as a way of smoothing. Consequently, the first and last values of the normalized time series will always be 0 and 1, respectively. Then, a convolution layer is applied to encode the normalized time series into segment features, followed by an average pooling layer (segment embeddings) to model the similarity in different regions at different time periods. Likewise, another convolution-pooling layer is employed to encode the following incidents over H days (i.e. forecasting horizon) after each segment into development embedding. It represents the succeeding development after encoded segments and will serve as references for the target region forecasting.

Inter-series attention. Following the embeddings, dot-product attention is used to compute and combine the values of segments. Specifically, segment embeddings are linearly mapped to query vectors $q_t^i$ and key vectors $k_t^i$, and development embeddings are projected to value vectors $v_t^i$, where $i$ indexes the region and $t$ the time segment. The equations are given by:

Then, the model takes $q_T^{i_0}$, the query of the last segment of target region $i_0$, and computes its similarity with keys from all other regions and time segments to obtain a weighted sum of values:
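The weighted sum described here can be sketched with plain dot-product attention. This is an illustration under assumptions rather than the ACTS code: the similarity scores are normalised with a softmax, no scaling factor is applied, and the function name `inter_series_attention` is hypothetical.

```python
import numpy as np


def inter_series_attention(query, keys, values):
    """Combine value vectors using dot-product similarity to a single query.

    query:  shape (d,), the last segment of the target region.
    keys:   shape (n, d), one key per reference segment.
    values: shape (n, m), one value (development embedding) per segment.
    Softmax normalisation of the scores is an assumption of this sketch.
    """
    scores = keys @ query                      # similarity per segment
    scores = scores - scores.max()             # stabilise the exponentials
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ values, weights
```

Segments whose keys align closely with the target query receive larger weights, so their subsequent development contributes more to the forecast for the target region.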


All of the above forecasting models are evaluated using the following seven performance metrics:

where $y_t$ are the true values, $\hat{y}_t$ are the predicted values, $\bar{y} = n^{-1} \sum_{t=1}^{n} y_t$, and $n$ is the testing sample size.
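As an illustration, the seven metrics can be computed as in the sketch below, using their standard definitions. The Accuracy formula is an assumption: an $R^2$-style score is used because it depends on $\bar{y}$ and ranges from negative infinity to 1, which matches the description given later in the text. MAPE and WAPE are returned as fractions rather than percentages, and the function name `forecast_metrics` is hypothetical.

```python
import math


def forecast_metrics(y_true, y_pred):
    """Return the seven metrics for two equal-length sequences.

    Assumes every true value is nonzero (needed by MAPE) and that all values
    are greater than -1 (needed by the logarithm in RMSLE).
    """
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    abs_error_sum = sum(abs(e) for e in errors)
    squared_error_sum = sum(e * e for e in errors)
    total_sum_of_squares = sum((y - mean_true) ** 2 for y in y_true)
    log_errors = [math.log1p(p) - math.log1p(y) for y, p in zip(y_true, y_pred)]
    return {
        "MAE": abs_error_sum / n,
        "MSE": squared_error_sum / n,
        "RMSE": math.sqrt(squared_error_sum / n),
        "MAPE": sum(abs(e) / abs(y) for e, y in zip(errors, y_true)) / n,
        "WAPE": abs_error_sum / sum(abs(y) for y in y_true),
        "RMSLE": math.sqrt(sum(e * e for e in log_errors) / n),
        # Assumed R^2-style definition: 1 for a perfect forecast, unbounded below.
        "Accuracy": 1.0 - squared_error_sum / total_sum_of_squares,
    }
```

A perfect forecast gives zero for the six error metrics and an Accuracy of 1, while a forecast worse than predicting the mean gives a negative Accuracy.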

The COVID-19 time series data are publicly available at JHU-CSSE [1] . We focus on the univariate time series data of daily confirmed and death cases from each of the five states in the United States, including California (CA), New York (NY), Texas (TX), Minnesota (MN), and Hawaii (HI). The dataset we used covers the reports up to May 15, 2021. We performed 7-day ahead and 28-day ahead forecasts on the newly confirmed and death cases of the above five states, given different lengths of historical data for training. The specific historical date ranges for different forecasting tasks in each state are shown in Table I. Table II provides the summary statistics of each univariate time series used in this study (i.e. all the available historical time series data on confirmed cases and deaths for each state). Due to the characteristics of time series data for confirmed and death cases such as high variance and skewed distribution, different data preprocessing pipelines and experimental settings for each model are described below.

The original COVID-19 time series data display nonstationary and high-variance behavior, as reflected in the ACF and PACF plots of daily confirmed and death cases for each state. We apply the Box-Cox transformation to stabilize the variance. Then, we take the lag-one difference twice for the confirmed data and once for the death data to remove the trend. The plots of the original training data and the processed data are shown in Table III. Table IV suggests that all of the processed time series are stationary according to the augmented Dickey-Fuller (ADF) test. Next, we apply grid search for hyperparameter tuning of the SARIMA model. The hyperparameter search space is listed in Table V. Maximum likelihood estimation (MLE) is employed to fit the model. Diagnostic checks are performed via residual plots and Q-Q plots to evaluate the adequacy of the SARIMA model. Finally, the inverse Box-Cox transformation is applied to the forecasting results, and different metrics are used to compare our results with the reported ground truths.
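The transformation steps of this pipeline can be sketched with SciPy and NumPy as follows. This is a minimal illustration, not the authors' code: the `+1` offset that keeps zero counts valid for Box-Cox is an assumption, the function names are hypothetical, and the SARIMA fitting, grid search, and ADF test are not shown.

```python
import numpy as np
from scipy.special import inv_boxcox
from scipy.stats import boxcox


def boxcox_and_difference(series, n_diff):
    """Stabilise the variance with Box-Cox, then difference n_diff times.

    n_diff would be 2 for confirmed cases and 1 for deaths, following the text.
    The +1 offset (an assumption) keeps zero counts strictly positive.
    """
    shifted = np.asarray(series, dtype=float) + 1.0
    transformed, lam = boxcox(shifted)      # lambda chosen by maximum likelihood
    differenced = np.diff(transformed, n=n_diff)
    return differenced, transformed, lam


def invert_boxcox(transformed, lam):
    """Map Box-Cox-scale values back to the original count scale."""
    return inv_boxcox(transformed, lam) - 1.0
```

Forecasts produced on the differenced scale would first have to be cumulatively summed back to the Box-Cox scale before `invert_boxcox` is applied.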

For the SEIR-HCD model, we performed 7-day ahead and 28-day ahead forecasting of cumulative confirmed cases and death cases in CA, NY, TX, MN, and HI. The daily forecasting results on day $T$ were then obtained by taking the difference between cumulative cases on day $T$ and day $T - 1$. We used scipy.integrate.solve_ivp in the SciPy library to solve the ODE system with initial conditions $S(0) = (N - n_{\mathrm{inf}})/N$, $I(0) = n_{\mathrm{inf}}/N$, and $E(0) = R(0) = H(0) = C(0) = D(0) = 0$, where $N$ represents the population size of each state and $n_{\mathrm{inf}}$ represents the number of infected people at $t = 0$. We took $n_{\mathrm{inf}} = 1$. The search ranges for the parameter estimation are shown in Table VI. The solutions were then fitted to the training data. Furthermore, we assumed that days closer to the prediction period should be weighted more heavily. We chose the period for optimization to be 21 days. The L-BFGS-B algorithm was applied to minimize the mean squared logarithmic error function calculated from the above ODE system solutions.
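The solve-and-fit procedure can be sketched as below. The right-hand side is an assumption: it follows a commonly circulated SEIR-HCD formulation that is consistent with the parameter definitions given earlier, and it may differ from the exact equations used in this study. The sketch fits cumulative deaths only, omits the recency weighting, and uses hypothetical function names.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize


def seir_hcd_rhs(t, y, r0, t_inf, t_inc, t_hosp, t_crt, beta, gamma, delta):
    """Assumed SEIR-HCD dynamics on population fractions (S, E, I, R, H, C, D)."""
    s, e, i, r, h, c, d = y
    infection = r0 / t_inf * i * s
    return [
        -infection,                                                     # S
        infection - e / t_inc,                                          # E
        e / t_inc - i / t_inf,                                          # I
        beta * i / t_inf + (1 - gamma) * h / t_hosp,                    # R
        (1 - beta) * i / t_inf + (1 - delta) * c / t_crt - h / t_hosp,  # H
        gamma * h / t_hosp - c / t_crt,                                 # C
        delta * c / t_crt,                                              # D
    ]


def simulate(params, n_days, population, n_inf=1):
    """Integrate the system daily from the initial conditions in the text."""
    y0 = [(population - n_inf) / population, 0.0, n_inf / population,
          0.0, 0.0, 0.0, 0.0]
    solution = solve_ivp(seir_hcd_rhs, (0, n_days - 1), y0, args=tuple(params),
                         t_eval=np.arange(n_days), rtol=1e-6, atol=1e-9)
    return solution.y


def msle(params, cumulative_deaths, population):
    """Mean squared logarithmic error between simulated and observed deaths."""
    simulated = simulate(params, len(cumulative_deaths), population)[6]
    deaths = np.clip(simulated, 0.0, None) * population
    return float(np.mean((np.log1p(deaths) - np.log1p(cumulative_deaths)) ** 2))


def fit(cumulative_deaths, population, x0, bounds, max_iter=100):
    """Estimate the 8 parameters with bound-constrained L-BFGS-B."""
    return minimize(msle, x0, args=(cumulative_deaths, population),
                    method="L-BFGS-B", bounds=bounds,
                    options={"maxiter": max_iter})
```

The `bounds` argument plays the role of the search ranges in Table VI; its values are not reproduced here and would have to be supplied by the user. Daily forecasts follow from differencing the simulated cumulative series with `np.diff`.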

The forecasting performance of all three models was evaluated in terms of Accuracy, MAPE, WAPE, MAE, MSE, RMSE, and RMSLE, which are commonly used for time series forecasting in the literature [10] , [29] , [30] . 

In this section, we quantify the impacts (performance improvement and variation) of each dimension (i.e., model selection, hyperparameter tuning, training TS length) on the predictive performance over our testing dataset (i.e., the last 7 or the last 28 days of data for each state, held out for validation), for the following 4 prediction tasks across five different regions: 7-day ahead forecasts on confirmed cases (7-C), 28-day ahead forecasts on confirmed cases (28-C), 7-day ahead forecasts on death cases (7-D), and 28-day ahead forecasts on death cases (28-D). First, we quantify the percentage that each dimension contributes to the performance improvement and variation in terms of Accuracy. Furthermore, we investigate the dimension that has the largest influence on the predictive performance. Finally, we analyze the relationship between performance improvement and variation through different evaluation indicators contributed by each dimension.

For the hyperparameter tuning and training TS length across different regions, we define the baseline settings as the set of parameters that achieves the average performance for each model, and the length of TS = 200, respectively. For model selection, we choose SEIR-HCD, which exhibits the overall median performance among the three models. Then we quantify the performance improvement in terms of the Accuracy score of each dimension. For the SARIMA model, there are some combinations of the parameters that make MLE fail to converge. Hence, we remove all of the forecasting results for which some of the data was missing. In consideration of the ranges of the Accuracy from negative infinity to 1 by definition, we disregard the scores with infinite values, and then we normalize the remaining valid Accuracy scores between 0 and 1. Fig. 2 shows the percentage that each dimension contributes to the improvement in the Accuracy score over baseline by tuning only one dimension at a time while leaving others at baseline settings for each prediction task. We observe that model selection provides the largest performance gain (27.38%, 27.96%, 27.15%, and 19.04% of averaged performance improvement in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions), followed by TS length (21.98%, 16.44%, 17.87%, and 14.79% of averaged performance improvement in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions) and hyperparameter tuning (15.71%, 10.33%, 11.96%, and 9.62% of averaged performance improvement in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions) in decreasing order of improvement.

To validate whether the Accuracy score is representative of performance across all the regions in all prediction tasks, we use the 6 other performance metrics described earlier to measure the degree of performance improvement of each dimension over all regions. We apply the same result processing to these metrics as to the Accuracy score: we remove invalid forecasting results and normalize the valid ones. For metrics other than Accuracy, the performance improvement is defined as the percentage reduction in error between the tuned and the baseline settings. Table VIII shows the averaged performance improvement of each single dimension across all regions on different tasks. The results in Table VIII indicate that model selection brings the biggest performance improvement regardless of the metric we use, followed by training TS length and hyperparameter tuning. In other words, the relative contribution to performance improvement of each individual dimension based on the other metrics is consistent with the Accuracy score, suggesting that the Accuracy score is a representative metric.

Next, we evaluate how much each dimension contributes to the performance variation in the Accuracy score. By tuning one dimension at a time while leaving others at baseline settings, we obtain a range of performance scores in terms of Accuracy. Performance variation is then defined as the difference between the maximum and the minimum score for each dimension. Higher variation in a single dimension implies that a poor choice in that dimension could produce remarkable performance loss. Fig. 3 shows the proportion of performance variation in the Accuracy score attributed to each individual dimension. We observe that model selection brings about the largest variation in performance (27.38%, 35.50%, 22.10%, and 19.23% of averaged performance variation in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions), followed by TS length (21.98%, 13.97%, 15.15%, and 12.87% of averaged performance variation in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions) and hyperparameter tuning (15.71%, 9.28%, 10.65%, and 8.75% of averaged performance variation in the Accuracy score on the respective 7-C, 28-C, 7-D, and 28-D forecasting tasks across all regions). The important takeaway here is that even though model selection is the largest contributor to performance improvement, if not carefully chosen, it can lead to larger performance degradation compared to other dimensions.
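The quantities used in this analysis can be illustrated with a small sketch. The exact aggregation behind the reported percentages is not specified here, so this is only one plausible reading: non-finite scores are dropped, the remaining scores are min-max normalised, improvement is the best score minus the baseline, and variation is the maximum minus the minimum. All function names and the example scores are hypothetical.

```python
import math


def normalise_scores(scores):
    """Drop non-finite scores, then min-max scale the rest into [0, 1]."""
    valid = [s for s in scores if math.isfinite(s)]
    low, high = min(valid), max(valid)
    if high == low:
        return [0.0 for _ in valid]
    return [(s - low) / (high - low) for s in valid]


def improvement(scores, baseline):
    """Best score reachable in one dimension minus the baseline score."""
    return max(scores) - baseline


def variation(scores):
    """Spread of scores within one dimension: maximum minus minimum."""
    return max(scores) - min(scores)


# Hypothetical normalised Accuracy scores when only the model is changed.
model_scores = {"SARIMA": 0.42, "SEIR-HCD": 0.55, "ACTS": 0.81}
gain = improvement(model_scores.values(), baseline=model_scores["SEIR-HCD"])
spread = variation(model_scores.values())
```

Because the variation is always at least as large as the improvement over any baseline inside the range, a dimension with a large potential gain necessarily carries a large potential loss when the worst option is chosen.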

To validate the representativeness of the Accuracy score across all regions in different prediction tasks, 6 other evaluation metrics are used to evaluate the performance variation on each dimension in consideration of all regions. Table IX shows the averaged performance variation of each single dimension across all regions on different tasks. The results suggest that the proportion of performance variation in different metrics of each dimension follows an order that is consistent with the performance improvement in Table VIII. Therefore, for different metrics, the dimension that brings about greater performance improvement may also contribute to larger variation in performance. At every step of COVID-19 forecasting, researchers should be aware of the trade-off between benefits (improvement in performance) and risks (variation in performance) when adjusting each dimension.

The COVID-19 pandemic has spread rapidly around the world. Reliable forecasting of the number of confirmed and death cases provides pertinent information to decision-makers about the expected situation and the prevention measures that need to be taken. Given the disparate impacts of social, economic, and environmental factors in different regions, we quantitatively analyze the predictive performance via several performance metrics over a wide range of configurations for COVID-19 case (confirmed and death) forecasting across different regions.

Our study focuses on understanding the relationship between predictive performance and performance variation for each individual dimension in common time series forecasting tasks.

There are a few key takeaways from our results. First, for time series forecasting, if a dimension brings more performance improvement, it is also likely to bring greater performance degradation when poor decisions are made in that dimension. Second, choosing the correct model is the most crucial step, with the largest impact on both performance improvement and variation.

Clearly, more forecasting models from diverse categories and different regions around the world should be considered to provide a more general conclusion. Other dimensions, such as training time and robustness to incorrect input data, are worth further exploration.

[1] An interactive web-based dashboard to track COVID-19 in real time

[2] Rationing intensive care-physician responses to a resource shortage

[3] Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil

[4] Data-based analysis, modelling and forecasting of the COVID-19 outbreak

[5] Forecasting the dynamics of COVID-19 pandemic in top 15 countries in April 2020: ARIMA model with machine learning approach

[6] Spatial prediction of COVID-19 epidemic using ARIMA techniques in India

[7] Predicting mortality rate and associated risks in COVID-19 patients

[8] Forecasting COVID-19 impact on RWI/ISL container throughput index by using SARIMA models

[9] Time series forecasting of COVID-19 transmission in Canada using LSTM networks

[10] Inter-series attention model for COVID-19 forecasting

[11] DeepCovid: Predicting COVID-19 from chest X-ray images using deep transfer learning

[12] Implications of heterogeneous SIR models for analyses of COVID-19

[13] SEIR modeling of the COVID-19 and its dynamics

[14] Epidemiological modeling in StochSS Live! Bioinformatics

[15] Model uncertainty, political contestation, and public trust in science: Evidence from the COVID-19 pandemic

[16] Validity and usefulness of COVID-19 models

[17] Epidemic analysis of COVID-19 outbreak and countermeasures in France

[18] Trend analysis and forecasting of COVID-19 outbreak in India

[19] Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-regressive integrated moving average (ARIMA) and seasonal auto-regressive integrated moving average (SARIMA)

[20] Predicting the COVID-19 epidemic in Algeria using the SIR model

[21] Epidemic model guided machine learning for COVID-19 forecasts in the United States

[22] DeepCovid: An operational deep learning-driven framework for explainable real-time COVID-19 forecasting

[23] Artificial intelligence forecasting of COVID-19 in China

[24] Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization

[25] SciPy 1.0: fundamental algorithms for scientific computing in Python

[26] Deep transformer models for time series forecasting: The influenza prevalence case

[27] Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting

[28] PyTorch: An imperative style, high-performance deep learning library

[29] Deep learning methods for forecasting COVID-19 time-series data: A comparative study

[30] Automated detection and forecasting of COVID-19 using deep learning techniques: A review