key: cord-0825451-5dqob1qv authors: Yan, Kejia; Yan, Huqin; Gupta, Rakesh title: The predicted trend of COVID-19 in the United States of America under the policy of “Opening Up America Again” date: 2021-06-06 journal: Infect Dis Model DOI: 10.1016/j.idm.2021.05.005 sha: 4be3b2a47409d7e2c9b750f70987ab986e111dd8 doc_id: 825451 cord_uid: 5dqob1qv The COVID-19 virus has been spreading worldwide for more than a year. At present, the situation of the COVID-19 epidemic remains tense and uncertain. Of particular concern is that the worst outbreak in the world is in the United States. The daily numbers of new confirmed cases and new deaths in the United States entered their second and third cyclical peaks after the White House announced the “Opening Up America Again” guidelines on April 16, 2020, and after the US presidential election season began in August 2020, respectively. This paper combines the generalized exponential model (EXPM) with Chebyshev polynomials to develop a special generalized growth model (GGM) to predict the total number of daily new confirmed cases and the total number of new deaths in the United States for three periods under a 14-day sensitivity regression model. In this paper, the US epidemic is divided into three periods from early January 2020 to early January 2021, and one forecast is made for each of the three periods. The first two prediction periods have already elapsed, and the predictions match the known results well. The forecast for the third period predicts that the total number of new confirmed cases of COVID-19 and the total number of new deaths in the United States will fall to a minimum level by July 2021, given that the supply of COVID-19 vaccine has already begun. The results suggest that the “Opening Up America Again” policy and the events of the 2020 US presidential election season have contributed to the worsening of COVID-19 in the United States.
Since human-to-human transmission of pneumonia was detected in Wuhan City in China at the end of December 2019, COVID-19, or coronavirus disease, has spread around the world with astonishing speed. On 13 January 2020, Thailand reported the first imported case of COVID-19 outside China. At the end of January 2020, the WHO and the US declared that the COVID-19 outbreak was a public health emergency. On 2 February 2020, the Philippines reported the first death from COVID-19 outside China. On 20 February 2020, South Korea became the first country outside China to confirm more than one hundred cases of COVID-19. Six days later, on 26 February 2020, South Korea confirmed more than one thousand cases. Fourteen days later, on 11 March 2020, confirmed cases in Italy exceeded 10,000, which made Italy the first country outside China to reach 10,000 cases. Twenty days later, on 31 March 2020, the number of confirmed cases in Italy exceeded 100,000. One month later, more than 1 million cases had been confirmed in the United States of America (WHO, 2020). Coronaviruses are positive-sense enveloped RNA viruses that belong to the Coronaviridae family, and they can infect birds, mammals and humans, causing acute resolved or fatal pneumonia (Beck et al., 2020; Cavanagh, 2007; Lim et al., 2016; Weiss and Leibowitz, 2011). The Coronaviridae family also includes two other well-known coronaviruses, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) (Kuiken et al., 2003; Zaki et al., 2012). The current mortality rate of COVID-19 is around 2.20% (based on WHO data), which is lower than that of SARS-CoV (9.6%) and MERS-CoV (34.4%) (WHO, 2004; WHO, 2019). However, COVID-19 has spread far more widely: SARS-CoV and MERS-CoV infected more than 10,000 people, whereas, as of 3 January 2021, COVID-19 had officially infected 83,326,479 people and caused 1,831,703 deaths (WHO, 2021).
The main reasons for this overwhelming increase are that the coronavirus spreads much faster than people expect, and that the US is suffering from a severe lack of preparedness for an epidemic (WHO, 2004; WHO, 2019; Marzia et al., 2020). By 18 May 2020, New York, the most affected state, had reported 361,266 confirmed cases with 28,480 deaths (CDC, 2020). New York's mortality rate at that time was over 7.88%, higher than the world average of 6.75%. This implies that medical resources were not optimally allocated in New York. However, at a time when the number of new confirmed cases per day had reached more than 20,000 and the number of deaths per day had reached more than 1,000, Americans were being urged to return to work because of a looming economic recession. According to the US Department of Labor (U.S. Bureau of Labor Statistics report, 2020), the unemployment rate jumped to 14.7% in April, the highest on record. While it is certain that the first turning point in the total number of confirmed new cases and new deaths in the United States occurred in April and early May, the premature "Opening Up America Again" policy allowed the epidemic to rebound into a full-scale outbreak, the second peak. Policy confusion and social unrest are important factors in the resurgence of COVID-19 in many states. As elections approach, incumbents who can run for re-election impose less stringent epidemic prevention restrictions, particularly in the US and India (Pulejo and Querubín, 2021). Protesters and crowds gathered for the presidential election campaign not only exacerbated social rifts, but also directly increased the difficulty of controlling the outbreak. When people are concentrated in large numbers in crowded and inadequately ventilated environments, large-scale outbreaks are inevitable (Gonsalves and Yamey, 2020; Ran et al., 2021).
Indeed, there is an urgent need for a COVID-19 vaccine, and for accurate prediction of the trend of the outbreak. People fear the virus not only because a vaccine for COVID-19 is not yet available, but also because the final scale of the disease for the forthcoming period has not been appropriately forecast. Without a relatively accurate estimate of the range of the final scale of the coronavirus disease, policymakers and the medical system will be unable to allocate medical resources to maximum effect, and the economic recovery will be delayed even further. In order to provide guidance for policymakers and the medical system on the prevention and control of this pandemic, this study analyses the COVID-19 data for the United States of America, as released by WHO, using Chebyshev polynomial functions to construct generalized exponential models of the trend of coronavirus disease transmission under different growth patterns. Moreover, this study forecasts the scale of the coronavirus disease in the US for the next 180 days, up to 1 July 2021, and finds that the curves of the proposed model simulate the US's official data curves well. The empirical data on the coronavirus disease (COVID-19) were taken from the World Health Organization's situation reports (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/), and Microsoft Excel and the Python programming language were used to build and analyse the time-series database. The World Health Organization released its first outbreak report on the novel coronavirus (2019-nCoV) on January 21, 2020. Since then, WHO has issued daily or weekly COVID-19 outbreak reports. In the US, the first five new cases of the coronavirus pandemic were confirmed on 20 January 2020. In this paper, the total number of daily new confirmed cases was selected from 20 January 2020 to 3 January 2021, giving 350 observations.
By 3 January 2021, the total number of confirmed cases in the US had reached 19,974,413. The curve of new confirmed cases of COVID-19 in the United States to date can be divided into three periods: period 1 ran from January to May 2020, with a peak incidence of 35,386 cases on 11 April 2020; period 2 ran from June to September 2020, with a peak incidence of 74,354 cases on 19 July 2020; period 3 began in October 2020, with a peak incidence of 402,270 cases on 20 December 2020. In the US, the first two deaths from COVID-19 were confirmed on 3 March 2020. The death data in this paper are the total numbers of new deaths per day between 3 March 2020 and 3 January 2021, a total of 307 observations. By 3 January 2021, total deaths from COVID-19 in the US had reached 345,253. The curve of new deaths from COVID-19 in the US can likewise be clearly divided into three periods: the first period runs from March to June 2020, with a peak of 6,409 deaths on 17 April; the second period runs from July to October 2020, with a peak of 1,486 deaths on 14 August; and the third period starts in November 2020, with the highest peak of 5,703 deaths on 1 January 2021. The second peaks, which occurred in July 2020 for new confirmed cases and in August 2020 for new deaths, might have been caused by Donald Trump's "Opening Up America Again" policy. On 16 April 2020, US President Donald Trump announced a plan to reopen the virus-ravaged U.S. economy, published guidelines for the reopening, and asked states to take a phased approach to letting Americans return to work as conditions allow.
Trump's guidelines (Whitehouse, 2020) include a three-stage prescription for restarting normal life in America: in Phase 1, restaurants, movie theaters, sports venues, and places of worship would be allowed to reopen and resume work if possible; in Phase 2, schools could reopen and non-essential travel could resume, but most employees would be encouraged to continue to work remotely; and in Phase 3, workplaces would be reopened to staff without restriction. The rush to return to work without effective disease control methods caused this spike in infections. The third peaks, in new confirmed cases in December 2020 and in new deaths in January 2021, may be due to the US presidential election season, which the United States entered in August 2020. To observe the three cyclical trend changes in new confirmed cases and new deaths, regression models are built period by period in this paper. The basic empirical data include the total number of new confirmed cases and the total number of new deaths. Since total confirmed cases and total deaths can be accumulated from the basic empirical data, this paper focuses on building regression models for new confirmed cases and new deaths, from which the predicted values of total confirmed cases and total deaths are obtained. The prediction period runs from 4 January 2021 to 2 July 2021 and lasts 180 days. For comparison, this paper not only predicts the total number of new confirmed cases, the total number of new deaths, the total number of confirmed cases, and the total number of deaths, but also compares the empirical data with the predicted values. The generalized exponential model provides an upper bound for future scenarios.
This model has been used to characterize and forecast early epidemic growth patterns across a diversity of disease outbreaks, such as Ebola, foot-and-mouth disease, HIV/AIDS, influenza, measles, plague and smallpox (Viboud et al., 2016; Wu et al., 2020). If $Y_t$ represents a growth function and $r_t$ is a relative rate function at time $t$, and if the pair $(Y_t, r_t)$ satisfies an ordinary differential equation (ODE), then the model is suitable for describing a growth process (Koya and Goshu, 2013). A generalized growth model (GGM) can thus be defined as:

$$Y'_t = \frac{dY_t}{dt} = r\,Y_t^{\,p} \qquad (1)$$

where $Y'_t$ describes the incidence growth phase over time $t$; $Y_t$ describes the cumulative number of disease cases at time $t$; $r$ ($r > 0$) is the growth rate, which controls the characteristic time scale of the dynamics; and $p \in [0,1]$ is the exponent that allows the model to capture different growth conditions, from constant incidence to exponential growth, in three cases. Case 1: for a constant incidence, $p = 0$. Case 2: for sub-exponential growth, $0 < p < 1$, the solution with initial value $Y_0$ is:

$$Y_t = \left[\, r\,(1-p)\,t + Y_0^{\,1-p} \right]^{\frac{1}{1-p}} \qquad (2)$$

Case 3: for exponential growth, $p = 1$. The GGM then reduces to the exponential growth model (EXPM) with the solution:

$$Y_t = Y_0\, e^{rt} \qquad (3)$$

Transmission rates will decrease once proper epidemiological control and treatment are in place. At such a time, logistic-type models, such as the generalized logistic model (GLM) and the generalized Richards model (GRM), are used to characterize the behaviour of the outbreak. Compared with the GGM model which has only one free parameter $r$, logistic-type models have more free parameters (Chowell, 2017). For example, a GRM model can be written as:

$$\frac{dY_t}{dt} = r\,Y_t^{\,p}\left[\,1 - \left(\frac{Y_t}{K}\right)^{a}\right] \qquad (4)$$

where the parameter $r$ represents the intrinsic growth rate, $K$ represents the size of the epidemic, and $a$ represents the extent of deviation. As Burger et al. (2019) summarized, differential equations can be categorized by the selection of the parameters $r$, $p$, $a$ and $K$.
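The three GGM growth regimes described above can be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates the standard closed-form solution of dY/dt = r·Y^p and cross-checks it against a simple forward-Euler integration. The function names and the parameter values (r = 0.2, Y0 = 5, t = 10) are illustrative assumptions.

```python
import math


def ggm_cumulative(t, r, p, y0):
    """Closed-form solution of dY/dt = r * Y**p with Y(0) = y0 > 0."""
    if p == 1.0:
        # Case 3: exponential growth (EXPM).
        return y0 * math.exp(r * t)
    # Covers Case 1 (p = 0, constant incidence) and Case 2 (0 < p < 1).
    return (r * (1.0 - p) * t + y0 ** (1.0 - p)) ** (1.0 / (1.0 - p))


def ggm_euler(t_end, r, p, y0, steps=100_000):
    """Forward-Euler integration of the same ODE, used only as a cross-check."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * r * y ** p
    return y


# Illustrative parameter values (assumptions, not estimates from the paper).
for p in (0.0, 0.5, 1.0):
    exact = ggm_cumulative(10.0, 0.2, p, 5.0)
    approx = ggm_euler(10.0, 0.2, p, 5.0)
    print(f"p={p}: closed form={exact:.4f}, Euler={approx:.4f}")
```

The agreement between the two columns illustrates how the single exponent p moves the model from linear growth of cumulative cases to fully exponential growth.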
To simplify the equation, when = 1, then a GRM model becomes a generalized growth model (GGM). It is first essential to determine the value of the prerequisite parameter $K$ in the analysis. However, parameter $K$, as the final asymptotic total number of infections during the whole epidemic, is very hard to predict for COVID-19 at this stage. Similarly, an appropriate time scale $r$ for the epidemic's growth process is very hard to set given the uncertainty of epidemic prevention policy while the White House is eagerly engaged in "Opening Up America Again". The volatility of the numbers of new confirmed cases and new deaths has been very high since then. Thus, in order to avoid estimating the growth rate $r$, we combine the exponential growth model (EXPM) with a polynomial function to forecast the early epidemic growth patterns. Assume the variable $x_t$ is defined as a continuous function of the time variable $t$:

$$x_t = 1 + t, \quad \text{or} \quad t = x_t - 1 \qquad (5)$$

The function $Y_x$ can be defined as:

$$Y_x = a_0 + a_1 x + a_2 x^2 + \cdots + a_q x^q = a_0 + a_1 (1+t) + a_2 (1+t)^2 + \cdots + a_q (1+t)^q \qquad (6)$$

Assume that the variable $y_t$ is a COVID-19 related variable, such as the number of new confirmed cases or new deaths. To avoid a zero value of the variable, $y_t$ may be expressed through a logarithm function. We define the variable $Y_t$ as:

$$Y_t = \ln y_t \qquad (7)$$

Combining equations (6) and (7) gives:

$$\ln y_t = a_0 + a_1 (1+t) + a_2 (1+t)^2 + \cdots + a_q (1+t)^q \qquad (8)$$

For a discrete time system, if the empirical sample time is $i \in \{0, 1, 2, \ldots, n\}$, $n < N$, and $x_i = 1 + i$, then:

$$\ln y_i = a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_q x_i^q + \varepsilon_i \qquad (9)$$

Or it can be directly represented by the time variable $i$ as:

$$\ln y_i = a_0 + a_1 (1+i) + a_2 (1+i)^2 + \cdots + a_q (1+i)^q + \varepsilon_i \qquad (10)$$

Here the variable $\varepsilon_i$ is the residual, or error, of the regression. It is the difference between the observed value and the predicted value. To improve the accuracy of the calculation, the function $Y_x$ will use a Chebyshev polynomial.
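The log-transformed polynomial regression described above can be illustrated with a small, dependency-free sketch. It is an assumption-laden illustration rather than the authors' implementation: time indices are mapped linearly onto [-1, 1] (the natural Chebyshev domain) instead of using Chebyshev roots, the fit uses ordinary least squares through the normal equations, the data are synthetic, and all function names are hypothetical. Every observation must be strictly positive for the logarithm to exist.

```python
import math


def to_unit_interval(i, n):
    """Map sample index i in 0..n to x in [-1, 1] (assumed scaling)."""
    return 2.0 * i / n - 1.0


def chebyshev_basis(x, q):
    """Return [T_0(x), ..., T_q(x)] via T_{k+1} = 2x*T_k - T_{k-1}."""
    values = [1.0, x]
    for _ in range(2, q + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: q + 1]


def solve_linear(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, size))
        z[r] = (aug[r][size] - tail) / aug[r][r]
    return z


def fit_log_chebyshev(y, q):
    """Least-squares fit of ln(y_i) on a degree-q Chebyshev basis."""
    n = len(y) - 1
    rows = [chebyshev_basis(to_unit_interval(i, n), q) for i in range(n + 1)]
    logs = [math.log(v) for v in y]  # requires y_i > 0
    gram = [[sum(row[j] * row[k] for row in rows) for k in range(q + 1)]
            for j in range(q + 1)]
    rhs = [sum(row[j] * lv for row, lv in zip(rows, logs)) for j in range(q + 1)]
    return solve_linear(gram, rhs)


def predict(coeffs, i, n):
    """Evaluate the fitted model at index i and back-transform with exp."""
    basis = chebyshev_basis(to_unit_interval(i, n), len(coeffs) - 1)
    return math.exp(sum(c * b for c, b in zip(coeffs, basis)))


# Synthetic series (hypothetical, not the WHO data): exp of a quadratic in x.
n = 30
truth = [math.exp(3.0 + 1.5 * to_unit_interval(i, n)
                  - 0.8 * to_unit_interval(i, n) ** 2) for i in range(n + 1)]
coeffs = fit_log_chebyshev(truth, q=2)
print([round(c, 4) for c in coeffs])
print(predict(coeffs, 10, n), truth[10])
```

Normal equations are adequate for the low degree used here; for the high polynomial orders reported in this paper, a numerically more stable solver (for example an orthogonal decomposition) would be the safer choice.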
When the time points $i$ are used as the Chebyshev polynomial roots within the full time period, $x_i^q$ can be replaced by the $q$-th basic Chebyshev polynomial. Because it is easy to transform a Chebyshev polynomial into a traditional polynomial, we assume there is no big difference between the two polynomial types. Time series models are significant in predicting the impact of the COVID-19 outbreak and in taking the necessary measures to respond to this crisis (Ceylan, 2020). In dealing with time series data, many researchers have applied the autoregressive integrated moving average (ARIMA) model to predict the trend of epidemic outbreaks. Ribeiro et al. (2020) used an ARIMA model to forecast the short-term trend of COVID-19 cumulative confirmed cases in Brazil, and they suggest that a 5-day lag may have a noticeable impact on regression results. Chakraborty and Ghosh (2020) used ARIMA models to forecast the COVID-19 trend for Canada, France, India, South Korea, and the UK, and they suggest that the lags may be 1 or 2 days. Ahmar et al. (2020) predicted the short-term effects of confirmed cases of COVID-19 in Spain by using the Sutte ARIMA method, and suggest that the lags may be 0, 1, or 2 days. Ceylan (2020) used an ARIMA model to predict the epidemiological trend of COVID-19 prevalence in Italy, Spain, and France, and suggests that the lags may be 0 or 1 day. Because the ARIMA model depends on the order of time series stationarity, it is not suitable for analysing longer lags of time variables in a regression. Unlike the ARIMA models, we analyse regression models independently when the empirical window is shortened relative to the full sample window over a 14-day range of observations. The reason for choosing 14 days is that the effects of isolation and social distancing take 14 days to show up, just as "Opening Up America Again" requires a time lag before the impact of the policy on the epidemic becomes visible.
To analyze how well the regression models fit a country's epidemic outbreaks, the root mean squared error (RMSE) is usually used as an assessment (Burger et al., 2019; Ceylan, 2020). Many other similar assessments, including the mean absolute error (MAE) and the mean absolute percentage error (MAPE), are used to evaluate the effectiveness of adopted models (Ribeiro et al., 2020; Chakraborty and Ghosh, 2020). We follow the same approach and, in addition, consider the values of the coefficient of determination ($R^2$). Figure 1 shows the structure of the time windows for forecasting. The size of the full sample window is $N$, and it is composed of three sub-windows: an empirical sample window, a test window and a forecasting window. The size of the empirical sample window is $n$. The regression model is based on the empirical sample window. At the time points $i \in \{0, 1, 2, \ldots, n\}$, the COVID-19 related variable $y_i \in \{y_0, y_1, \ldots, y_n\}$ is introduced into the regression model. At the time points $i \in \{n+1, \ldots, N\}$, the COVID-19 related variable $y_i \in \{y_{n+1}, \ldots, y_N\}$ is the prediction value from the regression model. The size of the test window is $m$. Since the total numbers of new confirmed cases and new deaths in the last 14 days strongly influence the regression models and the predicted future trends, this paper performs a sensitivity test by examining the effect of the test window on the prediction for $m = 0, 1, 2, \ldots, 13$. The size of the expected forecasting window is NN. Since the purpose of this paper is to test the prediction results for three time periods, different time windows are chosen. Table 1 lists the structure of the three empirical sample windows and three forecasting windows for the total new confirmed cases in the US. Table 2 lists the structure of the three empirical sample windows and three forecasting windows for the total new deaths in the US.
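The fit assessments named above (RMSE, MAE, MAPE and the coefficient of determination) and the role of the test window can be written down compactly. The sketch below is illustrative only: the function names are hypothetical, the four-point series is made up, and the reading of the test window as "drop the last m observations before fitting" is an assumption about the procedure rather than something the text states explicitly.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error; assumes no observed value is zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def empirical_window(series, m):
    """Return the series without its last m observations (assumed test window)."""
    if not 0 <= m <= 13:
        raise ValueError("m is expected to lie in 0..13")
    return series[: len(series) - m]


# Made-up numbers, used only to exercise the functions.
observed = [100.0, 120.0, 150.0, 170.0]
predicted = [110.0, 115.0, 140.0, 180.0]
print(rmse(observed, predicted), mae(observed, predicted))
print(mape(observed, predicted), r_squared(observed, predicted))

# One regression sample per test-window size m = 0, 1, ..., 13.
daily_series = list(range(1, 41))
samples = [empirical_window(daily_series, m) for m in range(14)]
print([len(s) for s in samples])
```

Fitting the same model to each of the 14 truncated samples yields the 14 prediction scenarios that the sensitivity analysis compares.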
The structure of the empirical analysis is the same for all three periods. Figure 1: The structure of the time windows for empirical analysis and future forecasting. The coronavirus pandemic has swept through every country in the world, day after day, for over a year. It is difficult to predict the exact number of new confirmed cases or new deaths that will occur today or tomorrow. For the purposes of this research, we determine an appropriate value of $q$ by using the following assessments. First, a value of $q$ is selected if it yields a model with a greater coefficient of determination $R^2$ than the other models as $q$ is varied, subject to the condition q ≤ 9 n − m − 1 . Second, a value of $q$ is selected if it yields a model with smaller values of the root mean square error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE) than the other models as $q$ is varied. Third, because the degree of freedom is also the power of the independent variable, we choose the lower value of $q$ if the values of $R^2$ or $R$ are the same. Fourth, for a Chebyshev polynomial the degree of accuracy increases as the power $q$ increases, so we choose a reasonable value of $q$ by observing the trend of the curve: the prediction value must not fall below zero or approach infinity. In order to forecast the total numbers of new confirmed cases likely to occur every day, based on the period between 20 January 2020 and 19 May 2020, a suitable polynomial power $q$ needs to be found. After testing the models for each value of $q$ across different degrees of freedom, $q = 29$ was found to be the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$.
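The four selection rules for the polynomial power q can be sketched as a small filter-and-rank routine. This is a simplified illustration under stated assumptions: only R² and RMSE are ranked (MAE and MAPE could be appended to the sort key in the same way), the candidate records and their metric values are invented for the example, and the function name `select_degree` is hypothetical.

```python
import math


def select_degree(candidates):
    """Pick a candidate model following the selection rules described above.

    Each candidate is a dict with keys "q", "r2", "rmse" and "predictions".
    """
    # Rule 4: discard models whose forecasts go negative or blow up.
    admissible = [
        c for c in candidates
        if all(math.isfinite(v) and v >= 0 for v in c["predictions"])
    ]
    if not admissible:
        raise ValueError("no admissible model")
    # Rules 1-3: larger R^2 first, then smaller RMSE, then lower degree q.
    return min(admissible, key=lambda c: (-c["r2"], c["rmse"], c["q"]))


# Invented metric values, used only to demonstrate the ranking logic.
candidates = [
    {"q": 3, "r2": 0.9512, "rmse": 0.412, "predictions": [10.0, 12.0]},
    {"q": 4, "r2": 0.9634, "rmse": 0.355, "predictions": [11.0, 13.0]},
    {"q": 5, "r2": 0.9634, "rmse": 0.355, "predictions": [11.0, 14.0]},
    {"q": 6, "r2": 0.9701, "rmse": 0.301, "predictions": [9.0, -4.0]},
]
best = select_degree(candidates)
print(best["q"])
```

In this toy input the degree-6 model has the best in-sample metrics but is rejected for producing a negative forecast, and the tie between degrees 4 and 5 is resolved in favour of the lower degree.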
Table 3 lists the results of testing the degree of freedom in the regression. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models. Once the power of the polynomial $q$ is defined, the sensitivity to the window size is tested for $m = 0, 1, 2, \ldots, 13$. The sample data from within the last 14 days strongly influence the regression models used to forecast future trends. Table 4 lists the test results obtained with the selected 29-order Chebyshev polynomial based on the total confirmed new cases for period one. With the power of the polynomial defined as $q = 29$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. For all 14 models, the values of RMSE, MAE, and MAPE are quite small and the values of $R^2$, $R$ and Adj. $R^2$ are quite large. Figure 2 depicts the prediction curves of the total confirmed new cases from the selected 29-order Chebyshev polynomial for period one when $m = 0, 1, 2, \ldots, 13$. After the first five new COVID-19 cases were confirmed in the US on 20 January 2020, the first daily peak of total confirmed new cases was 35,386 cases, which occurred on 11 April 2020. After a decrease in May and an increase in June, the second daily peak of new confirmed cases was 74,354 cases, which occurred on 19 July 2020. It is clear that all 14 prediction curves cover the real rebound trend (reference sample) of the total new confirmed cases from June to July 2020. When we compare the 14 prediction curves from the 14 regression models, we see that the lowest curve is from the prediction model when m = 4, and the highest curve is from the prediction model when m = 8. The other 12 prediction curves from the models m = 1,2, 3, 4, 6, 7, 8, 9, 10, 11, 12 are located in the middle between the two curves of m = 0,13 .
The distribution of the curves for $m = 0$ to $13$ reflects the fact that the pandemic rebound began as more U.S. states responded to "Opening Up America Again". September 2020. It can be seen that the mean predicted total number of confirmed cases in period one is very close to the total number of confirmed cases in the reference sample. For the second period, the empirical sample window is between 20 January 2020 and 3 September 2020, and the forecasting window is between 4 September 2020 and 3 January 2021. Following the approach of the first prediction, we make the optimal choice of model from the original data in Tables 5 and 6. Table 5 lists the test results for choosing the best model of the Chebyshev polynomial based on the total confirmed new cases during period two. The parameter $q = 69$ is the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$. Table 6 lists the test results obtained with the selected 69-order Chebyshev polynomial based on the total confirmed new cases for period two. With the power of the polynomial defined as $q = 69$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. For all 14 models, the values of RMSE, MAE, and MAPE are quite small and the values of $R^2$, $R$ and Adj. $R^2$ are quite large. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models. Figure 4 depicts the prediction curves of the total confirmed new cases from the selected 69-order Chebyshev polynomial for period two when $m = 0, 1, 2, \ldots, 13$. The second daily peak of total confirmed new cases in the US was 74,354 cases, which occurred on 19 July 2020. After the second peak, the total number of new confirmed cases showed a downward trend in August and an upward trend after August.
By 20 December 2020, the daily total of confirmed new cases of COVID-19 in the US had risen to a peak of 402,270 cases. It is clear that all 14 prediction curves reveal the real rebound trend (reference sample) of the total confirmed new cases from September to December 2020. Given the high accuracy of the first two forecasts, for the third period the empirical sample window is between 20 January 2020 and 3 January 2021, and the forecasting window is between 4 January 2021 and 2 July 2021. Similarly, the optimal choice of model for the Chebyshev polynomial is given in Tables 7 and 8. The parameter $q = 66$ is the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$. Table 8 lists the test results obtained with the selected 66-order Chebyshev polynomial based on the total confirmed new cases for period three. With the power of the polynomial defined as $q = 66$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. For all 14 models, the values of RMSE, MAE, and MAPE are quite small and the values of $R^2$, $R$ and Adj. $R^2$ are quite large. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models. Figure 6 depicts the prediction curves of the total confirmed new cases from the selected 66-order Chebyshev polynomial for period three when $m = 0, 1, 2, \ldots, 13$. The third daily peak of total confirmed new cases in the US was 402,270 cases, which occurred on 20 December 2020. From the prediction results, we can see that the predicted values of total confirmed new cases of COVID-19 in the US continue to increase when $m = 7, 8, 9, 10, 11, 12, 13$, or turn to decrease when $m = 0, 1, 2, 3, 4, 5, 6$. Figure 7 depicts the prediction curves of the total confirmed cases of COVID-19 from the selected 66-order Chebyshev polynomial model for period three.
On 1 January 2021, the sample total of confirmed cases of COVID-19 in the US was 19,346,790. When $m = 12$ or $m = 13$, the predicted total confirmed cases of COVID-19 in the US reach the highest values, 106,109,318 or 112,300,253 cases, by 1 July 2021. Since the COVID-19 vaccine has been mass-produced and the U.S. presidential election is over, there is reason to be optimistic and to select the models with the lowest predicted values, $m = 1$ and $m = 2$, as the best models to predict the future values of total confirmed new cases and total confirmed cases of COVID-19 in the US. According to the prediction results, the total confirmed new cases of COVID-19 in the US will decrease and reach their lowest level on 1 July 2021. When $m = 1$ and $m = 2$, the predicted range of total confirmed new cases of COVID-19 in the US is between 3,028 and 21,388 daily. Moreover, the predicted total confirmed cases of COVID-19 in the US will have reached between 38,102,565 and 40,271,338 cases by 1 July 2021.

For new deaths in the first period, the empirical sample window is between 3 March 2020 and 19 May 2020, and the forecasting window is between 20 May 2020 and 2 September 2020. Table 9 lists the test results for choosing the best model of the Chebyshev polynomial based on the total new deaths during period one. The parameter $q = 19$ is the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$. Table 10 lists the test results obtained with the selected 19-order Chebyshev polynomial based on the total new deaths for period one. With the power of the polynomial defined as $q = 19$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models.
Figure 8 depicts the prediction curves of the total new deaths from the selected 19-order Chebyshev polynomial for period one when $m = 0, 1, 2, \ldots, 13$. After the first two COVID-19 deaths were confirmed in the US on 3 March 2020, the first daily peak of total new deaths was 6,409, which occurred on 17 April 2020. After a decrease in May and an increase in June and July, the second daily peak of total new deaths was 1,486, which occurred on 14 August 2020. Figure 9 shows the prediction curves of the total deaths of COVID-19 from the selected 19-order Chebyshev polynomial model for period one. It is clear that the curve of the sample total deaths (reference sample) is almost covered by the prediction curve of $m = 11$, which lies between the prediction curves of $m = 12, 13$ and the prediction curves of $m = 0, 5, 6$. The reference sample was 182,162 deaths on September 1, 2020, while the predicted number of deaths for $m = 11$ was 191,534, so the predicted curve of deaths was close to the total sample deaths. For the second period, the empirical sample window is between 3 March 2020 and 3 September 2020, and the forecasting window is between 4 September 2020 and 3 January 2021. Table 11 lists the test results for choosing the best model of the Chebyshev polynomial based on the total new deaths during period two. The parameter $q = 35$ is the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$. Table 12 lists the test results obtained with the selected 35-order Chebyshev polynomial based on the total new deaths for period two. With the power of the polynomial defined as $q = 35$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models.
Figure 10 depicts the prediction curves of the total new deaths from the selected 35-order Chebyshev polynomial for period two when $m = 0, 1, 2, \ldots, 13$. In the US, after the second peak on 14 August 2020, total new deaths tended to decrease in September and October and to increase in November and December 2020. By 1 January 2021, the daily total of new deaths from COVID-19 in the US had risen to 5,703. It is evident that all 14 predicted curves cover the area of the total number of new deaths in the sample from September to December 2020. Figure 11 depicts the prediction curves of the total deaths of COVID-19 from the selected 35-order Chebyshev polynomial model for period two. Significantly, the total sample mortality curve (reference sample) lies between the predicted curves with the lowest and highest predicted values, obtained when $m = 1$ and $m = 6$. The mean predicted value of total deaths as of January 1, 2021 is 320,373, which is very close to the reference sample total of 335,789 deaths. As the accuracy of the first two forecasts is high, for the third period the empirical sample window is between 3 March 2020 and 3 January 2021, and the forecasting window is between 4 January 2021 and 2 July 2021. Likewise, the optimal model choice for the Chebyshev polynomial is shown in Tables 13 and 14. The parameter $q = 63$ is the best choice, with the smallest values of RMSE, MAE, and MAPE and the largest values of $R^2$, $R$ and Adj. $R^2$. Table 14 lists the test results obtained with the selected 63-order Chebyshev polynomial based on the total new deaths for period three. With the power of the polynomial defined as $q = 63$, the sensitivity analysis window size is tested for $m = 0, 1, 2, \ldots, 13$. Notes: (1) * indicates that the Chebyshev polynomial is the best model; (2) the listed models are representatives from the set of models.
Figure 12 depicts the prediction curves of the total confirmed new cases from the selected 63-order Chebyshev polynomial for period three when m = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13. The third daily peak of total new deaths in the US, 5,703, occurred on 1 January 2021. From the prediction models, we can see that the predicted total new deaths of COVID-19 in the US will continue to increase when m = 0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or turn to decrease when m = 1, 2. Figure 13 depicts the prediction curves of the total deaths of COVID-19 from the selected 63-order Chebyshev polynomial model for period three. On 1 January 2021, the sample total deaths of COVID-19 in the US was 335,789. By 1 July 2021, when m = 12 or m = 13, the predicted total deaths of COVID-19 in the US will reach the highest values of 1,654,950 or 1,941,782, respectively, which are more than four times the sample total deaths on 1 January 2021. Now that the COVID-19 vaccine has been mass produced and the US presidential election is over, we have reason to be optimistic in choosing among the model predictions, i.e. m = 1, 2 give the best models to predict the future value of total confirmed new cases of COVID-19 in the US. According to the prediction models for m = 1, 2, the total new deaths of COVID-19 in the US will decrease and reach the lowest level in July 2021. When m = 2 or m = 1, the predicted total new deaths of COVID-19 in the US will reach 1,958 or 3,348, respectively, on 1 July 2021, and the predicted total deaths of COVID-19 in the US will reach between 771,862 and 930,619 on 1 July 2021.

At present the COVID-19 situation remains tense and uncertain. There is great concern that the COVID-19 pandemic in the US is the worst in the world.
Worse still, while new confirmed cases exceeded 20,000 per day and deaths exceeded 1,000 per day, Americans were heading back to work at a time of deep economic recession. After the White House published the "Opening Up America Again" guidelines on 16 April 2020, total confirmed new cases and total new deaths entered their second periodic peaks in July 2020 and August 2020. Furthermore, after the 2020 US presidential election season began in August 2020, total confirmed new cases and total new deaths entered their third periodic peaks in December 2020 and January 2021. It is a matter of great urgency that the final scale of the disease in the forthcoming period be forecast appropriately. A reliable forecast is very important for policymakers to allocate medical resources and pace the economic recovery appropriately. The generalized exponential model (EXPM), a special generalized growth model (GGM), has been used to characterize and forecast the early epidemic growth patterns across a diversity of outbreaks of many diseases because of its simplicity and reliable performance (Wu et al., 2020). This paper combines the EXPM with the Chebyshev polynomial to build a special GGM that predicts the daily numbers of total new confirmed cases and total new deaths in the US for three periods under a 14-day sensitivity regression pattern. The current and future situations of the COVID-19 pandemic in the US are presented as 14 scenarios (m = 0 to 13). This study provides a range of predictions about the prevalence of COVID-19 in the US under the uncertain effects of "Opening Up America Again" and the 2020 US presidential election season.
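The idea of presenting the forecast as 14 scenarios can be sketched as follows. The code rests on explicit assumptions that the text does not confirm: scenario m is taken to mean refitting after dropping the last m days of the sample, the exponential model is taken to be the exponential of a Chebyshev series in time, and the data are synthetic. The function `scenario_forecasts` and its parameters are hypothetical names.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def scenario_forecasts(cumulative, degree, horizon, max_trim=13):
    """Return {m: forecast array} for m = 0..max_trim.

    Assumption: scenario m drops the last m observations, fits a Chebyshev
    series to log(cumulative), and extrapolates `horizon` days past the
    end of the full sample.
    """
    cumulative = np.asarray(cumulative, dtype=float)
    t_future = np.arange(cumulative.size, cumulative.size + horizon, dtype=float)
    forecasts = {}
    for m in range(max_trim + 1):
        train = cumulative[: cumulative.size - m]
        t_train = np.arange(train.size, dtype=float)
        series = Chebyshev.fit(t_train, np.log(train), degree)
        forecasts[m] = np.exp(series(t_future))   # back-transform to counts
    return forecasts


# Synthetic cumulative counts (assumption): log-quadratic growth over 150 days.
days = np.arange(150, dtype=float)
cumulative = np.exp(7.0 + 0.05 * days - 0.0001 * days ** 2)

forecasts = scenario_forecasts(cumulative, degree=2, horizon=30)

# Envelope across the 14 scenarios: lowest and highest forecast per day.
stacked = np.vstack([forecasts[m] for m in sorted(forecasts)])
lower, upper = stacked.min(axis=0), stacked.max(axis=0)
print(f"day-30 forecast range: {lower[-1]:.0f} to {upper[-1]:.0f}")
```

On noise-free data of this kind the scenarios coincide. With real data and high polynomial orders, extrapolation is far more sensitive to the training window, which is consistent with the wide spread between the lowest and highest period-three predictions reported above.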
The prediction results for period one and period two have confirmed that the prediction curves cover the curves of the sample total new confirmed cases and the sample total new deaths. Based on the prediction results for period three, from an optimistic perspective that takes the COVID-19 vaccine into account, the total confirmed new cases and total new deaths of COVID-19 in the US will decrease and reach the lowest level in July 2021; by 1 July 2021, the daily total confirmed new cases will decrease to 3,028, the daily total new deaths will decrease to 1,958, the total confirmed cases will increase to 38,102,565, and the total deaths will increase to 771,862.

SutteARIMA: Short-term forecasting method, a case: COVID-19 and stock market in Spain
Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model
Comparative analysis of phenomenological growth models applied to epidemic outbreaks
Coronavirus avian infectious bronchitis virus
CDC, 2020. Coronavirus Disease (COVID-19) Cases in the U.S.
Available on May 17, 2020 at the website
Estimation of COVID-19 prevalence in Italy
Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: A data-driven analysis
Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts
Political interference in public health science during covid-19
Comparing COVID-19 and the 1918-19 influenza pandemics in the United Kingdom
Solutions of rate-state equation describing biological growths
Human coronaviruses: a review of virus-host interactions
Delayed access or provision of care in Italy resulting from fear of COVID-19
Electoral concerns reduce restrictive measures during the COVID-19 pandemic
The changing patterns of COVID-19 transmissibility during the social unrest in the United States: A nationwide ecological study with a before-and-after comparison
Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil
Coronavirus pathogenesis
Guidelines: Opening up America again
Summary of probable SARS cases with onset of illness from 1
Middle East respiratory syndrome coronavirus (MERS-CoV).
World Health Organization Coronavirus disease (COVID-2019) situation reports.
World Health Organization Coronavirus disease (Covid-19) Situation Report-111.
World Health Organization Weekly epidemiological update -5
Generalized logistic growth modeling of the COVID-19 outbreak in 29 provinces in China and in the rest of the world
Serial interval in determining the estimation of reproduction number of the novel coronavirus disease (COVID-19) during the early outbreak
Quantifying the association between domestic travel and the exportation of novel coronavirus (2019-nCoV) cases from Wuhan, China in 2020: a correlational analysis
Prediction of the COVID-19 spread in African countries and implications for prevention and control: A case study in South Africa

I, Kejia Yan, declare that this paper is submitted in fulfillment of the requirements of Infectious Disease Modelling. This work has not previously been submitted to any journal. To the best of my knowledge and belief, the paper contains no material previously published or written by another person except where due reference is made in the paper itself. We know of no conflicts of interest associated with this paper, and there has been no significant financial support for this work that could have influenced its outcome. As Corresponding Author, I confirm that the manuscript has been read and approved for submission by all named authors. Date: 23/05/2021