key: cord-1002549-jipz6m7q authors: Cihan, Pınar title: Forecasting fully vaccinated people against COVID-19 and examining future vaccination rate for herd immunity in the US, Asia, Europe, Africa, South America, and the World date: 2021-07-14 journal: Appl Soft Comput DOI: 10.1016/j.asoc.2021.107708 sha: 00e668d2e93e8b7ac364a2f301ac107530e8afec doc_id: 1002549 cord_uid: jipz6m7q
Coronavirus disease 2019 (COVID-19) has spread rapidly all over the world, and vaccination is known to be the most effective way to eliminate the disease. Although the traditional vaccine development process is long, more than ten COVID-19 vaccines were approved for use within about a year. The vaccines that have been administered are highly effective, but herd immunity must be achieved to end the pandemic. The motivation of this study is to help countries review their vaccination policies and to help vaccine companies adjust their manufacturing plans. In this study, the total number of people fully vaccinated against COVID-19 was forecasted for the US, Asia, Europe, Africa, South America, and the World with the Autoregressive Integrated Moving Average (ARIMA) model, which is a new approach in vaccination studies. Additionally, the percentage of fully vaccinated people in these regions at the beginning of summer 2021 was determined and assessed against herd immunity. ARIMA results show that 139 million, 109 million, 127 million, 8 million, 38 million, and 441 million people will be fully vaccinated on 1 June 2021 in the US, Asia, Europe, Africa, South America, and the World, respectively. According to these results, 41.8% of the US, 2.3% of Asia, 17% of Europe, 0.6% of Africa, 8.8% of South America, and 5.6% of the World population will be fully vaccinated against COVID-19. The results show that these regions are far from the herd immunity threshold needed to safely slow or stop the COVID-19 epidemic.
E-mail address: pkaya@nku.edu.tr.
The coronavirus was first reported as pneumonia of unknown etiology in Wuhan, China, in December 2019. In January 2020, the disease agent was determined to be a new coronavirus (2019-nCoV) not previously detected in humans. Because of its high similarity to SARS coronavirus (SARS-CoV), it was named SARS-CoV-2, and the disease it causes was later named coronavirus disease 2019 (COVID-19) [1]. The virus spread rapidly around the world because there was no treatment or vaccine for COVID-19. In the first three months, 114 countries were affected by the virus and 4291 deaths occurred. As a result of this rapid spread, the World Health Organization (WHO) declared a global pandemic on March 11, 2020 [2]. SARS-CoV-2 can be transmitted by an infected person without symptoms and can lead to a pandemic within a week. This shows how important and necessary vaccination is in controlling SARS-CoV-2 [3]. A lot of effort has been made to develop vaccines against human coronavirus infections such as MERS and SARS. However, the fact that no antiviral agents or vaccines had been developed against the MERS and SARS viruses made COVID-19 a global threat [4] [5] [6].
Developed SARS-CoV-2 vaccines are based on mRNA, viral vector, inactivated virus, and protein technologies. mRNA-type vaccines provide the genetic code that our cells use to produce viral proteins. Once these proteins, which do not cause disease, are produced, the body develops immunity against the virus [7]. The first COVID-19 vaccine, Pfizer-BioNTech, is of the mRNA type. In viral vector-based vaccines, genetic material of the virus is inserted into another virus that does not cause disease and is then administered to humans. The inactivated or weakened virus vaccine is the classical technique; here, the dead or weakened virus is used.
In protein-based vaccines, viral proteins are used that are either isolated directly from the virus or produced artificially. The developed SARS-CoV-2 vaccines and their properties are given in Table 1. Vaccines have proven to be the most effective and economical way to prevent and control infectious diseases [32]. Although COVID-19 vaccines are highly efficacious, herd immunity is required to end the pandemic. Herd immunity can be achieved by vaccination or natural immunization, and vaccination is the fastest way to achieve it. The herd immunity level reached for COVID-19 may vary from country to country because not every individual in the community can be vaccinated (for example, babies, pregnant women, those with medical problems, or those who do not want to be vaccinated). For this reason, reaching the minimum vaccination rate determined for herd immunity is important for eliminating the COVID-19 pandemic. The percentage of people who need to be immune to ensure herd immunity varies with each disease, because it depends on the vaccine, the population, the groups prioritized for vaccination, and other factors [33]. Forecasting the total number of fully vaccinated people by country shows approximately how much of the population will be vaccinated and how close each country is to the herd immunity threshold.
Estimating the future values of an observed time series plays an important role in many science and engineering fields. The main idea of the ARIMA model is to identify a statistical model that describes the observed series and then use it to forecast future values. The Autoregressive Integrated Moving Average (ARIMA) model is frequently used for forecasting because of its simple structure, flexibility, and fast applicability, and because it captures many different patterns [18].
The ARIMA time series model has been used successfully in studies of past epidemics and in other health studies [34] [35] [36] [37] [38]. ARIMA models are also widely and successfully used in current COVID-19 research. Some studies that apply the ARIMA model to the COVID-19 epidemic are listed in Table 2. The literature contains many studies that estimate the number of COVID-19 cases and/or deaths with ARIMA models. Nevertheless, there is a gap in the literature on estimating the trend of COVID-19 vaccinations with the ARIMA time series model.

Table 2. Various studies on the COVID-19 epidemic using ARIMA.
Study                    Forecast variable   Countries/regions
Alzahrani et al. [16]    Case                Saudi Arabia
Dehesh et al. [17]       Case                China, Italy, South Korea, Iran, Thailand
Ceylan [18]              Case                Italy, Spain, France
Anne [19]                Case                India
Kumar et al. [20]        Case and death      Fifteen countries
Tandon et al. [21]       Case                Eight countries and Asia regions
Ilie et al. [22]         Case                Nine countries
Kufel [23]               Case                Selected European countries
Sharma [24]              Case                India
Chaurasia [25]           Death               World
Argawu [26]              Case                Ten countries
Lukman et al. [27]       Case                South Africa, Nigeria, Ghana, Egypt
Perone [28]              Case                Italy, Russia, USA
Maheshwari et al. [29]   Case and death      India
Katoch et al. [30]       Case                India
Kırbaş et al. [31]       Case                Eight countries

The main purpose of this study is to forecast the number of people fully vaccinated against COVID-19 in the US, Asia, Europe, Africa, South America, and the World and to analyze whether it has reached a sufficient rate for herd immunity. The ARIMA model was used to forecast the future number of fully vaccinated people for the selected regions. The forecasts can thus support planning of the number of vaccinations and the vaccine supply that these regions will need to reach herd immunity. The contributions of this paper are as follows:
• Forecasting the number of people fully vaccinated against COVID-19 in the near future with ARIMA, which is a new approach in COVID-19 vaccination studies,
• Identifying the most successful ARIMA models for estimating the number of people fully vaccinated against COVID-19 in the US, Asia, Europe, Africa, South America, and the World,
• Determining the share of the population fully vaccinated against COVID-19 at the beginning of the summer in the US, Asia, Europe, Africa, South America, and the World, in order to assess progress toward herd immunity.
In this study, the numbers of people fully vaccinated against COVID-19 in the US, Asia, Europe, Africa, South America, and the World were used to fit and validate the ARIMA models. The dataset was obtained from Our World in Data [39]. For each region, the data start on the first date on which fully vaccinated people were reported and end on May 22, 2021, so the number of observations varies by region. The date of the first fully vaccinated people and the data size for each region are given in Table 3.
ARIMA, also known as the Box-Jenkins model, is written as ARIMA(p, d, q), where p is the order of autoregression, d is the degree of differencing, and q is the order of the moving average. The model consists of autoregressive (AR) and moving average (MA) parts, and the "I" stands for integration. The three basic models for a stationary time series are represented mathematically as follows [40].
Autoregressive model of order p, or AR(p):

$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \tag{1}$$

Moving-average model of order q, or MA(q):

$$y_t = \delta + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \tag{2}$$

Autoregressive moving average model of orders p and q, or ARMA(p,q):

$$y_t = \delta + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \tag{3}$$

where $\phi_1, \dots, \phi_p$ are the autoregressive parameters, $\theta_1, \dots, \theta_q$ are the moving average parameters, $y_t$ is the actual value at time t, and $\delta$ is the constant. Some texts write the moving average terms with negative signs; the plus-sign convention is used here. The random disturbance term $\varepsilon_t$ is assumed to be white noise, which means that it is independently and identically distributed with mean 0 and a common variance for all terms. The Akaike Information Criterion (AIC), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE) were used to select the most successful ARIMA model for each dataset.
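The three selection criteria named above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard definitions (AIC = -2 ln L + 2m, RMSE as the root of the mean squared error, MAPE as the mean absolute error relative to the actual value, in percent). The function names and the toy numbers are hypothetical and are not taken from the study's data or software.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike Information Criterion from a model's log-likelihood ln(L)
    and its total number of estimated parameters m."""
    return -2.0 * log_likelihood + 2.0 * num_params


def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and actual values."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def mape(predicted, actual):
    """Mean Absolute Percentage Error, in percent.
    Assumes no actual value is zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for p, a in zip(predicted, actual))


# Hypothetical toy values, for illustration only.
actual = [100.0, 110.0, 120.0]
predicted = [98.0, 112.0, 120.0]

print("RMSE:", rmse(predicted, actual))
print("MAPE (%):", mape(predicted, actual))
print("AIC:", aic(log_likelihood=-10.0, num_params=3))
```

When several candidate models are fitted to the same series, the one with the lowest values of these criteria would be preferred.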
The AIC measures how well the model fits the observed series while penalizing the number of parameters. The most suitable model for the dataset is the one with the smallest AIC [41]. The three criteria are calculated with Eqs. (4)-(6), respectively.

$$\mathrm{AIC} = -2\ln(L) + 2m \tag{4}$$

where L is the likelihood function and m is the total number of parameters in the model.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(p_i - a_i\right)^2} \tag{5}$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{a_i - p_i}{a_i}\right| \tag{6}$$

where $p_i$ is the predicted value, $a_i$ is the actual value, and n is the number of observations. Lower values of all three criteria are desirable; an error of zero indicates a statistically perfect fit.
Vaccination started on different dates in different countries. By May 22, 2021, 129.01 million people in the US, 96.56 million in Asia, 105.92 million in Europe, 33.95 million in South America, 6.84 million in Africa, and 388.60 million in the World had been fully vaccinated against COVID-19 (Fig. 1).
ARIMA modeling consists of four consecutive steps: model identification, parameter estimation, diagnostic checking, and forecasting. In the first step, it is determined whether the time series is stationary or non-stationary; a non-stationary series is converted to a stationary one. Time-series graphs are given in Fig. 2 to examine the stationarity of the dataset. The original series are shown in Fig. 2A. There is an upward trend in all regions examined, that is, the original series are non-stationary. Therefore, first-order differencing was applied to the original data to stabilize the mean of the number of people fully vaccinated against COVID-19. The first-order differenced series are shown in Fig. 2B. The first-differenced series still show trends, so second-order differencing was applied. As seen in Fig. 2C, all series became stationary after second-order differencing. The Augmented Dickey-Fuller (ADF) test was applied to confirm the stationarity of the time series, and the results are given in Table 4.
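The differencing step described above can be illustrated with a short Python sketch. It assumes a hypothetical cumulative series with quadratic growth (not the study's data) and shows that one difference leaves a linear trend while a second difference removes it. The helper name `difference` is illustrative; a formal stationarity check such as the ADF test would normally be run with a statistics package and is not reproduced here.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: y'_t = y_t - y_{t-1}.
    Each pass shortens the series by one observation."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


# Hypothetical cumulative counts with a quadratic trend, for illustration only.
cumulative = [3 * t * t + 2 * t + 5 for t in range(10)]

first = difference(cumulative, order=1)   # still trending (linear in t)
second = difference(cumulative, order=2)  # constant, so the trend is removed

print("original:", cumulative)
print("first difference:", first)
print("second difference:", second)
```

A series that needs two differences to lose its trend corresponds to d = 2 in ARIMA(p, d, q).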
The ADF test results also show that the time series become stationary after the second difference is taken, so the parameter d is 2 in the ARIMA models. Test results for the candidate ARIMA models are given in Table 5. The ARIMA models with the minimum AIC, RMSE, and MAPE were chosen as the best models. Accordingly, the ARIMA(5,2,2), ARIMA(1,2,3), ARIMA(5,2,0), ARIMA(2,2,1), ARIMA(1,2,1), and ARIMA(5,2,1) models were selected for the US, Asia, Europe, Africa, South America, and the World, respectively.
The ACF and PACF graphs of the residuals of the best-fitting ARIMA models are presented in Fig. 3. The ACF shows whether earlier values in the series are related to later values. The PACF shows the correlation between the series and a lag of itself after the effect of the shorter lags has been removed. The ACF and PACF graphs show that the residuals generally do not exceed the significance bounds. The Box-Ljung test was used to check whether the residuals are white noise. The null hypothesis ($H_0$) of the Box-Ljung test is that the residuals are independently distributed, so it is desirable not to reject it. A large p-value indicates that the residuals have no remaining autocorrelation, i.e., they resemble white noise. Box-Ljung test results for the models are given in Table 6. Fig. 3 shows that the residuals are white noise, since all the autocorrelation and partial autocorrelation coefficients are small and lie within two standard errors (the 5% significance bounds). The Ljung-Box statistics also show that the correlations are not significant, and hence the residuals are white noise.
With these best models, the number of people fully vaccinated against COVID-19 was forecasted with 80% and 95% confidence intervals (CI) (Fig. 4). The forecasts of the number of people fully vaccinated against COVID-19 for the next ten days are given in Table 7.
The main purpose of this study is to determine the full vaccination rate in the US, Asia, Europe, Africa, South America, and the World on June 1, 2021. For this purpose, the number of fully vaccinated people in these regions was forecasted using the ARIMA time series model.
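The Box-Ljung residual check mentioned above can be sketched as follows. This is a minimal illustration that assumes the standard Ljung-Box statistic, Q = n(n + 2) Σ r_k² / (n − k) summed over lags k = 1..h, where r_k is the lag-k sample autocorrelation. The function names and the toy residuals are hypothetical; the p-value step, which compares Q with a chi-square distribution, is left to a statistics package.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag.
    Assumes x is not constant (otherwise the denominator is zero)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Hypothetical residuals that alternate in sign, i.e. strongly autocorrelated.
alternating = [1.0, -1.0] * 10

q = ljung_box_q(alternating, max_lag=1)
# The 5% critical value of chi-square with 1 degree of freedom is about 3.841.
# A Q above it means independence of the residuals is rejected.
print("Q:", q, "reject independence:", q > 3.841)
```

For residuals of a fitted ARIMA model, the degrees of freedom of the reference chi-square distribution are commonly reduced by the number of estimated AR and MA parameters.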
These forecasts make it possible to observe how far the regions are from the threshold level required to achieve herd immunity. With the selected ARIMA models, the number of people fully vaccinated against COVID-19 on June 1, 2021 was estimated. The populations, vaccination information, and forecasted numbers of fully vaccinated people for the regions examined in this study are given in Table 8. Almost all countries aim to vaccinate at least 50% of their population by the beginning of the summer. This seems unlikely based on the totals estimated by the ARIMA models. The ratio of the number of people fully vaccinated against COVID-19 to the population of each region on June 1, 2021 is shown in Fig. 5.
According to the ARIMA estimates, about 139 million, 109 million, 127 million, 8 million, 38 million, and 441 million people will be fully vaccinated in the US, Asia, Europe, Africa, South America, and the World, respectively, on June 1, 2021. The 2021 populations of these regions are 333 million, 4.68 billion, 748 million, 1.37 billion, 434 million, and 7.87 billion, respectively. It is therefore predicted that on June 1, 2021, 41.8% of the people in the US, 2.3% in Asia, 17% in Europe, 0.6% in Africa, 8.8% in South America, and 5.6% in the World will be fully vaccinated. According to these findings, the US reaches the highest fully vaccinated rate on June 1, 2021. Nevertheless, none of the examined regions will reach the herd immunity threshold by June 1, 2021.
At a time when the numbers of COVID-19 cases and deaths are again at a peak, the health systems of many countries have collapsed and curfews have been imposed again. In addition to the tragic deaths, these restrictions harm countries' economies.
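The fully vaccinated shares reported above follow from dividing each forecast by the corresponding population. The short Python sketch below repeats that arithmetic with the rounded figures quoted in the text (in millions). Because the inputs are rounded, the results can differ from the reported percentages by up to about 0.1 percentage point. The variable and function names are illustrative.

```python
# Forecasted fully vaccinated people on June 1, 2021, in millions (rounded, as quoted).
forecast_millions = {
    "US": 139, "Asia": 109, "Europe": 127,
    "Africa": 8, "South America": 38, "World": 441,
}

# 2021 populations in millions (rounded, as quoted).
population_millions = {
    "US": 333, "Asia": 4680, "Europe": 748,
    "Africa": 1370, "South America": 434, "World": 7870,
}


def vaccinated_share(forecast, population):
    """Percentage of each region's population that is fully vaccinated."""
    return {region: 100.0 * forecast[region] / population[region]
            for region in forecast}


shares = vaccinated_share(forecast_millions, population_millions)
for region, pct in shares.items():
    print(f"{region}: {pct:.1f}%")
```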
To prevent further deaths and economic damage, vaccination rates should quickly reach the threshold level required for herd immunity. The vaccination rates in Fig. 5 suggest that high-income countries should provide vaccines and healthcare support to low- and middle-income countries. In addition, vaccine companies need to plan to increase vaccine production rates, and countries need to procure more vaccines and administer them quickly. Nevertheless, it has been reported that some mutant viruses can escape vaccine-induced immunity and are antibody-resistant [42]. The rise of new variants may require monitoring of immune escape variants, may force the production of second-generation vaccines that address globally circulating variants, and may mean that already vaccinated people need to be vaccinated again.
The SARS-CoV-2 virus, which causes COVID-19, is highly contagious. Countries quickly started vaccine development after WHO declared the coronavirus a global emergency. To date, more than ten COVID-19 vaccines have been approved for use. Although the efficacy of these vaccines is quite high, herd immunity is required to safely slow or stop the COVID-19 epidemic, and vaccination is the fastest way to achieve it. In this study, the total number of people fully vaccinated against COVID-19 in the US, Asia, Europe, Africa, South America, and the World over the next ten days was estimated using the ARIMA time series method. ARIMA models were formulated with different parameters, and the most successful models were selected using the AIC, RMSE, and MAPE criteria. The models with the lowest values were selected to estimate the number of people who will be fully vaccinated in the future. ARIMA(5,2,2), ARIMA(1,2,3), ARIMA(5,2,0), ARIMA(2,2,1), ARIMA(1,2,1), and ARIMA(5,2,1) models were selected for the US, Asia, Europe, Africa, South America, and the World, respectively.
The selected ARIMA models fitted the data reasonably well, with MAPE values of 1.27 for the US, 4.80 for Asia, 4.24 for Europe, 11.32 for Africa, 0.90 for South America, and 1.39 for the World. According to the results of this study, 139 million, 109 million, 127 million, 8 million, 38 million, and 441 million people will be fully vaccinated on June 1, 2021 in the US, Asia, Europe, Africa, South America, and the World, respectively. This means that at the beginning of the summer, 41.8%, 2.3%, 17%, 0.6%, 8.8%, and 5.6% of people will be fully vaccinated in the US, Asia, Europe, Africa, South America, and the World, respectively. All regions other than the US are quite far from the herd immunity threshold level. The future goal of our study is to forecast the number of fully vaccinated people with deep learning time series models, so that the performance of the ARIMA model can be compared with deep learning methods.
Pınar Cihan: Writing - original draft, Methodology, Software, Validation, Visualization, Writing - review and editing.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
[1] Situation Report
[2] Coronavirus diseases (COVID-19) current status and future perspectives: a narrative review
[3] The SARS-CoV-2 vaccine pipeline: an overview
[4] Drug treatment options for the 2019-new coronavirus (2019-nCoV)
[5] Recent discovery and development of inhibitors targeting coronaviruses
[6] Virology, epidemiology, pathogenesis, and control of COVID-19
[7] mRNA-based vaccines
[8] Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine
[9] Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: An interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK
[10] COVID-19 vaccine efficacy and effectiveness-the elephant (not) in the room
[11] CanSinoBIO's COVID-19 vaccine 65.7% effective in global trials, Pakistan official says
[12] The COVID-19 vaccines: Recent development, challenges and prospects
[13] Dosing debates, transparency issues roil vaccine rollouts
[14] Safety and immunogenicity of two RNA-based Covid-19 vaccine candidates
[15] Target product profile analysis of COVID-19 vaccines in phase iii clinical trials and beyond: an early 2021 perspective
[16] Forecasting the spread of the COVID-19 pandemic in Saudi Arabia using ARIMA prediction model under current public health interventions
[17] Forecasting of covid-19 confirmed cases in different countries with arima models
[18] Estimation of COVID-19 prevalence in Italy, Spain, and France
[19] ARIMA modelling of predicting COVID-19 infections
[20] Forecasting the dynamics of COVID-19 pandemic in top 15 countries in 2020: ARIMA model with machine learning approach
[21] COVID-19): ARIMA based time-series analysis to forecast near future
[22] Forecasting the spreading of COVID-19 across nine countries from Europe, Asia, and the American continents using the ARIMA models
[23] ARIMA-based forecasting of the dynamics of confirmed Covid-19 cases for selected European countries, equilibrium
[24] Modeling and forecasting of Covid-19 growth curve in India
[25] COVID-19 pandemic: ARIMA and regression model-based worldwide death cases predictions
[26] Modeling and forecasting of COVID-19 new cases in the top 10 infected African countries using regression and time series models
[27] COVID-19 prevalence estimation: Four most affected African countries
[28] ARIMA forecasting of COVID-19 incidence in Italy, Russia, and the USA
[29] Forecasting epidemic spread of COVID-19 in India using arima model and effectiveness of lockdown
[30] An application of ARIMA model to forecast the dynamics of COVID-19 epidemic in India, Global
[31] Comparative analysis and forecasting of COVID-19 cases in various European countries with ARIMA, NARNN and LSTM approaches
[32] The economic value of vaccination: why prevention is wealth
[33] Herd immunity and implications for SARS-CoV-2 control
[34] Relationship of meteorological factors and human brucellosis in hebei province
[35] Using autoregressive integrated moving average (ARIMA) models to predict and monitor the number of beds occupied during a SARS outbreak in a tertiary hospital in Singapore
[36] Epidemiology and ARIMA model of positive-rate of influenza viruses among children in wuhan, China: A nine-year retrospective study
[37] Modelling malaria incidence with environmental dependency in a locality of Sudanese savannah area, Mali
[38] Forecasting model for the incidence of hepatitis A based on artificial neural network
[39] Coronavirus Source Data
[40] Introduction to Time Series and Forecasting
[41] A study of time series models ARIMA and ETS
[42] Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity
No funding to declare.