key: cord-1015229-sprvzfol authors: Tawiah, Kassim; Iddrisu, Wahab Abdul; Asampana Asosega, Killian title: Zero-Inflated Time Series Modelling of COVID-19 Deaths in Ghana date: 2021-04-30 journal: J Environ Public Health DOI: 10.1155/2021/5543977 sha: cdce901ccd5243f006fb662f2f8d1e283e7fc898 doc_id: 1015229 cord_uid: sprvzfol

Discrete count time series data with an excessive number of zeros have warranted the development of zero-inflated time series models that account for the inflation of zeros and the overdispersion that comes with it. In this paper, we investigated the characteristics of the trend of the daily count of COVID-19 deaths in Ghana using zero-inflated models. We observed that the trend of COVID-19 deaths per day in Ghana shows a general increase from the onset of the pandemic in the country to about day 160, after which there is a general decrease. We fitted a zero-inflated Poisson autoregressive model and a zero-inflated negative binomial autoregressive model to the data in the partial-likelihood framework. The zero-inflated negative binomial autoregressive model outperformed the zero-inflated Poisson autoregressive model. On the other hand, the dynamic zero-inflated Poisson autoregressive model performed better than the dynamic zero-inflated negative binomial autoregressive model. The predicted new deaths based on the zero-inflated negative binomial autoregressive model indicated that Ghana's COVID-19 deaths per day would rise sharply a few days after 30th November 2020 and then fall drastically, just as in the observed data.

Ghana confirmed its first two cases of the novel coronavirus disease on 12th March 2020 at the Noguchi Memorial Institute for Medical Research (NMIMR) [1]. Both cases were imported. Since then, the government, through the Ministry of Health (MoH), the Ghana Health Service (GHS), and other stakeholders, has introduced prudent measures to help curb the spread of the virus [2].
Key among them was the mandatory quarantine and testing of all travellers arriving at the Kotoka International Airport. Social distancing protocols and the compulsory wearing of face/nose masks were implemented. Enhanced contact tracing of infected persons and routine surveillance were also instituted. All borders were subsequently closed to travellers. A partial lockdown of Greater Accra and Greater Kumasi was instituted since they were the hot spots. There was a ban on all social gatherings. This led to the closure of all public and private schools, night clubs, and churches. Funerals, weddings, and festivals followed suit. However, private funerals and weddings with a maximum of 25 people and strict adherence to social distancing protocols were permitted. All public transport operators were mandated to reduce their passenger intake in line with the social distancing protocols. The President of Ghana signed an Executive Instrument (EI) to back the ban on social gatherings after the Parliament of Ghana passed the Imposition of Restrictions Bill on 21st March 2020. There was also the compulsory washing of hands with soap under running water and the use of hand sanitizers. Soap and tissue were placed beside Veronica buckets containing water at vantage points throughout the country to enable people to wash their hands frequently with soap under running water after every transaction and engagement. Hugging and handshaking were discouraged. The Government introduced stimulus packages for all frontline health workers to boost their efforts in the fight against the pandemic. The Government also cushioned the entire population with free water, since water is key to the fight against the spread of SARS-CoV-2. There was a fifty percent electricity subsidy for all consumers, while lifeline consumers of electricity were given a hundred percent subsidy. These relief measures initially covered the period from 1st April 2020 to 30th June 2020 but were extended for another 3 months [3, 4].
The partial lockdown of Greater Accra and Greater Kumasi was lifted after about a month in operation. Subsequently, the social gathering protocols were relaxed to allow up to one hundred individuals. All public and private schools were reopened to final-year students to enable them to write their final-year examinations. Restrictions on internal public and private transport by land, sea, and air were also lifted. Ghana subsequently opened its international airport to foreign travel from 1st September 2020, with strict testing and quarantine rules. A requirement for SARS-CoV-2 test results obtained within 3 weeks prior to arrival at the airport, together with testing on arrival, was put in place. Even though the government is making frantic efforts to curb the spread of the coronavirus among the population, the infection figures continue to soar. This calls for an investigation into existing measures and protocols so as to assess their true impact on curtailing the spread of SARS-CoV-2, the virus that causes COVID-19. Although the infection figures are rising, recovery from COVID-19 in Ghana is very impressive.

Maleki et al. [5] proposed that, for COVID-19 data sets, the error distribution can be taken to be a two-piece scale mixture of normal (TP-SMN) distributions, and designed time series models that perform better than ordinary Gaussian and symmetric models. Three regression models (i.e., linear, logarithmic, and quadratic) were proposed for COVID-19 deaths in Pakistan. Based on the phase reached by COVID-19 deaths and on criteria for assessing goodness of fit, the quadratic model was selected as the best for modelling and predicting death cases in Pakistan [6].
Sperrin and McMillan [4] developed the QCOVID model to predict the risk of COVID-19-related mortality, while the Institute for Health Metrics and Evaluation (IHME) proposed and applied a deterministic susceptible, exposed, infectious, and recovered (SEIR) compartmental framework model for cases in the United States of America [7]. Dwomoh et al. [8] used mathematical models to investigate COVID-19 infection dynamics in Ghana and delivered a brief forecast of the pandemic trajectory in the country using generalized growth models. They investigated the effective basic reproduction number of the virus in real time using different estimation techniques, thereby predicting worst-case scenarios under integrated individual and Government interventions by means of compartmental models. Their results indicated that improved individual-level intervention and intensified media coverage can substantially suppress COVID-19 transmission in Ghana and, as a result, reduce the COVID-19 death rate in the country. However, there seems to be a rise in daily infections amidst increased Government and media coverage, with reduced individual-level intervention. This rise in infections has increased the daily COVID-19-related deaths. With COVID-19 data from Ghana and Egypt, Asamoah et al. [9] applied sensitivity analysis to suggest that increased diagnoses, enhanced contact tracing, and stringent safety protocols in hospitals or isolation centers, with a constant supply of PPEs, will help reduce (or possibly stop) the spread of the virus in the two countries. Bonful et al. [3] audited forty-five public transport stations in the Greater Accra region of Ghana to assess compliance with the World Health Organization (WHO) safety protocols on the prevention of the spread of COVID-19. These included a hand hygiene assessment scale, the availability and use of hand washing facilities, social distancing, and on-going public education on COVID-19 prevention measures.
Their findings revealed inadequate washing places, a lack of public education on practicing personal hygiene, inadequate alcohol-based sanitizers, and improper (or no) face-mask wearing. They concluded that there is a challenge with COVID-19 prevention compliance.

In this paper, we investigated the current trend of COVID-19 mortalities and used it to predict the possible future trajectory of COVID-19 deaths, so as to help policy makers readjust their interventions and strategies. The findings will also help individuals improve their efforts to fight the spread of the virus and so reduce the number of deaths related to the pandemic. They will also push the media to intensify their coverage to create awareness of what the future holds in relation to the human lives likely to be lost to COVID-19. We applied zero-inflated time series models [10-15] to Ghana's COVID-19 death counts per update. The foundation for modelling count data with repeated zeros and overdispersion was provided by Lambert [16], Lee et al. [17], Laird and Ware [18], Min and Agresti [19], Ridout et al. [20], and Yau et al. [21]. Zhu [22] proposed zero-inflated Poisson and negative binomial integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) models, while Yang [14] and Yang et al. [15] proposed zero-inflated Poisson and negative binomial autoregressive models for zero-inflated and overdispersed discrete count time series data. We employed the zero-inflated time series models proposed by Yang [14] and Yang et al. [15], as they were best suited to our data. The best-fitting model obtained was used to predict COVID-19 deaths in Ghana, in order to help the Government and the public health experts managing the pandemic know what to expect in terms of deaths before they occur and so plan ahead. We used the whole of Ghana as the study setting.
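Figure 1 presents the map of the study setting, showing the number of active and cumulative confirmed COVID-19 cases as at 22nd November 2020 in each of the 16 administrative regions of Ghana. The data for this study consist of confirmed COVID-19 deaths per day from 13th March 2020 to 22nd November 2020. The daily COVID-19-related deaths were obtained from Our World in Data, an official website for all COVID-19 data (https://ourworldindata.org/coronavirus-source-data).

The zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) autoregressive models proposed by Yang [14] and Yang et al. [15] were adapted to characterize the trend of COVID-19 deaths in Ghana, which is a discrete count time series with excess zeros. Let $Y_t$ denote the observed number of COVID-19 deaths at time $t$, a discrete count which, given the information $\mathcal{F}_{t-1}$ available up to time $t-1$, is conditionally distributed as ZIP($\lambda_t$, $\omega_t$), where $\lambda_t$ is the intensity parameter of the baseline Poisson distribution and $\omega_t$ is the zero-inflation parameter. The zero-inflated Poisson autoregressive (ZIPA) model has a probability distribution given by

$$f(y_t; \lambda_t, \omega_t) = \omega_t I(y_t = 0) + (1 - \omega_t)\frac{e^{-\lambda_t}\lambda_t^{y_t}}{y_t!}, \quad y_t = 0, 1, 2, \ldots, \tag{1}$$

where $I(\cdot)$ is the indicator function and the intensity parameter $\lambda_t$ and zero-inflation parameter $\omega_t$ are modelled as follows:

$$\log(\lambda_t) = x_{t-1}^{T}\beta, \tag{2}$$

$$\operatorname{logit}(\omega_t) = \log\left(\frac{\omega_t}{1 - \omega_t}\right) = z_{t-1}^{T}\gamma, \tag{3}$$

where $\beta = (\beta_1, \ldots, \beta_p)^T$ and $\gamma = (\gamma_1, \ldots, \gamma_p)^T$ are the regression coefficients for the log-linear part (2) and the logistic part (3), respectively. Vectors representing past explanatory variables, which can incorporate functions of the lagged response series to account for serial correlation, are denoted by $x_{t-1}$ and $z_{t-1}$. The conditional mean and variance of the ZIPA are

$$E(Y_t \mid \mathcal{F}_{t-1}) = \lambda_t(1 - \omega_t), \tag{4}$$

$$\operatorname{Var}(Y_t \mid \mathcal{F}_{t-1}) = \lambda_t(1 - \omega_t)(1 + \lambda_t\omega_t). \tag{5}$$

Kedem and Fokianos [23] formulated the partial likelihood (PL), which for the ZIPA with $\theta = (\beta, \gamma)^T$ and $n$ observations is

$$PL(\theta) = \prod_{t=1}^{n} f(y_t; \theta \mid \mathcal{F}_{t-1}). \tag{6}$$

Substituting (1) into (6) yields

$$PL(\theta) = \prod_{t=1}^{n}\left[\omega_t I(y_t = 0) + (1 - \omega_t)\frac{e^{-\lambda_t}\lambda_t^{y_t}}{y_t!}\right]. \tag{7}$$

Also, when the expressions for $\lambda_t$ in (2) and $\omega_t$ in (3) are, respectively, put into the above expression, we have

$$PL(\theta) = \prod_{t=1}^{n}\left[\frac{\exp(z_{t-1}^{T}\gamma)}{1 + \exp(z_{t-1}^{T}\gamma)} I(y_t = 0) + \frac{1}{1 + \exp(z_{t-1}^{T}\gamma)}\cdot\frac{\exp\{-\exp(x_{t-1}^{T}\beta)\}\exp(y_t x_{t-1}^{T}\beta)}{y_t!}\right]. \tag{8}$$

Yang [14] and Yang et al. [15] developed the Expectation-Maximization (EM) algorithm in the partial likelihood framework to obtain parameter estimates and their standard errors.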
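As a minimal illustration of the ZIPA partial likelihood described above, the following Python sketch evaluates the ZIP probability and the log partial likelihood under a log-linear model for the intensity and a logistic model for the zero-inflation parameter. The function names, the toy counts, the covariate choice (an intercept and the log of the previous count plus one), and the coefficient values are all illustrative assumptions; the study itself was fitted in R, and this sketch does not implement the EM algorithm.

```python
import math


def zip_pmf(y, lam, omega):
    """P(Y = y) for a zero-inflated Poisson with intensity lam and zero-inflation omega."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    point_mass = 1.0 if y == 0 else 0.0
    return omega * point_mass + (1.0 - omega) * poisson


def zipa_log_partial_likelihood(y, X, Z, beta, gamma):
    """Log partial likelihood: sum over t of log f(y_t | past).

    Row t of X and Z holds the lagged covariates for y[t];
    log(lam_t) = x'beta and logit(omega_t) = z'gamma.
    """
    total = 0.0
    for y_t, x_row, z_row in zip(y, X, Z):
        eta = sum(b * x for b, x in zip(beta, x_row))   # log-linear predictor
        xi = sum(g * z for g, z in zip(gamma, z_row))   # logistic predictor
        lam_t = math.exp(eta)
        omega_t = 1.0 / (1.0 + math.exp(-xi))
        total += math.log(zip_pmf(y_t, lam_t, omega_t))
    return total


# Toy series (made-up numbers, NOT the Ghana data).
counts = [0, 0, 2, 0, 1, 0, 0, 3]
previous = [0] + counts[:-1]
# Hypothetical covariates: intercept and log(previous count + 1).
X = [[1.0, math.log(p + 1.0)] for p in previous]
Z = [[1.0] for _ in counts]
# Hypothetical coefficient values, chosen only to exercise the functions.
ll = zipa_log_partial_likelihood(counts, X, Z, beta=[0.1, 0.5], gamma=[0.4])
print(ll)
```

In practice the coefficients would be chosen to maximize this quantity, for example with the EM algorithm of Yang [14] and Yang et al. [15] or with a general-purpose optimizer.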
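Even though the ZIPA may correct for overdispersion [a phenomenon where $\operatorname{Var}(Y_t \mid \mathcal{F}_{t-1}) > E(Y_t \mid \mathcal{F}_{t-1})$, with $\mathcal{F}_{t-1}$ the information available up to time $t-1$] in discrete count time series data with excess zeros, we extended the ZIPA to the zero-inflated negative binomial autoregressive (ZINBA) model, which is well known for handling overdispersed data. For the ZINBA, the probability distribution is given by

$$f(y_t; k_t, \lambda_t, \omega_t) = \omega_t I(y_t = 0) + (1 - \omega_t)\frac{\Gamma(k_t + y_t)}{\Gamma(k_t)\,y_t!}\left(\frac{k_t}{k_t + \lambda_t}\right)^{k_t}\left(\frac{\lambda_t}{k_t + \lambda_t}\right)^{y_t}, \quad y_t = 0, 1, 2, \ldots, \tag{9}$$

with $\lambda_t$ and $\omega_t$ modelled as in (2) and (3), respectively. The dispersion parameter, $k_t$, is modelled through a regression on past explanatory variables (10), where $\alpha = (\alpha_1, \ldots, \alpha_p)^T$ is the vector of regression coefficients and $S_{t-1} = (s_{t-1,1}, \ldots, s_{t-1,p})^T$ is a vector of past explanatory variables. The conditional mean of the ZINBA is the same as that of the ZIPA stated in (4), while the conditional variance is $\lambda_t(1 - \omega_t)(1 + \omega_t\lambda_t + \lambda_t/k_t)$, which reduces to the ZIPA variance in (5) as $k_t \to \infty$. As in the ZIPA, the PL of the ZINBA is the same as in (6). Thus, substituting (9) into (6) gives the PL

$$PL(\theta) = \prod_{t=1}^{n}\left[\omega_t I(y_t = 0) + (1 - \omega_t)\frac{\Gamma(k_t + y_t)}{\Gamma(k_t)\,y_t!}\left(\frac{k_t}{k_t + \lambda_t}\right)^{k_t}\left(\frac{\lambda_t}{k_t + \lambda_t}\right)^{y_t}\right]. \tag{11}$$

When the expressions for $k_t$ in (10), $\lambda_t$ in (2), and $\omega_t$ in (3) are substituted into the PL above, it is expressed entirely in terms of the regression coefficients $\alpha$, $\beta$, and $\gamma$. Parameter estimates and their corresponding standard errors can be obtained by employing the EM algorithm proposed by Yang [14] and Yang et al. [15].

The models were compared based on the Akaike Information Criterion (AIC; [23, 24]), Bayesian Information Criterion (BIC; [23, 25]), and Takeuchi Information Criterion (TIC; [26]). These metrics combine a measure of model fit, typically twice the negative log-partial likelihood, with a penalty for model complexity, expressed as a function of the number of parameters [12]. The AIC and BIC are computed by the expressions

$$\mathrm{AIC} = -2\log PL(\hat{\theta}) + 2k,$$

$$\mathrm{BIC} = -2\log PL(\hat{\theta}) + k\log n,$$

where $k$ is the number of parameters in the model and $n$ is the number of observations. The TIC is calculated by the expression

$$\mathrm{TIC} = -2\log PL(\hat{\theta}) + 2\operatorname{tr}\left(I_n^{-1}J_n\right),$$

where $I_n^{-1}$ is the inverse of the information matrix and $J_n$ is given by

$$J_n = \sum_{t=1}^{n}\left.\frac{\partial \log f(y_t; \theta \mid \mathcal{F}_{t-1})}{\partial\theta}\,\frac{\partial \log f(y_t; \theta \mid \mathcal{F}_{t-1})}{\partial\theta^{T}}\right|_{\theta = \hat{\theta}},$$

with $\theta = (\beta, \gamma)^T$. Model fitting was done in R [27] using the "ZIM" and the "countreg" packages [14, 15, 28].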
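The information criteria described above can be sketched in a few lines of Python. The block below is a minimal illustration under stated assumptions, not the authors' R workflow: it uses the standard negative binomial form with mean `lam` and dispersion `k` for the ZINB probability, holds the parameters constant over time, and evaluates AIC and BIC on a made-up series with arbitrary (not estimated) parameter values. All function and variable names are hypothetical.

```python
import math


def zinb_pmf(y, lam, omega, k):
    """P(Y = y) for a zero-inflated negative binomial (mean lam, dispersion k)."""
    log_nb = (math.lgamma(k + y) - math.lgamma(k) - math.lgamma(y + 1)
              + k * math.log(k / (k + lam))
              + y * math.log(lam / (k + lam)))
    point_mass = 1.0 if y == 0 else 0.0
    return omega * point_mass + (1.0 - omega) * math.exp(log_nb)


def aic(log_pl, n_params):
    """AIC: twice the negative log partial likelihood plus 2 per parameter."""
    return -2.0 * log_pl + 2.0 * n_params


def bic(log_pl, n_params, n_obs):
    """BIC: same fit term, with a log(n) penalty per parameter."""
    return -2.0 * log_pl + n_params * math.log(n_obs)


# Toy series (made-up numbers, NOT the Ghana data).
counts = [0, 0, 2, 0, 1, 0, 0, 3, 0, 0, 7, 0]
# Arbitrary constant parameters, used only to exercise the formulas.
lam, omega, k = 2.0, 0.5, 1.5
log_pl = sum(math.log(zinb_pmf(y, lam, omega, k)) for y in counts)
n_params = 3  # lam, omega, k
print(aic(log_pl, n_params), bic(log_pl, n_params, len(counts)))
```

A smaller value of either criterion indicates the preferred model; BIC penalizes extra parameters more heavily than AIC once the number of observations exceeds about seven, since log(n) then exceeds 2.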
The trend of COVID-19 deaths in Ghana, as illustrated in Figure 2, shows a general increase in the death toll from day zero (the day COVID-19 was first detected in Ghana) to about day 160, followed by a general decrease to day 250 and beyond. The increase in the number of deaths was expected, as the infection rate and the numbers of active and severe cases continued to soar from day zero to day 150. The subsequent decline in the number of deaths could be attributed to the decrease in the number of active and severe cases, even amidst a rise in the infection rate. However, a thorough investigation is needed to ascertain what truly drove the rise and fall in deaths. As indicated by the histogram (Figure 3), zero counts (days with no deaths) make up 69.02% of the entire time series. This is a clear indication of zero inflation in the data. Even though the infection rate continues to rise, the absence of reported deaths on most days ought to be looked into, in order to confirm whether the majority of COVID-19 patients in Ghana have developed resistance to the disease or have responded positively to the care given to them at COVID-19 treatment centers.

To identify the most suitable zero-inflated time series model for characterizing the trend in our data, we fitted the ZIPA and ZINBA models (Table 1). In the log-linear part of the models, the intercept of the ZIPA was significant at the 0.05 significance level, but that of the ZINBA was not. The ZINBA also has a smaller intercept estimate (0.5821) than the ZIPA (1.1317). Neither count.lag nor the trend was significant at the 0.05 level; the trend was not significant in either the ZIPA or the ZINBA model. Interestingly, count.lag had an increasing effect on the log of the expected number of COVID-19 deaths in Ghana in the ZIPA model, while it had a decreasing effect in the ZINBA model.
Also worth noting is that, even though the trend had an increasing effect on the log of the expected number of COVID-19 deaths in both models, the effect is larger in the ZINBA than in the ZIPA model. In the logistic part of our models, the intercept of the ZIPA was significant with a higher estimate, compared to the nonsignificant intercept of the ZINBA with a smaller estimate. Just as in the log-linear part, the trend was not significant in the logistic part of the ZIPA and ZINBA models. In both models, the trend has an increasing effect on the log odds modelled in the logistic part, with the effect in the ZINBA model being higher than that in the ZIPA model. The ZIPA model has higher AIC, BIC, and TIC values than the ZINBA model. For each of these goodness-of-fit criteria, there is a sharp reduction from the value for the ZIPA to that for the ZINBA. This shows that the ZINBA model may have accounted for more of the complexity in the data than the ZIPA model. The test for overdispersion (Table 2) gave a score test statistic of 8.3470 with a P value less than 0.0001. This indicates significant overdispersion in our data, which the ZINBA model accommodates better than the ZIPA model.

Output from the dynamic zero-inflated Poisson autoregressive (DZIPA) and dynamic zero-inflated negative binomial autoregressive (DZINBA) models, with 200 replications, 100 iterations, and a sample size of 200 in each model, is presented in Table 3. The zero-inflation parameter decreased slightly from the DZIPA (0.6148) to the DZINBA (0.6108). This could mean that the DZIPA may have detected more zeros in the data than the DZINBA. The standard deviation was higher in the DZIPA than in the DZINBA. In the log-linear part of the models, the intercept is significant in the DZIPA model but not in the DZINBA model. The trend was not significant in either the DZIPA model or the DZINBA model.
Nevertheless, the trend had a decreasing effect on the log of the expected number of COVID-19 deaths in Ghana, with the DZIPA model showing a greater decreasing effect than the DZINBA model. We can deduce that the dynamic models generally forecast a decrease in the expected number of COVID-19 deaths in Ghana. With respect to the autoregressive part, the AR(1) term was significant in both the DZIPA model and the DZINBA model. Thus, we can deduce that the DZIPA and the DZINBA models are both AR(1). There was a negligible increase in the AIC and BIC values from the DZIPA to the DZINBA. The TIC value, however, registered a large increase from the DZIPA model to the DZINBA model. Consequently, we can assert that the DZIPA has accounted for more of the complexity in the data than the DZINBA. With respect to the AIC, BIC, and TIC values, the DZIPA model outperformed the ZIPA model. The DZINBA, however, outperformed the ZINBA only in terms of the AIC and BIC values, while the opposite is true for the TIC value. Trace plots for the DZIPA and DZINBA models are presented in Figure 4. From the plots, we can see that the partial likelihood increases over the first several iterations and then stabilizes as the estimated parameters become very close to the maximum likelihood estimator [14, 15]. Figure 5 shows a probability integral transform (PIT) histogram [29], which appears to approach uniformity. The horizontal line depicts the count that each of the bins would have if the histogram were perfectly uniform. Hence, the probabilistic calibration of the fitted ZINBA model is adequate. The time series of the observed daily new COVID-19 deaths from 23rd November 2020 to 6th December 2020 and the predicted daily new deaths based on the fitted ZINBA model are shown in Figure 6. The overall trend of the two curves is similar, and the values themselves are very close in some cases.
This is an indication of a good predictive model.

We observed that Ghana's COVID-19 daily death count, from the very first day the pandemic was detected in the country, is inflated with zeros (no deaths). This excessive number of zeros leads to overdispersion. The trend of COVID-19 deaths per day in Ghana is characterized by a general increase from the onset of the pandemic in the country to about day 160, after which there is a general decrease. The continuous decrease in the death toll, amidst a recent rise in daily infections and continued disregard of safety protocols, ought to be investigated. We fitted a zero-inflated Poisson autoregressive model and a zero-inflated negative binomial autoregressive model to the data in the partial-likelihood framework. The zero-inflated negative binomial autoregressive model outperformed the zero-inflated Poisson autoregressive model. We further obtained dynamic versions of the zero-inflated models. The dynamic zero-inflated Poisson autoregressive model, however, performed better than the dynamic zero-inflated negative binomial autoregressive model. Both dynamic models supported an AR(1) structure. The predicted new deaths based on the ZINBA model showed that Ghana's COVID-19 deaths per day would rise sharply a few days after 30th November 2020 and then fall drastically, just as in the observed data.

Data Availability

The data used are made up of daily COVID-19 death counts in Ghana from 13th March 2020 to 22nd November 2020, obtained from Our World in Data, an official website for all COVID-19-related deaths (https://ourworldindata.org/coronavirus-source-data).
[1] Ghana confirms first two COVID-19 cases
[2] COVID-19 updates
[3] Limiting spread of COVID-19 in Ghana: compliance audit of selected transportation stations in the Greater Accra region of Ghana
[4] Prediction models for covid-19 outcomes
[5] Modeling and forecasting the spread and death rate of coronavirus (COVID-19) in the world using time series models
[6] Predictive modeling of COVID-19 death cases in Pakistan
[7] Modeling COVID-19 scenarios for the United States
[8] Mathematical modeling of COVID-19 infection dynamics in Ghana: impact evaluation of integrated government and individual level interventions
[9] A Mathematical Model and Sensitivity Assessment of COVID-19 Outbreak in Ghana and Egypt
[10] Zero-inflated count time series models using Gaussian copula
[11] Regression analysis of zero-inflated time-series counts: application to air pollution related emergency room visit data
[12] On zero-inflated hierarchical Poisson models with application to maternal mortality data
[13] Markov zero-inflated Poisson regression models for a time series of counts with excess zeros
[14] Statistical models for count time series with excess zeros
[15] Markov regression models for count time series with excess zeros: a partial likelihood approach
[16] Zero-inflated Poisson regression, with an application to defects in manufacturing
[17] Analysis of zero-inflated clustered count data: a marginalized model
[18] Random-effects models for longitudinal data
[19] Random effect models for repeated measures of zero-inflated count data
[20] Models for count data with many zeros
[21] Zero-inflated negative binomial mixed regression modeling of over-dispersed count data with extra zeros
[22] Zero-inflated Poisson and negative binomial integer-valued GARCH models
[23] Regression Models for Time Series Analysis
[24] A new look at the statistical model identification
[25] Multimodel inference
[26] Distribution of information statistics and criteria for adequacy of models
[27] R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing
[28] Countreg: tools for count data regression
[29] An Introduction to Discrete-Valued Time Series

The authors declare no conflicts of interest.

KT, IWA, and KAA conceived the idea. KT and IWA proposed the statistical methodology. IWA performed the statistical analysis. KT drafted the manuscript. KAA reviewed the manuscript. All authors agree to be accountable for all aspects of the work and jointly own the work. All authors read and approved the final manuscript.