key: cord-0682946-rcjiknm6 authors: Chattopadhyay, Subhankar; Maiti, Raju; Das, Samarjit; Biswas, Atanu title: Change‐point analysis through INAR process with application to some COVID‐19 data date: 2021-06-02 journal: Stat Neerl DOI: 10.1111/stan.12251 sha: a0e3d9b9e03932b22ffc4eb70d3b88d8aed5031f doc_id: 682946 cord_uid: rcjiknm6 In this article, we consider the problem of change-point analysis for count time series data through an integer-valued autoregressive process of order 1 (INAR(1)) with time-varying covariates. We observe such features in many real-life scenarios, especially in the COVID-19 data sets, where the number of active cases over time first falls and then rises again. To capture these features, we use a Poisson INAR(1) process with a time-varying smoothing covariate. Using such a model, we can model both components of the active cases at time-point t, namely (i) the number of non-recovered cases carried over from the previous time-point, and (ii) the number of new cases at time-point t. We study some theoretical properties of the proposed model along with forecasting. Simulation studies are performed to assess the effectiveness of the proposed method. Finally, we analyze two COVID-19 data sets and compare our proposed model with another PINAR(1) process that has a time-varying covariate but no change-point, to demonstrate the overall performance of our proposed model. Time series of count data have been widely studied during the last three decades or so due to their increasing relevance to various fields of science. There are several ways to model count time series data. For example, McKenzie (1985, 1986) and Al-Osh and Alzaid (1987) introduced a class of stationary integer-valued autoregressive (INAR) time series processes based on the binomial thinning operator. This process was further studied and generalized by Alzaid and Al-Osh (1990); Jin-Guan and Yuan (1991); Freeland and McCabe (2004); Ristić et al. (2009); Jazi et al.
(2012); Schweer and Weiß (2014); Maiti et al. (2015) and many more. In particular, McKenzie (1986) introduced integer-valued AR(1) or INAR(1) models with geometric and negative binomial marginals for over-dispersed data. McKenzie (1985) and Al-Osh and Alzaid (1987) developed an INAR(1) process with Poisson marginals, well known as the PINAR(1) process, which is very popular due to its simple form. The INAR(1) process was further extended to a more general INAR(p) process by Alzaid and Al-Osh (1990) and Jin-Guan and Yuan (1991). Ristić et al. (2009) and Schweer and Weiß (2014) proposed a new INAR(1) process based on the negative binomial thinning operator, which can also handle the over-dispersion problem. Jazi et al. (2012) and Maiti et al. (2015) studied zero-inflated PINAR(1) (ZIPINAR(1)) processes for zero-inflated count data. Apart from these thinning-based INAR processes, Cameron and Trivedi (1986) and Fokianos (2011) studied regression-based time series models for count time series data. In this paper, we employ the INAR process to model the data of COVID-19 active cases, which is an example of count time series data. In an INAR process there are two components at time-point t, namely (i) non-recovered cases carried over from the previous time-point (the survival part), and (ii) new cases entering the process at time-point t (the innovation terms). These INAR processes are mainly stationary since the innovation terms involve no time-varying covariate, i.e., the new cases entering the process are not time-dependent. But in real-life scenarios like the COVID-19 data sets, the rapid change in the number of infected cases makes the innovation terms time-dependent. Besides this time-varying nature of the innovation terms, we also notice some change-points in these data sets.
In the current scenario of the COVID-19 pandemic, we are seeing mainly two types of curves for daily new cases reported in different parts of the world: (i) the curve that, at first, increased exponentially but started decreasing after major steps such as nationwide lockdowns, 'social distancing' measures, large-scale testing, etc., taken by the respective authorities in different countries; and (ii) the curve that came down and then started to rise again as the respective authorities began to ease those measures in some parts of the world. The curves of daily active cases are changing in the same way in those parts of the world. Hence we can spot one change-point (upwards to downwards) for the curve described in Case (i) and two change-points (upwards to downwards and then downwards to upwards) for the curve in Case (ii). In this article, we develop a PINAR process based on the binomial thinning operator for count time series data like the COVID-19 data, where we model the innovation terms through time-varying covariates and a smoothing change-point function without changing the survival part. The PINAR process, introduced by McKenzie (1985) and Al-Osh and Alzaid (1987), is very popular due to its simple form and has wide application in modeling count time series data. But this PINAR process based on the binomial thinning operator is not capable of handling count time series data that have both change-points and time-varying innovation terms. Hence we introduce a new PINAR model that can tackle both of these features found in the COVID-19 data sets. To incorporate the change-points in our proposed PINAR model, the innovation terms are modeled with a smoothed version (see Smooth Maximum) of a time-varying covariate that contains the change-points.
The idea of capturing the change-points in the innovation terms through a time-varying smoothing covariate is inspired by Chan and Tong (1986); Hansen (2000) and Fong et al. (2017), whose works are mainly based on continuous data. We use this smoothed version of the time-varying components in our proposed model to catch the changing curvatures in the data of daily active cases. The effectiveness of the proposed model for both the one-change-point and two-change-point studies is assessed later by a simulation study and the analysis of two COVID-19 data sets. We compare our proposed model with another PINAR model that has a time-varying covariate but no change-point, to illustrate the overall performance of the proposed model. The rest of the article is organized as follows. Section 2 discusses two real COVID-19 data sets. Section 3 describes our proposed model along with a brief illustration of the INAR(1) process. We provide the distributional forms of our proposed model and the h-step ahead forecasting distribution in Sections 4 and 5, respectively. In Section 6, we discuss the estimation method for our proposed model. Extensive simulation studies for our proposed model are provided in Section 7. In Section 8, we analyze the COVID-19 data sets. Finally, some conclusions are drawn in Section 9. All the proofs of the theoretical results are provided in the Appendix (Section 10). The world is now facing its biggest global health crisis in recent times in the form of the COVID-19 pandemic. The outbreak was first identified in Wuhan, China, in early December 2019. The World Health Organization declared the outbreak a Public Health Emergency of International Concern on January 30 and a pandemic on March 11.
To restrict the spread of this virus in its early stages, strong measures were implemented in different parts of the world by the respective authorities, such as nationwide lockdowns, rapid testing, strict 'social distancing', and the use of masks and sanitizers in public places. Hence in certain parts of the world the COVID-19 situation improved, and the lockdown was eased in those parts. During that period, some Gulf evacuations also took place in different countries, especially in India. As a result, 'community transmission' started in those parts of the world due to the highly infectious nature of this virus, and the number of infected cases began to pile up again. For further discussion in this regard, we explore two real COVID-19 data sets in Sections 2.1 and 2.2. This data set is an example of Case (i) described in Section 1. We see only one change-point in the data of active cases of Italy, and hence the study will be based on one-change-point analysis. The data (see Worldometer) is collected from February 15 to June 6 (a total of 113 days). Though the first case in this country was detected back in January 2020, the cases started to increase rapidly from the beginning of March. After continuous measures taken by the authorities, the curve of active cases started to come down. As of June 6, 2020, the total number of confirmed cases was more than 234k, the number of deaths was more than 33.8k, and the number of active cases was more than 35,000. Figure 1 displays the data of new daily cases, the data of daily active cases, and the autocorrelation function (ACF) and partial ACF (PACF) plots of daily active cases. From the ACF and PACF plots, it seems that the data is well fitted by an AR(1) process. This article is protected by copyright. All rights reserved. This data set is an example of Case (ii).
Here we observe two change-points in the data of daily active cases, and hence the study will be based on two-change-point analysis. In Kerala, the first case was also detected back in January 2020, but the cases started to pile up from mid-March. Due to strict measures taken by the state government of Kerala, the curve of active cases came down, but from mid-May the cases started to rise again when the Gulf evacuees began to come into the state. The data for Kerala (see GoK Dashboard) is collected from March 9 to June 6 (a total of 90 days). More than 1800 cases and a total of 15 deaths were reported in Kerala as of June 6, 2020, and the number of active cases was more than 1000. Figure 2 displays the data of new daily cases, the data of daily active cases, and the ACF and PACF plots of daily active cases. It seems that the data is well fitted by an AR(1) process. In this section, we develop a new model based on the integer-valued AR(1) process to capture the change-points in count time series data sets like the COVID-19 data sets of Italy and Kerala. Here we use the INAR(1) process (proposed by McKenzie (1985) and Al-Osh and Alzaid (1987)) based on the binomial thinning operator (introduced by Steutel and van Harn (1979)) to develop our proposed model for change-point analysis, which is given by

Y_t = α ∘ Y_{t-1} + ε_t, (3.1)

where "∘" is the binomial thinning operator, Y_t denotes the number of daily active cases at time-point t, and ε_t represents the daily new cases reported at time-point t. We assume that ε_t follows Poisson(λ_t), where λ_t is assumed to have the form

λ_t = exp( β_0 + β_1 Z_t + β_2 t ), with Z_t = (t − t_ch) exp(δ_n(t − t_ch)) / (1 + exp(δ_n(t − t_ch))), (3.2)

where the tuning parameter δ_n (> 0) helps to capture the changing curvature of the data and t_ch denotes the change-point in the data. The change-point t_ch can be easily estimated from the data. The above model is defined only for one change-point.
Here β_1 = (α, β_0, β_1, β_2) are the regression parameters, and β_1 is the regression coefficient associated with the time-varying covariate containing the change-point. However, we can easily extend the model to more than one change-point. For two change-points, only the form of λ_t changes, and the functional form is given by

λ_t = exp( β_0 + β_1 Z_1t + β_2 Z_2t + β_3 t ), (3.3)

where Z_1t = (t − t_ch1) exp(δ_n(t − t_ch1)) / (1 + exp(δ_n(t − t_ch1))), Z_2t = (t − t_ch2) exp(δ_n(t − t_ch2)) / (1 + exp(δ_n(t − t_ch2))), and t_ch1 and t_ch2 are the two change-points in the data. Here β_2 = (α, β_0, β_1, β_2, β_3) are the regression parameters, and β_1 and β_2 are the regression coefficients associated with the time-varying covariates containing the change-points. In the subsequent section, we provide a general idea of our proposed model. Remark. The use of one tuning parameter for data with one change-point could be widened for more than one change-point, e.g., using two different tuning parameters δ_1n and δ_2n for data with two change-points. But in our proposed process for two change-points, we use only δ_n instead of δ_1n and δ_2n, mainly because a single δ_n reduces the computational difficulty and simplifies the form of the proposed model. The idea behind the form of λ_t, discussed in equations (3.2) and (3.3), comes from the threshold regression model setup (see Chan and Tong (1986); Hansen (2000) and Fong et al. (2017)). From the concept of the segmented model in the threshold regression setup, we can write the form of log(λ_t) for one change-point as log(λ_t) = β_0 + β_1 (t − t_ch)_+ + β_2 t, where (t − t_ch)_+ = (t − t_ch) for t > t_ch and (t − t_ch)_+ = 0 for t ≤ t_ch. In this segmented form of log(λ_t), we see a sharp change (upwards to downwards or downwards to upwards) in the curve of daily new cases, and hence in the curve of daily active cases. But in real-life scenarios like the COVID-19 data, we do not see sharp changes; most of the time we notice changing curvature(s) in these data sets.
So we try to capture those changing curvature(s) in the data of daily active cases by modeling the data of daily new cases (the innovation terms) in the proposed model through time-varying covariates and smoothing change-point functions. Moreover, the function (t − t_ch)_+ is not differentiable at t_ch. So we replace (t − t_ch)_+ (for δ_n > 0) by a smooth, differentiable maximum function (see Smooth Maximum), given by

g(t − t_ch) = (t − t_ch) exp(δ_n(t − t_ch)) / (1 + exp(δ_n(t − t_ch))).

Hence the functional form of λ_t for one change-point (equation (3.2)) is λ_t = exp( β_0 + β_1 g(t − t_ch) + β_2 t ). In a similar way, the functional form of λ_t for two change-points (equation (3.3)) is λ_t = exp( β_0 + β_1 g(t − t_ch1) + β_2 g(t − t_ch2) + β_3 t ). The changing behaviour of these data sets depends on some conditions on the β_i's. We provide those conditions through the segmented-model form of the threshold regression setup for both sets of β_i's ((β_0, β_1, β_2) in equation (3.2), and (β_0, β_1, β_2, β_3) in equation (3.3)), which enable our proposed model to capture the change-point(s). The required conditions for both the one-change-point and two-change-point studies are given below. (i) In the segmented form of log(λ_t) for one-change-point analysis, we model log(λ_t) as log(λ_t) = β_0 + β_2 t for t ≤ t_ch and β_0 + β_1 (t − t_ch) + β_2 t for t > t_ch. So the derivatives of log(λ_t) are β_2 for t ≤ t_ch and (β_1 + β_2) for t > t_ch in this segmented model setup for one change-point. So for β_2 > 0 and (β_1 + β_2) < 0, log(λ_t) increases when t ≤ t_ch and decreases when t > t_ch, i.e., λ_t increases when t ≤ t_ch and decreases when t > t_ch. Hence the change-point is the t_ch-th time-point in the data of daily new cases. So for count time series data with one change-point, the condition {β_2 > 0, (β_1 + β_2) < 0} must hold. (ii) Similarly, for the study of two change-points, we model log(λ_t) as log(λ_t) = β_0 + β_3 t for t ≤ t_ch1, β_0 + β_1 (t − t_ch1) + β_3 t for t_ch1 < t ≤ t_ch2, and β_0 + β_1 (t − t_ch1) + β_2 (t − t_ch2) + β_3 t for t > t_ch2.
Hence the derivatives of log(λ_t) are β_3 for t ≤ t_ch1, (β_1 + β_3) for t_ch1 < t ≤ t_ch2, and (β_1 + β_2 + β_3) for t > t_ch2 in the segmented model for two change-points. So for β_3 > 0, (β_1 + β_3) < 0 and (β_1 + β_2 + β_3) > 0, log(λ_t) increases when t ≤ t_ch1, decreases when t_ch1 < t ≤ t_ch2, and increases again when t > t_ch2, i.e., λ_t increases when t ≤ t_ch1, decreases when t_ch1 < t ≤ t_ch2, and increases again when t > t_ch2. Here the two change-points are the t_ch1-th and t_ch2-th time-points in the data of daily new cases. So the condition {β_3 > 0, (β_1 + β_3) < 0, (β_1 + β_2 + β_3) > 0} must hold for count time series data containing two change-points. The tuning parameter of our proposed model, δ_n (> 0), helps to capture the changing curvature(s) in the data. To compute the optimal value of δ_n from the data, we consider a grid search method (see Chakraborty et al. (2013), James et al. (2013)). In this method, we use a goodness-of-fit measure based on which the optimal value of δ_n is calculated. The idea of δ_n comes from the concept of the Smooth Maximum, so as the value of δ_n increases, the changing curvature becomes sharper. We show this property in Figures 3 and 4, where we can clearly see that as the value of δ_n shifts from 0.05 to 1, the changing curvatures become sharper for both the one-change-point and two-change-point studies. We also add the non-smoothed version (no use of δ_n) of the generated data, i.e., the segmented data. To estimate the change-point(s), we take the difference between every two consecutive observations (i.e., ∆_t = Y_t − Y_{t−1}) and consider the signs of those differences, denoted by S_t = sign(∆_t), where S_t = + if ∆_t > 0 and − otherwise. For a data set with one change-point, the sequence {S_t} should give us two runs: (1) a run of +, and (2) a run of − (see Wald and Wolfowitz (1940)).
Depending on whether the curve of Y_t is increasing or decreasing, the run of + and the run of − will be interchanged. For example, if the original time series plot of Y_t is bell-shaped (i.e., initially the observations are increasing and then, after a certain time-point (say, t_ch), the observations are decreasing), we will have a run of + first and then, after the time-point t_ch, a run of −. The time-point at which the first run of + ends gives us an estimate of the original change-point t_ch. However, in real scenarios, time series data with one change-point may not be smooth, and often there are random fluctuations present in the data. As a result, there may be many small runs of + and −, which make it difficult for the above estimation procedure to locate the true change-point. Hence we employ a pre-smoothing approach before implementing the above run-based point estimation. That is, instead of working with the actual time series data, we smooth the data using some standard statistical approaches such as an m-point moving average or a p-th degree polynomial function. For time series data with two change-points (say, t_ch1 and t_ch2), the sequence {S_t} should produce three runs: (1) a run of +, (2) a run of −, and (3) again a run of +. Here the run of + and the run of − are interchanged twice, i.e., a run of + for the increasing curvature, then a run of − for the decreasing curvature, and another run of + when cases again begin to rise (another increasing curvature). The time-point at which the first run ends gives us an estimate of the first change-point t_ch1, and the time-point at which the second run ends provides an estimate of the second change-point t_ch2. However, as in the case of one change-point, here also the time series data sets are non-smooth, and hence pre-smoothing approaches such as an m-point moving average or a p-th degree polynomial function are required.
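The run-based estimation with pre-smoothing described above can be sketched in code. This is a minimal illustration, not the authors' implementation: the helper name `estimate_change_points` and the default 7-point moving average are our own choices.

```python
import numpy as np

def estimate_change_points(y, window=7, n_cp=1):
    """Estimate change-point(s) from the sign runs of the first differences,
    after an m-point moving-average pre-smoothing (m = window)."""
    y = np.asarray(y, dtype=float)
    kernel = np.ones(window) / window
    y_smooth = np.convolve(y, kernel, mode="valid")   # m-point moving average
    signs = np.sign(np.diff(y_smooth))                # S_t = sign(Delta_t)
    # boundaries where a run of + meets a run of - (or vice versa)
    flips = np.flatnonzero(signs[:-1] * signs[1:] < 0) + 1
    # shift back by half the smoothing window; keep the first n_cp boundaries
    return [int(f + window // 2) for f in flips[:n_cp]]
```

For a bell-shaped series the single returned index approximates t_ch; for a series that falls and then rises again, calling with `n_cp=2` returns estimates of t_ch1 and t_ch2.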
Later, in Section 7.2, we perform a simulation study where we estimate the true change-point(s) and provide 95% confidence interval(s) based on the normal approximation, and we study the large-sample properties by varying the sample size. In this section, we study the conditional and the marginal distributions of the proposed model. Under our proposed setup, the conditional distribution of Y_t given Y_{t−1} and X_t (the set of all covariates up to time-point t, including the smooth time-varying and simple time-varying covariates up to time-point t) can be derived as

P(Y_t = j | Y_{t−1} = i, X_t) = Σ_{k=0}^{j} C(i, k) α^k (1 − α)^{i−k} · e^{−λ_t} λ_t^{j−k} / (j − k)! · I(k ≤ i), (4.1)

where I(·) is the indicator function and C(i, k) is the binomial coefficient. This is the probability of going from state i to state j in a single step. The conditional mean and variance are given as

E(Y_t | Y_{t−1}, X_t) = αY_{t−1} + λ_t and V(Y_t | Y_{t−1}, X_t) = α(1 − α)Y_{t−1} + λ_t.

Since the marginal distribution of Y_t is difficult to obtain, we find the partial marginal distribution of Y_t given X_t for t > 1; henceforth it is called the marginal distribution. Here we derive the probability generating function (PGF) of Y_n given X_n. The derivation is valid for t > 1, and hence we assume that given X_1, the marginal distribution of Y_1 is Poisson(λ_1). The reason behind this assumption is as follows. We know that the elements entering the system in the interval (t−1, t] are the innovation term at time-point t (ε_t). Now for t = 1, the interval is (0, 1], and there is no previous existing interval in the system. So in the interval (0, 1], the elements entering the system can be seen as the first count process Y_1. Hence we can assume Y_1 | X_1 ∼ Poisson(λ_1). Theorem 1. Under the assumptions that Y_1 | X_1 ∼ Poisson(λ_1) and ε_n | X_n ∼ Poisson(λ_n), the PGF of Y_n | X_n is

G_{Y_n | X_n}(s) = exp{ (α^{n−1} λ_1 + α^{n−2} λ_2 + ... + λ_n)(s − 1) },

i.e., Y_n given X_n follows a Poisson distribution with mean (α^{n−1} λ_1 + α^{n−2} λ_2 + ... + λ_n). Proof. The derivation of this result is presented in the Appendix (see Section 10.1).
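Theorem 1 can be checked numerically by Monte Carlo: simulate the thinning recursion many times and compare the empirical mean and variance of Y_n with the stated Poisson mean α^{n−1}λ_1 + ... + λ_n. The sketch below uses our own function names and purely illustrative parameter values, not the authors' settings.

```python
import numpy as np

def smooth_max(x, delta):
    """g(x) = x * exp(delta x) / (1 + exp(delta x)), a smooth version of max(x, 0)."""
    z = np.clip(delta * np.asarray(x, dtype=float), -50.0, 50.0)  # avoid overflow
    return np.asarray(x, dtype=float) * np.exp(z) / (1.0 + np.exp(z))

def theoretical_mean(alpha, lam):
    """Poisson mean of Y_n | X_n from Theorem 1: alpha^{n-1} lam_1 + ... + lam_n."""
    n = len(lam)
    return float(alpha ** np.arange(n - 1, -1, -1) @ np.asarray(lam, dtype=float))

# Illustrative parameter values (ours, not fitted to any data set)
rng = np.random.default_rng(0)
alpha, beta0, beta1, beta2, t_ch, delta_n, n = 0.5, 1.5, -0.08, 0.04, 15, 0.2, 30
t = np.arange(1, n + 1)
lam = np.exp(beta0 + beta1 * smooth_max(t - t_ch, delta_n) + beta2 * t)

reps = 5000
finals = np.empty(reps)
for r in range(reps):
    y = rng.poisson(lam[0])                                # Y_1 | X_1 ~ Poisson(lambda_1)
    for s in range(1, n):
        y = rng.binomial(y, alpha) + rng.poisson(lam[s])   # thinning + innovation
    finals[r] = y
# For a Poisson marginal, the empirical mean and variance of `finals`
# should both approach theoretical_mean(alpha, lam).
```

Note that the equality of the empirical mean and variance is itself a check of the Poisson claim, since equidispersion characterizes the Poisson family here.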
Here we can also use a recursive formula as an alternative way to derive the marginal distribution, which is given by

P(Y_n = j | X_n) = Σ_{i ≥ 0} P(Y_n = j | Y_{n−1} = i, X_n) P(Y_{n−1} = i | X_{n−1}),

where the one-step transition probability P(Y_n = j | Y_{n−1} = i, X_n) is given in equation (4.1). Here the marginal mean and the marginal variance are given by

E(Y_n | X_n) = α^{n−1} λ_1 + α^{n−2} λ_2 + ... + λ_n (4.4)

and V(Y_n | X_n) = α^{n−1} λ_1 + α^{n−2} λ_2 + ... + λ_n. (4.5)

Theorem 2. Under the above setup, the auto-covariance function (ACVF) of Y_t given X_{t+h} is

Cov(Y_t, Y_{t+h} | X_{t+h}) = α^h (α^{t−1} λ_1 + α^{t−2} λ_2 + ... + λ_t).

Proof. The derivation of this result is presented in the Appendix (see Section 10.2). Hence for h ≥ 0, the ACF can be derived as follows:

ρ(h) = α^h sqrt( V(Y_t | X_t) / V(Y_{t+h} | X_{t+h}) ).

It can be seen that the above expression decays exponentially to 0 as h goes to ∞ for α ∈ (0, 1) and the restricted β_i's discussed in Section 3.2. To find the h-step ahead forecasting distribution, we use the recursion Y_{n+h} = α ∘ Y_{n+h−1} + ε_{n+h}. Thus the h-step ahead conditional mean and conditional variance are

E(Y_{n+h} | Y_n, X_{n+h}) = α^h Y_n + Σ_{j=1}^{h} α^{h−j} λ_{n+j} and V(Y_{n+h} | Y_n, X_{n+h}) = α^h (1 − α^h) Y_n + Σ_{j=1}^{h} α^{h−j} λ_{n+j}.

The h-step ahead forecasting distribution of the PINAR(1) process was derived by Freeland and McCabe (2004) using the binomial thinning operator discussed by Al-Osh and Alzaid (1987), and it turned out to be a convolution of binomial and Poisson distributions. Here we can calculate the conditional PGF of Y_{n+h} given Y_n and X_{n+h} and then derive the forecasting distribution from it. Theorem 3. The conditional PGF of Y_{n+h} given Y_n and X_{n+h} can be shown to be

G_{Y_{n+h} | Y_n, X_{n+h}}(s) = (1 − α^h + α^h s)^{Y_n} exp{ (Σ_{j=1}^{h} α^{h−j} λ_{n+j})(s − 1) }.

Proof. The derivation of this result is presented in the Appendix (see Section 10.3). Corollary 1. From the above result, we can say that the h-step ahead prediction distribution of Y_{n+h} given Y_n and X_{n+h} is a convolution of Bin(Y_n, α^h) and a random variable W ∼ Poisson(Σ_{j=1}^{h} α^{h−j} λ_{n+j}). Thus, the prediction distribution can be presented as

P(Y_{n+h} | Y_n, X_{n+h}) = Bin(Y_n, α^h) * Poisson( Σ_{j=1}^{h} α^{h−j} λ_{n+j} ),

where "*" denotes the convolution of two distributions. Theorem 4. Using Corollary 1, the h-step ahead forecasting distribution of Y_{n+h} given Y_n and X_{n+h} can be derived as

P(Y_{n+h} = j | Y_n = i, X_{n+h}) = Σ_{k=0}^{min(i,j)} C(i, k) α^{hk} (1 − α^h)^{i−k} e^{−μ_h} μ_h^{j−k} / (j − k)!, where μ_h = Σ_{j'=1}^{h} α^{h−j'} λ_{n+j'}.

Proof.
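The convolution form of the forecasting distribution translates directly into code. This is a sketch with our own naming (`lam_future` holds λ_{n+1}, ..., λ_{n+h}); it evaluates the pmf of Theorem 4 and the conditional moments exactly.

```python
from math import comb, exp, factorial

def forecast_pmf(j, h, y_n, alpha, lam_future):
    """P(Y_{n+h} = j | Y_n = y_n): convolution of Binomial(y_n, alpha^h)
    and Poisson(mu_h), with mu_h = sum_{k=1}^{h} alpha^{h-k} lam_{n+k}."""
    p = alpha ** h
    mu = sum(alpha ** (h - k) * lam_future[k - 1] for k in range(1, h + 1))
    total = 0.0
    for k in range(0, min(y_n, j) + 1):   # k thinned survivors, j - k innovations
        total += (comb(y_n, k) * p ** k * (1 - p) ** (y_n - k)
                  * exp(-mu) * mu ** (j - k) / factorial(j - k))
    return total

def forecast_mean_var(h, y_n, alpha, lam_future):
    """Conditional mean alpha^h y_n + mu_h; variance alpha^h (1 - alpha^h) y_n + mu_h."""
    p = alpha ** h
    mu = sum(alpha ** (h - k) * lam_future[k - 1] for k in range(1, h + 1))
    return p * y_n + mu, p * (1 - p) * y_n + mu
```

Since the pmf is an exact finite convolution, it sums to one and its first moment reproduces the conditional mean, which is a convenient sanity check.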
The derivation of this result is presented in the Appendix (see Section 10.4). Given an observed data set {Y_1, ..., Y_n, Y_{n+1}, ..., Y_{n+m}} of size (n + m), we partition the data into two sets. The training set, containing the first n observations, is used to estimate the parameters of the model, and based on the remaining m observations, called the test set, we define the following descriptive measure of forecasting accuracy. The h-step ahead predicted root mean squared error [denoted by PRMSE(h)] is defined as

PRMSE(h) = sqrt( (1/(m − h + 1)) Σ_{t=n}^{n+m−h} (Y_{t+h} − Ŷ^{me}_{t+h})² ), (5.2)

where Ŷ^{me}_{t+h} is the mean of the estimated h-step ahead forecasting distribution of Y_{t+h} given Y_t and X_{t+h} mentioned in Theorem 4. Intuitively, PRMSE(h) should increase in h. 6 Estimation method for the model parameters 6.1 Conditional least squares estimation Conditional least squares (CLS) estimation is commonly used for estimating the regression parameters in time series models. Freeland and McCabe (2004, 2005) used this approach for the PINAR(1) process. In order to implement the CLS method, we minimize the sum of squared deviations about the conditional expectation,

S(β) = Σ_{t=2}^{n} [Y_t − E(Y_t | Y_{t−1}, X_t)]²,

with respect to the regression parameters of the model, where E(Y_t | Y_{t−1}, X_t) = αY_{t−1} + λ_t and β is the vector of regression parameters. Numerical methods are employed to obtain the CLS estimates of the regression parameters, as there are no closed forms of the CLS estimators. In the subsequent section, we present an extensive simulation study for both the one-change-point and two-change-point settings, and the simulation results demonstrate the consistency of the CLS method empirically. Note 1.
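A minimal numerical CLS sketch for the one-change-point model follows, assuming SciPy's general-purpose Nelder-Mead optimizer; the starting values and function names are our illustrative assumptions, not the authors' settings.

```python
import numpy as np
from scipy.optimize import minimize

def cls_fit(y, t_ch, delta_n, x0):
    """Minimize S(beta) = sum_{t>=2} [Y_t - (alpha Y_{t-1} + lambda_t)]^2 for
    the one-change-point model lambda_t = exp(b0 + b1 * Z_t + b2 * t)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    x = t - t_ch
    z = np.clip(delta_n * x, -50, 50)
    z = x * np.exp(z) / (1 + np.exp(z))       # smooth-maximum covariate Z_t

    def sse(theta):
        alpha, b0, b1, b2 = theta
        lam = np.exp(b0 + b1 * z + b2 * t)
        resid = y[1:] - (alpha * y[:-1] + lam[1:])
        return float(np.sum(resid ** 2))

    res = minimize(sse, x0=np.asarray(x0, dtype=float), method="Nelder-Mead",
                   options={"maxiter": 20000, "maxfev": 20000,
                            "xatol": 1e-8, "fatol": 1e-8})
    return res.x, res.fun
```

Since the objective is smooth in the parameters, a derivative-free simplex search from a reasonable starting point is usually sufficient; gradient-based optimizers would also work.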
In maximum likelihood estimation, given a data set of size n, the likelihood function for the process is given by

L(β) = p(Y_1 | X_1) Π_{t=2}^{n} p(Y_t | Y_{t−1}, X_t).

In order to obtain the ML estimators, we maximize the log-likelihood function with respect to the regression parameters, which can be written as

log L(β) = log p(Y_1 | X_1) + Σ_{t=2}^{n} log p(Y_t | Y_{t−1}, X_t).

Here p(Y_t | Y_{t−1}, X_t) is the one-step transition probability given in Section 4. In real-life scenarios like the COVID-19 data, the number of daily active cases at time-point t (represented by Y_t) and the mean number of daily new cases at time-point t (represented by λ_t) will often be large, and hence in the R programming language we face numerical difficulties in executing the MLE method because of terms like λ_t^{j−k} ('j' is the number of daily active cases at time-point t and k = min(i, j), where 'i' is the number of daily active cases at time-point (t − 1)) involved in the likelihood function. So the estimation method we employ for data analysis is the CLS method. For the simulation studies regarding the analysis of one change-point (t_ch), X_n (the set of all covariates up to time-point n) equals {1, Z_1, P_1, Z_2, P_2, ..., Z_n, P_n}, where Z_t = (t − t_ch) exp(δ_n(t − t_ch)) / (1 + exp(δ_n(t − t_ch))) is the smooth time-varying component and P_t = t is the simple time-varying component. For the simulation studies regarding the analysis of two change-points (t_ch1 and t_ch2), the smooth time-varying components Z_1t and Z_2t are defined analogously with t_ch1 and t_ch2 in place of t_ch, and Q_t = t is the simple time-varying component. In the simulation studies, we use these components to generate data sets of varying sample sizes by the data generating processes mentioned in equation (3.2) for one change-point and equation (3.3) for two change-points. In the simulation study regarding forecasting performances, we compare our proposed model to the following model:

Y_t = α ∘ Y_{t−1} + ε_t, (7.1)
where Y_t denotes the daily number of active cases at time-point t and ε_t represents the daily number of new cases at time-point t. Here ε_t follows Poisson(λ*_t), where λ*_t is assumed to have the form λ*_t = exp( β_0 + β_1 t ). This model involves no change-point, but the innovation terms depend on a time-varying covariate. Here we perform a simulation study in order to provide 95% confidence intervals for the true change-points from the simulated data sets and examine the widths of those intervals with increasing sample size. The estimation method for the change-point(s) is discussed in Section 3.4. In order to perform this simulation study, we simulate data from the proposed model with (1) one change-point (given in equation (3.2)), and (2) two change-points (given in equation (3.3)). Two sets of regression parameters are considered for each of the above two data generating cases. Three different sample sizes (n) of 400, 450 and 500 are explored. Throughout this simulation study, we consider two different values of δ_n, 0.1 and 0.2. All the simulation results are based on 1000 Monte Carlo replications. For the one-change-point simulation study, we assume the value of the true change-point t_ch to be 0.5n, where n is the sample size of the data. Two sets of regression parameters used in the data generating process are β_1 = (α, β_0, β_1, β_2) = (0.4, 1.5, −0.08, 0.04) and (0.7, −1, −0.09, 0.05). For each set of the regression parameters and the tuning parameter δ_n, we simulate the data using model (3.1) with λ_t given in equation (3.2). Here, for the data generating method of one change-point, X_n, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n, as described in Section 7.1, where n is the sample size of the simulated data set.
The process is repeated 1000 times, and we report the 95% confidence intervals in Tables 1 and 2, where we can see that as the sample size increases, the widths of the confidence intervals decrease. For the simulation study of two change-points, the true change-points t_ch1 and t_ch2 are assumed to be 0.4n and 0.6n, respectively. Two sets of values of the regression parameters used in the data generating process are β_2 = (α, β_0, β_1, β_2, β_3) = (0.5, 3.5, −0.1, 0.1, 0.05) and (0.7, 4.5, −0.08, 0.09, 0.04). For each set of the regression parameters and the tuning parameter δ_n, we simulate the data using model (3.1) with λ_t given in equation (3.3). The estimation method for the change-points is discussed in Section 3.4. Here, for the data generating method of two change-points, X_n, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n, as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and the 95% confidence intervals are reported in Tables 3 and 4. From the tables, we can see that as the sample size increases, the widths of the confidence intervals decrease. Here we perform a simulation study to investigate the consistency of the estimation method used for the proposed model. In order to perform this simulation study, we simulate data from the proposed model with (1) one change-point (given in equation (3.2)), and (2) two change-points (given in equation (3.3)). Three sets of regression parameters are considered for each of the above two data generating cases. Those values are mentioned in the subsequent sections. Three different sample sizes (n) of 100, 200 and 500 are explored. Throughout this simulation study, we consider three different values of δ_n: 0.1, 0.5 and 1. All the simulation results are based on 1000 Monte Carlo replications.
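The normal-approximation intervals reported in Tables 1-4 can be computed from the Monte Carlo replicates. The sketch below (our helper name, not the authors' code) assumes the interval is mean ± z_{0.975} times the standard deviation of the replicated change-point estimates.

```python
import numpy as np
from statistics import NormalDist

def normal_approx_ci(estimates, level=0.95):
    """Normal-approximation interval: mean +/- z * sd of the Monte Carlo
    change-point estimates (sd across replications, not divided by reps)."""
    est = np.asarray(estimates, dtype=float)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # ~1.96 for level = 0.95
    m, s = est.mean(), est.std(ddof=1)
    return m - z * s, m + z * s
```

As the series length n grows, the replicated estimates concentrate around the true change-point, so the interval width shrinks, matching the pattern seen in the tables.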
For the one-change-point simulation study, we assume the value of the change-point t_ch to be 0.4n, where n is the sample size of the data. Three sets of regression parameters used in the data generating process are β_1 = (α, β_0, β_1, β_2) = (0.5, 0.1, −0.2, 0.02), (0.6, −0.2, −0.04, 0.02) and (0.8, −0.1, −0.02, 0.01). For each set of the regression parameters and the tuning parameter δ_n, we simulate the data using model (3.1) with λ_t given in equation (3.2). Then we estimate the regression parameters using the CLS estimation method. Here, for the data generating method of one change-point, X_n, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n, as described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 1000 times, and we report the mean estimates and MSEs of the regression parameters in Tables 5, 6 and 7, from which we can see that as the sample size increases, the MSE of the estimated regression parameters decreases. This empirically establishes the consistency of the CLS estimation. For each set of the regression parameters and the tuning parameter δ_n, we simulate the data using model (3.1) with λ_t given in equation (3.3). Then we estimate the regression parameters using the CLS estimation method for a given simulated data set. Here, for the data generating method of two change-points, X_n, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components up to time-point n, as described in Section 7.1, where n is the sample size of the simulated data set.
The process is repeated 1000 times, and the combined mean estimates and MSEs of the regression parameters are reported in Tables 8, 9, and 10. As the sample size increases, the MSEs of the estimated regression parameters decrease, again empirically supporting the consistency of the CLS estimation.

Another simulation study examines the h-step ahead forecasting performance of the proposed process for varying h, relative to the comparison method in equation (7.1). As the forecasting criterion, we use PRMSE(h), defined in equation (5.2). We simulate data from the proposed model with (i) one change-point (equation (3.2)) and (ii) two change-points (equation (3.3)). Two sets of regression parameters are considered for each of the two data-generating cases, and two values of δ n, 0.1 and 0.2, are used. Each time we generate a sample of size 100, of which a training set of size 85 is used to fit the two models under comparison and a test set of size 15 is used to compute PRMSE(h) for h = 1, 2, 3. This procedure is repeated 100 times.

For the one change-point simulation study, we set the change-point t ch to 0.3n, where n is the sample size of the data. Two sets of regression parameters are used in the data-generating process: β 1 = (α, β 0, β 1, β 2) = (0.3, −0.8, −0.12, 0.09) and (0.2, −0.5, −0.15, 0.1). For each set of regression parameters and each value of δ n, we simulate data from model (3.1) with λ t given in equation (3.2).
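For the forecasting comparison, the h-step conditional-mean forecast of a PINAR(1) process follows from its forecasting distribution: E[Y n+h | Y n, X n+h] = α^h Y n + Σ_{j=1..h} α^(h−j) λ n+j. A sketch follows, with PRMSE taken here as a root mean squared prediction error; the paper's equation (5.2) gives the exact definition.

```python
import numpy as np

def h_step_forecast(y_n, alpha, lam_future, h):
    """Conditional-mean h-step forecast for PINAR(1):
    E[Y_{n+h} | Y_n] = alpha^h * Y_n + sum_{j=1..h} alpha^{h-j} * lam_{n+j}."""
    weights = alpha ** np.arange(h - 1, -1, -1)    # alpha^{h-1}, ..., alpha^0
    return alpha ** h * y_n + np.dot(weights, lam_future[:h])

def prmse(actuals, forecasts):
    """Root mean squared prediction error (sketch of the criterion in eq. (5.2))."""
    a, f = np.asarray(actuals, float), np.asarray(forecasts, float)
    return np.sqrt(np.mean((a - f) ** 2))

# deterministic check: with alpha = 0.5, Y_n = 10, lam = [4, 4, 4],
# the 2-step forecast is 0.25*10 + 0.5*4 + 1*4 = 8.5
f2 = h_step_forecast(10, 0.5, np.array([4.0, 4.0, 4.0]), 2)
```

The geometric down-weighting of Y n by α^h is what drives the increase of PRMSE(h) in h noted in Section 5.2: as h grows, the forecast relies less on the observed state and more on the innovation means.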
For this data-generating scheme, X n, the set of all covariates up to time-point n, consists of both the smooth time-varying components and the simple time-varying components described in Section 7.1, where n is the sample size of the simulated data set. The process is repeated 100 times, and the h-step ahead forecasting performances of both the proposed model and the comparison model for h = 1, 2, 3 are reported in Tables 11 and 12, where the average PRMSE(h) of the proposed process is consistently smaller than that of the comparison process. The measure also shows an increasing pattern in h, which coincides with the theoretical result in Section 5.2.

For the study of two change-points, the change-points t ch1 and t ch2 are set to 0.2n and 0.6n, respectively. Two sets of regression parameters β 2 = (α, β 0, β 1, ...) are used in the data-generating process. The results are reported in Tables 13 and 14, where the average PRMSE(h) of the proposed process is again smaller than that of the comparison process, and the measure again increases in h, coinciding with the theoretical result in Section 5.2.

In this section, we consider two real data sets, (i) Italy COVID-19 data with a total of 113 observations and (ii) Kerala COVID-19 data with a total of 90 observations, to illustrate the usefulness of the proposed model, comparing it with the comparison model of equation (7.1). To investigate predictive performance, for the Italy data we take the first 100 observations as the training set and the remaining 13 as the test set; for the Kerala data, we use 78 observations for training and the remaining 12 for testing.
In this section, we analyze the COVID-19 data of daily active cases in Italy (described in Section 2.1) with the proposed method, and also fit the comparison model (equation (7.1)) to this data set. We consider 113 data points from February 15 to June 6, so n = 113. Here X n, the set of all time-varying covariates up to time-point n, contains both the smooth time-varying covariates carrying the change-point and the simple time-varying covariates, as described in Section 7.1. The daily time series plot suggests a single change-point during this period, so we fit the proposed model with one change-point (equation (3.2)). The change-point t ch, estimated by the method of Section 3.4, is the 36th time-point. This indicates that the number of cases increased up to March 21 (36 days from February 15), after which the number of daily new cases started decreasing gradually.

To estimate the optimal δ n, we consider a grid of points in the interval [0.1, 10] with an increment of 0.1. For each δ n in the grid, we fit the one change-point model to the data and compute the goodness-of-fit measure, the root mean squared error (RMSE); the optimal δ n is the grid value minimizing this measure. Here δ n = 0.1 gives the minimum RMSE of 949.85, so the estimated value of δ n is 0.1. For this data set, the CLS estimates of the regression parameters of the proposed model are β 1 cls = (α cls, β 0 cls, β 1 cls, β 2 cls) = (0.9703, 3.5562, −0.1705, 0.1381), and those of the comparison model are (α* cls, γ 0 cls, γ 1 cls) = (0.9925, 7.5794, −0.0156).
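The δ n selection just described is a one-dimensional grid search. A minimal sketch is given below; the fitted_values function is a toy stand-in (a simple smoother) for refitting the change-point model by CLS at each grid value, which is what the paper actually does.

```python
import numpy as np

def rmse(y, yhat):
    return np.sqrt(np.mean((np.asarray(y) - np.asarray(yhat)) ** 2))

def fitted_values(y, delta):
    """Toy stand-in for the CLS-fitted change-point model: an exponential
    smoother whose smoothness depends on delta (illustration only)."""
    w = 1.0 / (1.0 + delta)
    yhat = np.empty(len(y), dtype=float)
    yhat[0] = y[0]
    for t in range(1, len(y)):
        yhat[t] = w * y[t - 1] + (1 - w) * yhat[t - 1]
    return yhat

def select_delta(y, grid):
    """Grid search: refit for each delta and keep the RMSE-minimizing value."""
    scores = [rmse(y[1:], fitted_values(y, d)[1:]) for d in grid]
    best = int(np.argmin(scores))
    return grid[best], scores[best]

rng = np.random.default_rng(3)
y = rng.poisson(20, size=120).astype(float)        # placeholder count series
grid = np.arange(0.1, 10.01, 0.1)                  # the paper's grid [0.1, 10], step 0.1
delta_hat, best_rmse = select_delta(y, grid)
```

Because each grid point requires a full refit, the search cost is linear in the grid size; the 100-point grid used for the Italy data keeps this cheap.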
The RMSE of the proposed model on this data set is 949.85, much lower than the 1940.95 of the comparison model. Figure 5 plots the RMSE against each δ n in the grid [0.1, 10], and Figure 6 shows the original data together with the fitted values from both models. A close look at Figure 6 reveals that the fitted values from the proposed model overlap with major portions of the original data, whereas the comparison model's fit is visibly separated from the data over those portions. Because the magnitudes of the observed counts are very high, the differences between the two fitted series appear small in Figure 6; the RMSEs therefore convey the gap between the two processes more clearly than the plot. Overall, the fit of the proposed model is good.

To study forecasting performance, we partition the data into two sets. As described earlier, the training set of the first 100 observations is used to fit the models, and the test set of the remaining 13 observations is used to compute the forecasting measure PRMSE for both models. For this setup, the CLS estimates of the regression parameters of the proposed model are β 1 cls = (α cls, β 0 cls, β 1 cls, β 2 cls) = (0.9700, 3.6842, −0.1638, 0.1342), and those of the comparison model are (α* cls, γ 0 cls, γ 1 cls) = (0.9930, 7.5369, −0.0123). For one-step ahead forecasting (h = 1), the PRMSEs of the proposed and comparison models are 1049.62 and 1903.99, respectively.
For two-step ahead forecasting (h = 2), the PRMSEs of the proposed and comparison models are 1994.67 and 3788.10, respectively. For both one-step and two-step ahead forecasting, the proposed model therefore performs much better than the comparison model.

Here we analyze the COVID-19 data of daily active cases in Kerala (see Section 2.2) with the proposed method, and also fit the comparison model (equation (7.1)) to this data set. We consider 90 data points from March 9 to June 6, so n = 90. Here X n consists of both the smooth time-varying covariates carrying the two change-points and the simple time-varying covariates, as described in Section 7.1. The daily time series plot suggests two change-points during this period, so we fit the proposed model with two change-points (equation (3.3)). The change-points t ch1 and t ch2, estimated by the method of Section 3.4, are the 19th and 54th time-points. This indicates that the number of cases increased up to March 27 (19 days from March 9), then the number of daily new cases started decreasing gradually, but after May 1 (54 days from March 9) the cases began to rise again. To estimate the optimal δ n for these data, we follow the same grid-search procedure as for the Italy data; δ n = 0.2 gives the minimum RMSE of 12.88, so the estimated value of δ n is 0.2. For this data set, the CLS estimates of the regression parameters of the proposed model are β 2 cls = (α cls, β 0 cls, β 1 cls, β 2 cls, β 3 cls) = (0.8700, −0.1156, −0.2352, 0.1269, 0.1956), and those of the comparison model are (α* cls, γ 0 cls, γ 1 cls) = (0.9921, −2.8129, −0.0811).
The RMSE of the proposed model on this data set is 12.88, lower than the 15.10 of the comparison model. Figure 7 plots the RMSE against each δ n in the grid [0.1, 10], and Figure 8 shows the original data together with the fitted values from both models. Studying Figure 8 closely, the fitted values from the proposed model overlap with major portions of the original data, whereas the comparison model's fit is visibly separated from the data over those portions; here too the RMSEs make the difference between the two methods easy to see. Overall, the fit of the proposed model is good.

To study forecasting, we partition the data into two sets. As described earlier, the training set of the first 78 observations is used to fit the models, and the test set of the remaining 12 observations is used to compute the forecasting measure PRMSE for both models. For this setup, the CLS estimates of the regression parameters of the proposed model are β 2 cls = (α cls, β 0 cls, β 1 cls, β 2 cls, β 3 cls) = (0.9021, −0.5960, −0.2897, 0.2164, 0.2171), and those of the comparison model are (α* cls, γ 0 cls, γ 1 cls) = (0.9947, −9.7129, 0.1759). For one-step ahead forecasting (h = 1), the PRMSEs of the proposed and comparison models are 154.06 and 275.80, respectively; for two-step ahead forecasting (h = 2), they are 186.99 and 348.68. For both one-step and two-step ahead forecasting, the proposed model again performs much better than the comparison model.
The classical INAR(1) process (McKenzie, 1985; Al-Osh and Alzaid, 1987) has received significant attention owing to its simplicity and is widely used for count time series data. However, it cannot model count time series, such as the COVID-19 data, that contain change-points and time-varying covariates. In this article, we have developed a new PINAR(1) model based on the binomial thinning operator to handle change-point analysis with time-varying covariates. The development of the proposed model is inspired by Chan and Tong (1986), Hansen (2000), and Fong et al. (2017), who worked mainly with continuous data. We use the concept of the smooth maximum to construct the smoothing change-point function, which enables the model to capture the changing curvature in the data. The key feature of the proposed model is its ability to accommodate both change-points and time-varying covariates; these are exactly the features of the COVID-19 data sets that motivated both the one change-point and the two change-point versions of the model. We have studied the distributional forms of the proposed model along with the h-step ahead forecasting distribution. Because estimating the regression parameters by maximum likelihood is difficult here, we employ the CLS estimation method. An extensive simulation study of the confidence intervals for the true change-points shows that their widths decrease as the sample size increases, and the simulation results also demonstrate the consistency of the CLS estimation method. In the data applications, the proposed model performs much better than the comparison model with respect to a standard statistical measure such as RMSE.
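The smooth maximum referred to above is commonly taken to be the log-sum-exp function with a temperature parameter δ; the paper's exact change-point construction may differ, so the form below is an assumption for illustration only.

```python
import numpy as np

def smooth_max(x, y, delta):
    """Log-sum-exp smooth maximum (a common form, assumed here):
        smax_delta(x, y) = delta * log(exp(x/delta) + exp(y/delta)),
    which tends to max(x, y) as delta -> 0 while remaining differentiable,
    so a piecewise-linear change-point trend becomes a smooth curve."""
    m = np.maximum(x, y)   # subtract the max before exponentiating, for stability
    return m + delta * np.log(np.exp((x - m) / delta) + np.exp((y - m) / delta))

# a small delta gives a near-hard maximum; a larger delta rounds the kink
near_hard = smooth_max(2.0, 3.0, 0.01)   # ~= 3.0
rounded = smooth_max(2.0, 3.0, 1.0)      # = 3 + ln(1 + e^-1) ~= 3.3133
```

This is why δ n acts as a tuning parameter in the data analyses: it controls how sharply the fitted trend bends at the estimated change-point.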
The proposed model also performs much better than the comparison model in forecasting, with respect to the accuracy measure PRMSE, in both the simulation study and the data analysis. The model can further be extended to more than two change-points in the same way that the one change-point model was extended to two change-points. We therefore hope that the proposed model can be a viable choice for modeling these kinds of count time series data sets.

Write Y n = α ∘ Y n−1 + ε n, where the thinning α ∘ Y n−1 is a sum of {N 1i}, a sequence of iid Bernoulli(α) random variables independent of Y n−1; given X n, α ∘ Y n−1 and ε n are independent. Hence the PGF of Y n | X n satisfies the recursive formula

Φ Yn|Xn (s) = Φ Yn−1|Xn (1 − α + αs) Φ εn|Xn (s). (10.1)

Putting n = 2 in equation (10.1) gives Y 2 | X 2 ∼ Poisson(αλ 1 + λ 2), and n = 3 gives Y 3 | X 3 ∼ Poisson(α² λ 1 + αλ 2 + λ 3). Now assume that Y k−1 | X k−1 ∼ Poisson(α k−2 λ 1 + α k−3 λ 2 + ... + λ k−1). Then for n = k,

Φ Yk|Xk (s) = exp[−(α k−1 λ 1 + α k−2 λ 2 + ... + αλ k−1)(1 − s) − λ k (1 − s)] = exp[−(α k−1 λ 1 + α k−2 λ 2 + ... + λ k)(1 − s)],

so Y k | X k ∼ Poisson(α k−1 λ 1 + α k−2 λ 2 + ... + λ k). This completes the proof.

The ACVF is given by γ(t, h) = α h (α t−1 λ 1 + α t−2 λ 2 + ... + λ t). This completes the proof.

For the conditional PGF, since α h ∘ Y n | Y n, X n+h ∼ Bin(Y n, α h), we can write E[s^(α h ∘ Y n) | Y n, X n+h] = (1 − α h + α h s)^Y n. This completes the proof.
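As a quick numerical sanity check on the induction (not part of the paper), the closed-form mean it implies, E[Y n | X n] = α^(n−1) λ 1 + α^(n−2) λ 2 + ... + λ n, must agree with the one-step mean recursion E[Y t] = α E[Y t−1] + λ t obtained directly from the thinning representation:

```python
import numpy as np

alpha = 0.6
lam = np.array([2.0, 3.5, 1.2, 4.1, 0.7])   # arbitrary innovation means
n = len(lam)

# closed form from the induction: E[Y_n] = sum_k alpha^{n-k} * lam_k
closed = sum(alpha ** (n - k) * lam[k - 1] for k in range(1, n + 1))

# one-step recursion: E[Y_t] = alpha * E[Y_{t-1}] + lam_t
mean = 0.0
for l in lam:
    mean = alpha * mean + l

# both routes give 4.6072 for these inputs
```

The agreement reflects that binomial thinning scales the previous mean by α while the Poisson innovation adds λ t, exactly the convolution structure used in the PGF recursion (10.1).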
References
First-order integer-valued autoregressive (INAR(1)) process.
An integer-valued pth-order autoregressive structure (INAR(p)) process.
Econometric models based on count data: Comparisons and applications of some estimators and tests.
Inference for optimal dynamic treatment regimes using an adaptive m-out-of-n bootstrap scheme.
On estimating thresholds in autoregressive models.
Some recent progress in count time series.
chngpt: threshold regression model estimation and inference.
Asymptotic properties of CLS estimators in the Poisson AR(1) model.
Forecasting discrete valued low count time series.
Sample splitting and threshold estimation.
An Introduction to Statistical Learning: with Applications in R.
First-order integer valued AR processes with zero inflated Poisson innovations.
The integer-valued autoregressive (INAR(p)) model.
Time series of zero-inflated counts and their coherent forecasting.
Some simple models for discrete variate time series.
Autoregressive moving-average processes with negative-binomial and geometric marginal distributions.
A new geometric first-order integer-valued autoregressive (NGINAR(1)) process.
Compound Poisson INAR(1) processes: Stochastic properties and testing for overdispersion.
Smooth Maximum.
Discrete analogues of self-decomposability and stability.
On a test whether two samples are from the same population.

Acknowledgments
We would like to thank the Associate Editor and the Reviewers for their constructive comments, which led to substantial improvement over the earlier version.

The h-step ahead forecasting distribution is given by Y n+h | Y n, X n+h = α h ∘ Y n + ε, with ε | X n+h ∼ Poisson(α h−1 λ n+1 + α h−2 λ n+2 + ... + λ n+h). This completes the proof.