key: cord-0488441-l7zdj9vh authors: Roy, Arkaprava; Karmakar, Sayar title: Bayesian semiparametric time varying model for count data to study the spread of the COVID-19 cases date: 2020-04-05 journal: nan DOI: nan sha: 9234d02adc2a4f97e0b24bfcdfe8d9fef56fb2a9 doc_id: 488441 cord_uid: l7zdj9vh

The recent outbreak of the novel coronavirus COVID-19 has affected all of our lives in one way or another. While medical researchers are working hard to find a cure and doctors and nurses attend to the affected individuals, measures such as 'lockdown', 'stay-at-home' and 'social distancing' are being implemented in different parts of the world to curb its further spread. To model this spread, which is assumed to be a non-stationary count-valued time series, we propose a novel time varying semiparametric AR$(p)$ model for the count valued data of newly affected cases, collected every day, and also extend it to propose a novel time varying semiparametric INGARCH model. We calculate posterior contraction rates of the proposed Bayesian methods. Our proposed model structures are amenable to Hamiltonian Monte Carlo (HMC) sampling for efficient computation. We show excellent performance in simulations. Our method is then applied to the daily time series data of newly confirmed cases to study the spread under different government interventions.

Since there is no vaccine in the market yet, governments are implementing strong measures such as city-wide or state-wide 'lockdowns', 'stay-at-home' notices and extensive testing to control the outbreak. Thus, this count-valued time series of daily new cases of infection is expected to vary greatly. While initially, owing to the contagious nature of the disease, the spread looks exponential, once some measures are undertaken the number of cases goes down, but at varying rates depending on how strict the enforcement is, how dense the population is, and so on. There has been ample research in a very short time discussing the effectiveness of lockdowns and forecasting the future path of the virus's spread. But a comprehensive understanding of the statistical model, its estimation and its uncertainty quantification remains inadequate. The SIR model (Song et al., 2020), the proportional model (Deb and Majumdar, 2020), some Bayesian epidemic models (Clancy et al., 2008; Jewell et al., 2009), etc. have been used in the context of continuous modelling because it benefits from a time-series formulation and ready computation. Relatively straightforward models have been considered, such as polynomial trends or the presence of an ARMA structure. However, we wish to stick to the actual daily count of new infections, and this brings us to the unique juncture of analyzing a count time series with smoothly varying coefficients.

Modelling count time series is important in many different fields such as disease incidence, accident rates, and integer-valued financial data such as price movements. This relatively new research stream was introduced in Zeger (1988), and interestingly he analyzed another outbreak, namely the 1970 US polio incidence rate. This stream was furthered by Chan and Ledolter (1995), where Poisson generalized linear models (GLM) with an autoregressive latent process in the mean are discussed. A wide range of dependence was explored in Davis et al. (2003) for a simple autoregressive (AR) structure and external covariates.
On the other hand, a different stream explored integer-valued time series models such as ARMA structures, as in (Brandt and Williams, 2001; Biswas and Song, 2009), or the INGARCH structure, as in Zhu (2011, 2012c). However, from a Bayesian perspective, the only work to the best of our knowledge is that of Silveira de Andrade et al. (2015), where the authors discussed an ARMA model for different count series parameters. However, their treatment of ignoring zero-valued data and of imposing the MA structure via demeaned Poisson random variables remains questionable. None of these works focused on the time-varying nature of the coefficients, except for a brief mention in . Rapid change in the observed counts makes all earlier time-constant analyses invalid and builds a path where we can explore methodological and inferential development in tracking the trajectory of this spread.

We propose a novel semiparametric time varying autoregressive model for counts to study the spread and examine the effects of these interventions through the time-varying coefficient functions. A time varying AR(p) process consists of a time-varying mean/intercept function along with time varying autoregressive coefficient functions. We further generalize it to a time varying Bayesian integer valued generalized autoregressive conditional heteroscedasticity (TVBINGARCH) model where the conditional mean also depends on the past conditional means. Given the exponential trend of the spread, it is expected that the conditional mean would vary with the level of the series.

Our goals are motivated both by the application and by methodological development. First, this is, to the best of our knowledge, the first such work on count time series with time-varying coefficients within an autoregressive framework. The mean function stands for the overall spread and the autoregressive coefficients stand for different lags. Since carriers of this virus can be largely asymptomatic for the first few days, we wish to identify which lags are significant in our model; these can be directly linked to how many days the infection spread before symptoms showed up. We show that for different areas lags 6 to 10 are significant. These findings are in line with several research articles discussing the incubation length for the novel coronavirus, with a median of 5-6 days and 98% below 11 days; for example, see Lauer et al. (2020). A few provinces, states and countries have issued lockdowns or stay-at-home orders of various degrees, and we find that even after these orders are in effect it takes about 12-16 days to reach the peak, after which the intercept coefficient function starts decreasing. This is also an interesting finding, which characterizes the fact that the number of infected but asymptomatic cases is large compared to the new cases reported.

In addition to the time-varying AR model, we also offer an analysis via a time-varying INGARCH model that assumes an additional recursive term in the conditional expectation (see (2.9)) and offers more comprehensiveness in the modelling, since even an INGARCH model with small orders removes the need to choose an appropriate maximum lag value. This model is an extension of the GARCH model to the context of count data. First introduced by Ferland et al. (2006), these models were thoroughly analyzed in Zhu (2011, 2012a,b,c) and Ahmad and Francq (2016). We propose a time-varying analogue of this model, which is also a new contribution.
Finally, we contrast the time-varying AR and INGARCH models in both simulations and real-data applications under different windows of evaluation.

An important criticism of the estimates of the basic reproduction number in the various research items over the past two months of February and March 2020 is their huge degree of variability. This skepticism is natural, as the serial interval distribution of a disease cannot be estimated consistently unless we have an exact infector/infectee pair dataset. The majority of this research pulls out these estimates based on what we know about two other outbreaks, namely SARS and MERS. One can easily see that the R0 estimates from the popular R0 package by Obadia et al. (2012) depend on the start and end dates and on a plug-in value of the serial distribution. Keeping this in mind, we decided NOT to have an R0 coefficient in our model. Instead we focus on the autoregressive coefficient functions and provide insights which we believe can be used to develop new estimates of the basic reproduction number from the data itself, without having to rely on the serial interval distribution from other diseases. In our present context, the number of affected cases could be covered by the popular SIR model; however, such models assume additional structure on how these numbers evolve and then try to estimate the rate. Instead, we do not assume any such specific evolution and offer a general perspective. Our simulation results corroborate consistent estimation of the unknown functions.

Regression models with varying coefficients were introduced by Hastie and Tibshirani (1993). They modeled the varying coefficients using cubic B-splines. Later, these models were further explored in various directions (Gu and Wahba, 1993; Biller and Fahrmeir, 2001; Fan and Zhang, 2008; Franco-Villoria et al., 2019; Yue et al., 2014). Spline bases have been routinely used to model time varying coefficients within non-linear time series models (Cai et al., 2000; Huang et al., 2002; Huang and Shen, 2004; Amorim et al., 2008). We also consider B-spline series based priors to model the time-varying coefficients in our model. In the Bayesian literature on non-linear function modeling, B-spline series based priors have been extensively developed under different shape constraints (He and Shi, 1998; Meyer, 2012; Das and Ghosal, 2017; Mulgrave et al., 2018; Roy et al., 2018). To the best of our knowledge, there is no other work that imposes the shape constraints of (2.2) or (2.10), required for a time-varying AR or GARCH model, using B-spline series. We also calculate posterior contraction rates based on minimal assumptions. Although Poisson nonparametric regression is discussed in Ghosal and Van der Vaart (2017), the most intuitive link functions that we consider here, due to the autoregressive nature of the model, are beyond the coverage of their book. We also discuss a pointwise inferential tool by drawing credible intervals. Such tools are important to keep an objective perspective on the evolution of the time-varying coefficients without restricting it to some popular models. See for a comprehensive discussion on time-varying models and their applications.

The rest of the paper is organized as follows. Section 2 describes the proposed Bayesian models in detail. Section 3 discusses an efficient computational scheme for the proposed method. We calculate posterior contraction rates in Section 4.
We study the performance of our proposed method in capturing the true coefficient functions and show excellent performance over other existing methods in Section 5. Section 6 deals with an application of the proposed method to the COVID-19 spread in different countries. Then, we end with discussions, concluding remarks and possible future directions in Section 7. The supplementary materials contain theoretical proofs and some additional results.

In this paper, our primary focus is on modeling the daily count of newly confirmed cases of COVID-19 in a nonparametric way. Instead of compartmental models from epidemiology, we choose to focus on building a model solely out of the COVID-19 dataset. This brings us to Poisson autoregression with time-varying coefficients. Let {X_t} be a count-valued time series. In this paper, we consider two different structures for the conditional mean of {X_t} given the history of the process. The first modeling framework is in the spirit of time varying autoregressive models. Next, we consider another modeling structure in the direction of time varying generalized autoregressive conditional heteroscedasticity models.

The linear Poisson autoregressive model (Zeger, 1988; Brandt and Williams, 2001) is very popular in the literature for analysing count valued time series. Due to the assumed non-stationary nature of the data, we propose a time-varying version of this model. Thus, the conditional distribution for the count-valued time-series X_t given F_{t-1} = {X_i : i ≤ t-1} is

X_t | F_{t-1} ∼ Poisson(λ_t),  λ_t = μ(t/T) + Σ_{i=1}^p a_i(t/T) X_{t-i}.  (2.1)

We call our method the Time Varying Bayesian Auto Regressive model for Counts (TVBARC). The rescaling of the time-varying parameters to the support [0,1] is usual for nonparametric function modeling. Due to the Poisson link in (2.1), both the conditional mean and the conditional variance depend on the past observations. The conditional expectation of X_t in the above model is λ_t. Additionally, we impose the following constraints on the parameter space for the time-varying parameters,

P_1 = { (μ, a_1, ..., a_p) : μ(x) ≥ 0, a_i(x) ≥ 0 for all x ∈ [0,1], Σ_{i=1}^p sup_x a_i(x) < 1 }.  (2.2)

Note that the condition on the AR parameters imposed by (2.2) is popular in the time-varying AR literature; see Dahlhaus and Subba Rao (2006); Fryzlewicz et al. (2008); for example. Even though the condition on μ(·) seems restrictive in light of what we need for an invertible time-constant AR(p) process with Gaussian errors, it is not unusual when such a function is used to model variance parameters to ensure positivity; it was uniformly imposed in all the literature mentioned above. Additionally, the above references depend heavily on local stationarity: namely, for every 0 < x < 1, they assume the existence of an auxiliary process X̃_t(x) which is close to the observed process. One key advantage of our proposal is that it is free of any such assumption. Our assumption of only the first moment is also very mild. Moreover, except for a very general linear model discussed in , to the best of our knowledge, this is the very first analysis of time-varying parameters for count time series modelled by Poisson regression. Thus we choose to focus on the methodological development rather than proving optimality of these conditions. When p = 0, our proposed model reduces to the routinely used nonparametric Poisson regression model as in Shen and Ghosal (2015). To proceed with Bayesian computation, we put priors on the unknown functions μ(·) and the a_i(·)'s such that they are supported in P_1. The prior distributions on these functions are induced through basis expansions in B-splines with suitable constraints on the coefficients to impose the shape constraints as in P_1.
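To make the data-generating mechanism in (2.1)-(2.2) concrete, the following minimal R sketch simulates a series from a TVBARC(1) specification. The coefficient functions mu_fun and a1_fun are illustrative choices that satisfy the constraints in (2.2) (nonnegative intercept, nonnegative AR coefficient with supremum below one); they are not the authors' code and not the simulation settings used later in Section 5.

```r
# Minimal sketch: simulate X_t | F_{t-1} ~ Poisson(lambda_t),
# lambda_t = mu(t/T) + a1(t/T) * X_{t-1}, as in model (2.1) with p = 1.
set.seed(1)
T_len <- 500
mu_fun <- function(x) 10 + 5 * sin(pi * x)  # illustrative intercept function, mu(x) >= 0
a1_fun <- function(x) 0.3 + 0.2 * x         # illustrative AR(1) function, sup_x a1(x) = 0.5 < 1

X <- numeric(T_len)
X[1] <- rpois(1, mu_fun(1 / T_len))         # initialize without an autoregressive term
for (t in 2:T_len) {
  lambda_t <- mu_fun(t / T_len) + a1_fun(t / T_len) * X[t - 1]
  X[t] <- rpois(1, lambda_t)
}
plot(X, type = "h", xlab = "t", ylab = "daily count")
```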
A detailed description of the priors is given in (2.3)-(2.8). Here the B_j's are the B-spline basis functions. The parameters δ_j's are unbounded. The prior induced by the above construction is supported in P_1, and the verification is straightforward. In the above construction, the constraint Σ_{i=1}^p sup_x a_i(x) < 1 in (2.2) fails if and only if δ_0 = −∞, which has probability zero. On the other hand, we also have μ(·) ≥ 0, as exp(β_j) ≥ 0. Thus, the induced priors described in (2.3)-(2.8) are well supported in P_1.

In the previous model, both the conditional mean and the conditional variance depend on the past observations. However, Ferland et al. (2006) proposed an integer valued analogue of the generalized autoregressive conditional heteroscedasticity (GARCH) model after observing that the variability in the number of cases of campylobacteriosis infections also changes with the level. Given the complexity of the COVID-19 data, we also introduce the following time-varying version of the integer valued generalized autoregressive conditional heteroscedasticity (INGARCH) model for counts. The conditional distribution for the count-valued time-series X_t given F_{t-1} is

X_t | F_{t-1} ∼ Poisson(λ_t),  λ_t = μ(t/T) + Σ_{i=1}^p a_i(t/T) X_{t-i} + Σ_{j=1}^q b_j(t/T) λ_{t-j}.  (2.9)

We call our method the Time Varying Bayesian Integer valued Generalized Auto Regressive Conditional Heteroscedastic (TVBINGARCH) model. We impose the following constraints on the parameter space, similar to Ferreira et al. (2017),

P_2 = { (μ, a_1,...,a_p, b_1,...,b_q) : μ(x) ≥ 0, a_i(x) ≥ 0, b_j(x) ≥ 0 for all x ∈ [0,1], Σ_{i=1}^p sup_x a_i(x) + Σ_{j=1}^q sup_x b_j(x) < 1 }.  (2.10)

This constraint ensures a unique solution of the time varying INGARCH process, as discussed in Ferreira et al. (2017). Now, we modify the proposed prior from the previous subsection to put priors on the functions μ(·), a_i(·) and b_j(·) such that they are supported in P_2. Using the B-spline bases, we put the following hierarchical prior on the unknown functions. Similar calculations as in the previous subsection show that the above hierarchical prior in (2.11)-(2.19) is well supported in P_2. We primarily focus on the special case where p = 1, q = 1.

In this section, we discuss the Markov chain Monte Carlo (MCMC) sampling method for posterior computation. Our proposed sampler relies on the gradient based Hamiltonian Monte Carlo (HMC) sampling algorithm (Neal et al., 2011). Hence, we show the gradient computations of the likelihood with respect to the different parameters for TVBARC(p) and TVBINGARCH(p, q) in the following two subsections. The complete likelihood L_1 of the proposed Bayesian method in (2.1) is given by

L_1 = Π_{t=1}^T exp(−λ_t) λ_t^{X_t} / X_t!,  λ_t = μ(t/T) + Σ_{i=1}^p a_i(t/T) X_{t-i}.

We develop an efficient MCMC algorithm to sample the parameters β, θ and δ from the above likelihood. The derivatives of the above likelihood with respect to the parameters are easily computable. This helps us develop an efficient gradient-based MCMC algorithm to sample these parameters. We calculate the gradients of the negative log-likelihood (− log L_1) with respect to the parameters β, θ and δ. The gradients are given below, where 1_{j=k} stands for the indicator function which takes the value one when j = k.

The complete likelihood L_2 of the proposed Bayesian method in (2.9) is given by the analogous product of Poisson densities with conditional mean λ_t from (2.9). We calculate the gradients of the negative log-likelihood (− log L_2) with respect to the parameters β, θ, η and δ. The gradients are given below. While fitting TVBINGARCH(p, q), we assume X_t = 0 and λ_t = 0 for any t < 0. Thus, we additionally need to estimate the parameter λ_0. The derivative of the likelihood with respect to λ_0 is calculated numerically using the jacobian function from the R package pracma. Hence, it is sampled using the HMC algorithm too.
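The sketch below illustrates, in R, how B-spline expansions can encode the shape constraints of (2.2) and how the resulting Poisson log-likelihood, the quantity whose gradients drive the HMC updates, can be evaluated. The exact maps from (β, θ, δ) to the coefficient functions in (2.3)-(2.8) did not survive extraction, so the exp() weights for μ(·) and the plogis() rescaling for a_1(·) below are simplified stand-ins for illustration, not the authors' precise prior construction.

```r
# Schematic sketch of a B-spline parameterization and the Poisson log-likelihood.
# Assumes a count series X; replace the stand-in below with real data.
library(splines)
set.seed(2)
X <- rpois(300, lambda = 12)                      # stand-in count series
T_len <- length(X)
K <- 6                                            # number of B-spline basis functions
tt <- (1:T_len) / T_len
B <- bs(tt, df = K, degree = 3, intercept = TRUE) # T_len x K basis matrix on [0, 1]

# mu(x) = sum_j exp(beta_j) B_j(x) is nonnegative; since the B-spline basis functions
# are nonnegative and sum to at most one, a1(x) = sum_j plogis(theta_j) B_j(x) < 1.
mu_of <- function(beta)  as.vector(B %*% exp(beta))
a1_of <- function(theta) as.vector(B %*% plogis(theta))

neg_loglik <- function(beta, theta, X) {
  lambda <- mu_of(beta) + a1_of(theta) * c(0, X[-length(X)])  # lambda_t, taking X_0 = 0
  -sum(dpois(X, lambda, log = TRUE))
}
neg_loglik(beta = rep(0, K), theta = rep(0, K), X = X)
```

Gradients of this negative log-likelihood with respect to the spline coefficients are what the leapfrog steps of the HMC sampler described above would use.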
As the parameter spaces of the θ_ij's and η_kj's have bounded support, we map any Metropolis candidate falling outside the parameter space back to the nearest boundary point of the parameter space. The leapfrog step is kept fixed at 30 and the step size parameter is tuned to maintain an acceptance rate within the range of 0.6 to 0.8: the step length is reduced if the acceptance rate is less than 0.6 and increased if the rate is more than 0.8. This adjustment is done automatically after every 100 iterations. Due to the increasing complexity of the parameter space in TVBINGARCH, we propose to update all the parameters involved in the a_i(·)'s and b_k(·)'s together.

4 Large-sample properties

4.1 TVAR structure

We start by studying large sample properties of the simpler AR model in (2.1). For simplicity, we fix the order p at p = 1 for this section; however, the results are easily generalizable to any fixed order p. The posterior consistency is studied in the asymptotic regime of an increasing number of time points T. Let κ = (μ, a_1) stand for the complete set of parameters. For the sake of generality of the method, we put priors on K_1 and K_2 with probability mass functions of the form

Π(K_i = k) ∝ b_{i1} exp(−b_{i2} k (log k)^{b_{i3}}),  b_{i1}, b_{i2} > 0, 0 ≤ b_{i3} ≤ 1.

Poisson and geometric p.m.f.s appear as special cases of the above prior density for b_{i3} = 1 or 0, respectively. These priors have not been considered while fitting the model, as that would require a computationally expensive reversible jump MCMC strategy. We study posterior consistency with respect to the empirical ℓ_2-distance on the coefficient functions. The empirical ℓ_2-distance between κ = (μ, a_1) and κ' = (μ', a_1') is given by

d_{1,T}^2(κ, κ') = (1/T) Σ_{t=1}^T [ {μ(t/T) − μ'(t/T)}^2 + {a_1(t/T) − a_1'(t/T)}^2 ].

The contraction rate will depend on the smoothness of the true coefficient functions μ and a and on the parameters b_{13} and b_{23} from the prior distributions of K_1 and K_2. Let κ_0 = (μ_0, a_{10}) be the true value of κ. Assumption (A.3) is imposed to ensure strict positivity of parameters and is standard in the time-varying literature that deals with such constrained parameters.

Theorem 1. Under Assumptions A.1-A.3, let the true functions μ_0(·) and a_{10}(·) be Hölder smooth functions with regularity levels ι_1 and ι_2 respectively. Then the posterior contraction rate with respect to the distance d_{1,T} is

ε_T = max{ T^{−ι_1/(2ι_1+1)} (log T)^{ι_1/(2ι_1+1)+(1−b_{13})}, T^{−ι_2/(2ι_2+1)} (log T)^{ι_2/(2ι_2+1)+(1−b_{23})} }.

The proof is postponed to the supplementary materials. It is based on the general contraction rate result for independent non-i.i.d. observations (Ghosal and Van der Vaart, 2017) and some results on B-spline based finite random series.

Next we discuss the more comprehensive tvINGARCH model (2.9). To maintain simplicity in the proof, we again assume p = 1, q = 1. Similar to the previous subsection, we put priors on the number of B-spline bases K_i with probability mass functions of the above form, with b_{i1}, b_{i2} > 0 and 0 ≤ b_{i3} ≤ 1 for i = 1, 2, 3. Let ψ = (μ, a_1, b_1) be the complete set of parameters. We study the posterior contraction rate with respect to the empirical ℓ_2-distance, defined analogously as

d_{2,T}^2(ψ, ψ') = (1/T) Σ_{t=1}^T [ {μ(t/T) − μ'(t/T)}^2 + {a_1(t/T) − a_1'(t/T)}^2 + {b_1(t/T) − b_1'(t/T)}^2 ].

For this we modify the assumptions as Assumptions (B): there exist constants 0 < M_μ < M_X such that, by recursion and by Assumption (B.2), E(max(y_t, λ_t)) < M_X. Assumption (B.3) is imposed to ensure strict positivity of parameters and is standard in the time-varying literature that deals with such constrained parameters.

Theorem 2.
Under Assumptions B.1-B.3, let the true functions μ_0(·), a_{10}(·) and b_{10}(·) be Hölder smooth functions with regularity levels ι_1, ι_2 and ι_3 respectively. Then the posterior contraction rate with respect to the distance d_{2,T} is

ε_T = max_{i=1,2,3} T^{−ι_i/(2ι_i+1)} (log T)^{ι_i/(2ι_i+1)+(1−b_{i3})}.

The proof follows a similar strategy as in Theorem 1. The details of the proof can be found in the supplementary materials.

In this section, we study the performance of our proposed Bayesian methods in capturing the true coefficient functions. We explore both TVBARC and TVBINGARCH along with some competing models. It is important to note that, to the best of our knowledge, this is the first work on Poisson autoregression with a time-varying link. Thus, we compare our method with existing time-series models with time-constant coefficients for count data and with the time varying AR model with Gaussian errors. We also examine the estimation accuracy of the coefficient functions with respect to the truth. The hyperparameters c_1 and c_2 of the normal prior are all set to 100, which makes the prior weakly informative. The hyperparameter for the Inverse-Gamma prior is d_1 = 0.1, which is also weakly informative. We consider 6 equidistant knots for the B-splines. We collect 10000 MCMC samples and consider the last 5000 as post burn-in samples for inference. In the absence of any alternative method for time varying AR(p) models of count-valued data, we compare the estimated functions with the true functions in terms of the posterior estimates of the functions along with their 95% pointwise credible bands. The credible bands are calculated from the MCMC samples at each point t = 1/T, 2/T, ..., 1. We also compare the different competing methods in terms of the average MSE (AMSE) score, using the INGARCH method of tsglm from the R package tscount, GARMA (also via tscount), tvAR, and our proposed Bayesian methods. The AMSE is defined as (1/T) Σ_t (X_t − λ̂_t)^2, where λ̂_t is the fitted conditional mean. Here, we consider two model settings: p = 1, X_t ∼ Poisson(μ(t/T) + a_1(t/T)X_{t−1}), and p = 2, X_t ∼ Poisson(μ(t/T) + a_1(t/T)X_{t−1} + a_2(t/T)X_{t−2}), for t = 1, ..., T. Three different choices of T have been considered: T = 100, 500 and 1000. The true functions are as follows. We compare the estimated functions with the truth for sample size 1000 in Figures 1 and 2 for the models with p = 1 and p = 2 respectively. Tables 1 and 2 illustrate the performance of our method with respect to the other competing methods. For the tvINGARCH case, we only consider one simulation setting, p = 1, q = 1: X_t ∼ Poisson(μ(t/T) + a_1(t/T)X_{t−1} + b_1(t/T)λ_{t−1}). Two different choices of T have been considered, T = 100 and 200. Figure 3 compares the estimated functions with the truth for sample size 200 for the model in (2.9) with p = 1, q = 1. The performance of our method is compared against the other competing methods in Table 3. Figures 1 to 3 show that our proposed Bayesian method captures the true functions quite well in both simulation experiments. We find that the estimation accuracy improves as the sample size increases. As the sample size grows, the 95% credible bands also get tighter, implying lower uncertainty in estimation. This gives empirical evidence in favor of the estimation consistency, which has also been verified theoretically in Section 4. Even though the credible intervals form a very useful tool for pointwise inference on the time-trajectory of these coefficient functions, we do not report the coverage probability here. The average mean square error (AMSE) is always the lowest for our method in Tables 1, 2 and 3.
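As a hedged illustration of the AMSE comparison summarized in Tables 1-3, the R sketch below simulates from a time-varying Poisson AR(1) process, fits the time-constant INGARCH(1,1) competitor with tscount::tsglm, and reports AMSE = (1/T) Σ_t (X_t − λ̂_t)^2. The coefficient functions are illustrative rather than the exact settings of Section 5, and the second reported value simply evaluates the AMSE at the true conditional means as a reference point.

```r
# Sketch of the AMSE comparison: time-constant INGARCH(1,1) fit vs. the true lambda_t.
library(tscount)
set.seed(3)
T_len <- 500
tt <- (1:T_len) / T_len
mu_true <- 10 + 5 * sin(pi * tt)      # illustrative time-varying intercept
a1_true <- 0.3 + 0.2 * tt             # illustrative time-varying AR(1) coefficient

X <- numeric(T_len)
X[1] <- rpois(1, mu_true[1])
for (t in 2:T_len) X[t] <- rpois(1, mu_true[t] + a1_true[t] * X[t - 1])

# Time-constant INGARCH(1,1) competitor with identity link, as available in tscount.
fit_const <- tsglm(ts(X), model = list(past_obs = 1, past_mean = 1),
                   link = "identity", distr = "poisson")
lambda_true <- mu_true + a1_true * c(0, X[-T_len])
c(AMSE_constant_INGARCH = mean((X - fitted(fit_const))^2),
  AMSE_at_true_lambda   = mean((X - lambda_true)^2))
```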
For the Poisson distribution, the mean and variance are the same. Since μ_0(x) is around 10 for simulation case 5.1, the optimal AMSE will be around 10, which is achieved by our method. Similarly, for simulation case 5.2, the optimal AMSE is expected to be around 25.

We collect the data of newly affected cases for every day from 23rd January to 14th April from an open source platform {https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset}. Table 4 provides a summary of the total affected cases along with the numbers of recovered and deceased for three of the most affected countries along with Hubei, New York City (NYC) and Seoul, South Korea. Seoul is selected in the analysis for its unique approach to curbing the outbreak by implementing an aggressive testing strategy. Hubei, Italy and NYC enforced region-wide strict lockdowns on January 23rd, March 10th and March 20th respectively. The US in general has implemented selective lockdowns across the country. South Korea took an alternative measure, tracking the movements of all affected individuals and conducting tests for COVID-19 for as many countrymen as possible. We fit the TVBARC model in (2.1) with p = 10 and the TVBINGARCH model in (2.9) with p = 1, q = 1 for the selected set of countries. The hyperparameters are the same as in Section 5. We collect 5000 post burn-in samples for inference after discarding the first 5000 MCMC samples as burn-in. We calculate derivatives of the estimated functions using derivatives of B-splines (De Boor, 2001). Note that the confidence bands around the estimated curves provide an uncertainty quantification and allow us to objectively decide on statistically testing certain time-trends. We compile square-root average MSE (AMSE) scores for the different methods in Table 5. One can see that all the time-varying methods do exceptionally better compared to the time-constant methods. This is not surprising since the spread distribution shows significant time-nonstationarity. Interestingly, one of the two Bayesian methods we proposed in this paper stands out as the best fit.

We analyzed three countries, two cities and the earliest epicenter Hubei separately in our analysis. The mechanism of spread in these regions behaves a little differently due to population density, government interference, travel, etc. However, some patterns are evidently similar across all these regions.

Intercept trend: The trend function μ(·) behaves very similarly for the US (Fig 4), Spain (Fig 8) and Italy (Fig 6). This behavior is also prevalent in NYC (Fig 12); however, the same looks somewhat different for Seoul (Fig 14) and very different for Hubei (Fig 10). Since this analysis uses the data from Jan 23 onwards, whereas the first set of infections in Hubei occurred in mid-December, it is natural that the peak occurred sometime earlier, hence the decreasing trend. The slightly bimodal nature for Seoul looks interesting; however, this might be due to the relatively smaller numbers for Seoul. One can also find similarity in the number of days to reach the peak after a strict lockdown has been enforced. Our estimate puts it at around 15-20 days.

AR(1) function: Note that, at a country level, for each of the US, Spain and Italy, the estimated lag 1 coefficient a_1(·) was not large and generally did not vary much over time. However, the pattern is very different for the cities, where this coefficient dominates the other lags significantly. Hubei and Seoul showed a declining curve with periodic peaks; however, for NYC there was an initial decline, possibly due to the low values around February.
But since the beginning of March, a_1(·) grew very steadily, which means the infection was spreading in an exponential fashion; this matches the empirical numbers we observed in this time-frame. The peak for the mean function μ(·) is downward at the present time, but the worrying sign is the upward nature of a_1(·). Interestingly, one can see that since the lockdown was imposed strictly on March 20 in New York, a_1 has declined, and thus it is not unfair to conclude that this measure has helped in curbing the rate of spread. This also explains why we see similar patterns in Hubei and Seoul for the AR(1) function. A natural question is why we do not observe this at a country level: the countries are more heterogeneous, with varying degrees of initial infection and local interference.

Important lag: The TVBARC model for the US and Italy shows that, interestingly, the sixth lag dominates lag 1 or 2 almost uniformly. We think this is an extremely important finding, as it matches medical research on the incubation period of the virus. The number of days required for symptoms to show up after an infection has spread is currently being widely researched, and this incubation period has been proposed to have a median of 6 days and a 98% quantile of 11 days; see Lauer et al. (2020). Our finding is coherent with this. This direction of research could potentially be transformative since, while estimating the basic reproduction number, there is no result, to the best of our knowledge, that could estimate the lag of symptom onset between infector and infectee from the data itself.

TVBINGARCH model: Finally, we conclude our discussion of the real data analysis with a very comprehensive model such as INGARCH. Even for the INGARCH(1,1) model, the single additional recursive parameter b_1(·) in light of model (2.9) allows for an excellent fit without having to find the number of lags up to which one should fit. From Figs 5, 7 and 9, one can see a striking similarity between the three countries US, Spain and Italy for the AR(1) and CH(1) parameter curves. Hubei (Fig 11) and NYC (Fig 13) show similarity, while Seoul (Fig 15) has a different pattern for the CH(1) parameter. Many of these curves have multiple local peaks, which corroborates well with the numbers also fluctuating a little while flattening out. More interestingly, for all 6 figures one can see that the CH(1) parameter currently has an upward trend while the intercept trend μ(·) is going down. This dichotomy is an interesting finding and can be corroborated with the fact that for a Poisson random variable the intensity parameter is both the mean and the variance; while the mean in general is going down, the relative variability for small counts is somewhat higher. We also provide the estimated derivatives of the estimated μ(·) functions in Fig 16. Overall, we believe that our analysis of data from three countries, two cities and Hubei depicts a comprehensive picture of the mechanism of the virus spread.

We propose a time-varying Bayesian autoregressive model for counts (TVBARC) and a time varying Bayesian integer valued generalized autoregressive conditional heteroskedasticity model (TVBINGARCH), with a linear link function within the Poisson distribution, to study the time series of daily new confirmed cases of COVID-19. We develop a novel hierarchical Bayesian model that satisfies the stability condition for the respective time varying models and propose an HMC based MCMC sampling scheme.
We calculate posterior contraction rates for the proposed Bayesian methods. The 'R' function with example code can be found at https://github.com/royarkaprava/TVBARC. Relying on the proposed hierarchical Bayesian model, one can develop a time-varying Bayesian model for positive-valued time-series data too.

We summarize our main findings from the analysis of the COVID-19 datasets here. First, we address the time-varying nature of the dataset, which captures not only how the virus spreads but also the different executive restrictions or government interventions. It is a difficult task to pour in other covariates as, first, it is debatable exactly what to include and, second, different countries, provinces and even residents probably behave differently. To keep the flexibility of how the numbers evolve outside the autoregressive effects, we choose to keep a mean/intercept coefficient μ(·). With that model set-up we analyze many different countries and two cities. We find interesting similarities in how μ(·) behaves over time and see that typically there is a downward trend after a period of around 12 days once lockdown measures have been enforced. However, for the AR(1) coefficient, the trends often show multiple peaks, probably due to the asymptomatic spreading capability of the virus. On the same note, another interesting finding is that the 6th-7th lag turns out to be important in the majority of the cases, conforming to the published facts about the incubation period length of the coronavirus. We also propose an INGARCH model, which is more comprehensive than just an AR model, and observed an interesting phenomenon in both the simulation and the real-data analysis. We saw that from a predictive perspective we cannot say the INGARCH model always dominates an AR model, and also that when we settle for a small order INGARCH model we tend to lose out on the interesting 6th-7th lag phenomenon that is prevalent with this disease for many cities and countries. However, the time-varying INGARCH model can be useful in summarizing the coefficients in a more compact way.

There is growing skepticism about the various findings of R0, the basic reproduction number of the pandemic. These are derived from the popular SIR model, but clearly, due to the hugely non-stationary propagation of the data, they are heavily dependent on the start and end times. Also, the lag of symptom onset between an infector and infectee is difficult to estimate and is often obtained not from the data itself, but from SARS and MERS. We do not think this is a correct approach, since the dynamics of the virus spread for COVID-19 have been different. Thus we offer a different approach rather than having R0 in our model. However, if one wants to make an explicit connection, from the model of Obadia et al. (2012) one can see this is very similar to saying a_i(t/T) = R(t)w_i. Moreover, the interesting finding in our paper allows us to highlight that the weights w_i have a high concentration around lag 6. This alternative way of using the data itself to re-estimate the reproduction number can be transformative. As future work, it will be interesting to include some country specific information such as demographics, geographical area and the effect of environmental time series in the model. These are usually important factors for the spread of any infectious disease. We can also categorize different types of government intervention effects to elaborate more on the specific impacts of the same.
In the future we wish to analyze the number of deaths, the number of recovered cases, the number of severe/critical cases, etc. for this disease, as those will hopefully have different dynamics than the series considered here and can provide useful insights about the spread and the measures required. For computational ease, we have considered the same level of smoothness for all the coefficient functions. Fitting this model with different levels of smoothness might provide more insights. Other than building time-varying autoregressive models for positive-valued data using the hierarchical structure from this article, one interesting future direction is to extend this model to vector-valued count data. In general, it is difficult to model multivariate count data; there are only a limited number of methods to deal with it (Besag, 1974; Yang et al., 2013; Roy and Dunson, 2019). Building on these multivariate count data models, one can extend our time-varying univariate AR(p) to a time-varying vector-valued AR(p). On the same note, even though we imposed the Poisson assumption for increased model interpretation, in light of the upper bounds for the KL distance it is not a necessary criterion, and the approach can be applied to general multiple non-stationary count time series. Extending some of the continuous time-series invariance results to multiple count time series will be an interesting challenge. Finally, we wish to undertake an autoregressive estimation of the basic reproduction number with a time-varying version of the compartmental models in epidemiology.

The likelihood based on the parameter κ is given by P_κ(X_0) Π_{t=1}^T P_κ(X_t | X_{t−1}). Let Q_{κ,t}(X_t) be the distribution of X_t under parameter κ. We provide the following upper bound on the Kullback-Leibler divergence between the parameters:

KL(κ_0^T, κ^T) ≤ Σ_{t=1}^T ∫ KL(P_{κ_0}(X_t | X_{t−1} = y), P_κ(X_t | X_{t−1} = y)) Q_{κ_0,t−1}(y) dy + KL(P_{Q_{κ_0},0}(X_0), P_{Q_κ,0}(X_0)),

where KL(P_{κ_0}(X_t | y), P_κ(X_t | y)) denotes the conditional (on X_{t−1} = y) Kullback-Leibler divergence between the conditional distributions of X_t under κ_0 and κ, and Q_{κ_0,t}(X_t = z) = ∫ P_{Q_{κ_0},0}(X_0) P_{κ_0}(X_t = z | X_{t−1}) Π_{l=1}^{t−1} P_{κ_0}(X_l | X_{l−1}) dX_0 dX_1 ... dX_{t−1}. Take κ close to κ_0 such that KL(P_{Q_{κ_0},0}(X_0), P_{Q_κ,0}(X_0)) is bounded, say by one. Then for large T,

lim_{T→∞} KL(κ_0^T, κ^T)/T ≤ sup_t ∫ KL(P_{κ_0}(X_t | y), P_κ(X_t | y)) Q_{κ_0,t−1}(y) dy.  (8.1)

This result is similar to the first part of Lemma 8.28 of Ghosal and Van der Vaart (2017). We need to show the following claim.

Claim 3. For close κ(·) = (μ(·), a_1(·)) and κ_0(·) = (μ_0(·), a_{10}(·)), we have

sup_t ∫ KL(P_{κ_0}(X_t | y), P_κ(X_t | y)) Q_{κ_0,t−1}(y) dy ≲ ‖μ − μ_0‖_∞^2 + ‖a_1 − a_{10}‖_∞^2.  (8.2)

Proof of Claim 3. To show the above, we first establish an upper bound on the KL divergence between two Poisson densities with mean parameters λ_0 and λ. For some λ* between λ_0 and λ, we have, in light of the mean value theorem,

KL(Poisson(λ_0), Poisson(λ)) = λ_0(log λ_0 − log λ) − (λ_0 − λ) = (λ_0 − λ)(λ_0 − λ*)/λ* ≤ (λ_0 − λ)^2/λ*.

Then the prior probability of an ε_T^2-sized neighborhood around the truth within the sieve W_T can be lower bounded by ε_T^{K_T + J_T}. Also, the ε_T-entropy of the sieve is bounded by (K_{1T} + K_{2T}) log T using Lemma 10.20. Using the same lemma we get the upper bound on the prior probability of the complement of the sieve: it is exp[−b_{12} K_{1T} (log K_{1T})^{b_{13}} − b_{22} K_{2T} (log K_{2T})^{b_{23}}] for K_{1T} and K_{2T}, and for θ it is K_{1T} exp(−bT^2) for some constant b.
To satisfy the conditions from the general theory of posterior contraction, and following the steps given after Theorem 10.21, we calculate ε_T equal to

max{ T^{−ι_1/(2ι_1+1)} (log T)^{ι_1/(2ι_1+1)+(1−b_{13})}, T^{−ι_2/(2ι_2+1)} (log T)^{ι_2/(2ι_2+1)+(1−b_{23})} }.

For the INGARCH case, the likelihood based on the parameter κ is different from the above and is given by P_κ(X_0, λ_0) Π_{t=1}^T P_κ(X_t | F_{t−1}). Let Q_{κ,t}(X_t) be the distribution of X_t under parameter κ. We provide an upper bound on the Kullback-Leibler divergence between the parameters as follows:

KL(κ_0^T, κ^T) = ∫ P_{Q_{κ_0},0}(X_0, λ_0) Π_{t=1}^T P_{κ_0}(X_t | F_{t−1}) log { [P_{Q_{κ_0},0}(X_0) Π_{t=1}^T P_{κ_0}(X_t | F_{t−1})] / [P_{Q_κ,0}(X_0) Π_{t=1}^T P_κ(X_t | F_{t−1})] } Π_{i=0}^T dX_i
= ∫ P_{Q_{κ_0},0}(X_0, λ_0) Π_{t=1}^T P_{κ_0}(X_t | F_{t−1}) Σ_{t=1}^T log { P_{κ_0}(X_t | F_{t−1}) / P_κ(X_t | F_{t−1}) } Π_{i=0}^T dX_i + ∫ P_{Q_{κ_0},0}(X_0, λ_0) Π_{t=1}^T P_{κ_0}(X_t | F_{t−1}) log { P_{Q_{κ_0},0}(X_0) / P_{Q_κ,0}(X_0) } Π_{i=0}^T dX_i
≤ T sup_t E_{F_{t−1}}[ KL(P_{κ_0}(X_t | X_{t−1} = y, λ_{t−1}), P_κ(X_t | X_{t−1} = y, λ_{t−1})) ] + KL(P_{Q_{κ_0},0}(X_0, λ_0), P_{Q_κ,0}(X_0, λ_0)),

where KL in the first term denotes the conditional (on X_{t−1} = y and λ_{t−1}) Kullback-Leibler divergence between the conditional distributions of X_t under κ_0 and κ. Take κ close to κ_0 such that KL(P_{Q_{κ_0},0}(X_0), P_{Q_κ,0}(X_0)) is bounded, say by one. Then for large T,

lim_{T→∞} KL(κ_0^T, κ^T)/T ≤ sup_t E_{F_{t−1}}[ KL(P_{κ_0}(X_t | X_{t−1} = y, λ_{t−1}), P_κ(X_t | X_{t−1} = y, λ_{t−1})) ].

This result is similar to the first part of Lemma 8.28 of Ghosal and Van der Vaart (2017). We need to show the claim below and consider a similar sieve. Following the steps of Theorem 1, we calculate the posterior contraction rate ε_T equal to

max{ T^{−ι_1/(2ι_1+1)} (log T)^{ι_1/(2ι_1+1)+(1−b_{13})}, T^{−ι_2/(2ι_2+1)} (log T)^{ι_2/(2ι_2+1)+(1−b_{23})}, T^{−ι_3/(2ι_3+1)} (log T)^{ι_3/(2ι_3+1)+(1−b_{33})} }.

Thus, putting λ_0 = μ_0(t/T) + a_{10}(t/T)y and λ = μ(t/T) + a_1(t/T)y, we have the following upper bound for the left hand side of (8.2):

sup_t ∫ KL(P_{κ_0}(X_t | y), P_κ(X_t | y)) Q_{κ_0,t−1}(y) dy ≲ sup_t ∫ [ 2{μ(t/T) − μ_0(t/T)}^2 + 2{a_1(t/T) − a_{10}(t/T)}^2 y^2 ] / [ μ_0(t/T) + a_{10}(t/T)y ] Q_{κ_0,t−1}(y) dy.

In the above derivation, we have used the closeness of κ(·) = (μ(·), a_1(·)) and κ_0(·) = (μ_0(·), a_{10}(·)) multiple times, both directly and in conjunction with Assumption A.3 to imply inf_t a_{1*}(t/T) > ρ. Due to the time varying nature of the coefficients with an AR(1) structure, we could not bound the above KL directly using Lemma 2.9 of Ghosal and Van der Vaart (2017) type results that are used for nonparametric Poisson models. Thus, we use Assumption A.3 to tackle this complicated structure.

Proceeding with the rest of the proof of Theorem 1, we use standard results on B-splines. Consider the following sieve for the parameter space. Then the ε_T-entropy of the sieve is bounded by a constant multiple of (K_{1T} + K_{2T}) log T. The priors on K_1 and K_2 satisfy condition A1 from Chapter 10.4 of Ghosal and Van der Vaart (2017), and the induced prior on α is log-normal, which satisfies A2 and A3. Hölder smooth functions with regularity ι can be approximated uniformly up to order K^{−ι} with K many B-splines. Thus we need ε_T ≳ max{K_{1T}^{−ι_1}, K_{2T}^{−ι_2}}. Now we use Lemma 10.20. As the overall concentration cannot be better than T^{−1/2}, we assume log(1/ε_T) ≍ log T.

Claim 4. For close κ(·) = (μ(·), a_1(·), b_1(·)) and κ_0(·) = (μ_0(·), a_{10}(·), b_{10}(·)), we have the analogue of (8.2) with an additional ‖b_1 − b_{10}‖_∞^2 term.

Proof of Claim 4. Note that it is easy to prove that, under Assumption B.3, we have E(max(X_{t−1}, λ_{t−1})) < ∞. Next we use the same upper bound on the KL divergence between two Poisson densities with mean parameters λ_0 and λ: for some λ* between λ_0 and λ, we have, in light of the mean value theorem, KL(Poisson(λ_0), Poisson(λ)) ≤ (λ_0 − λ)^2/λ*. Thus, putting λ_0 = μ_0(t/T) + a_{10}(t/T)y + b_{10}(t/T)λ_{t−1} and λ = μ(t/T) + a_1(t/T)y + b_1(t/T)λ_{t−1}, we prove Claim 4 by establishing the corresponding upper bound on the term before E_{F_{t−1}} in (8.5). In this derivation, we have used the closeness of κ(·) = (μ(·), a_1(·), b_1(·)) and κ_0(·) = (μ_0(·), a_{10}(·), b_{10}(·)) multiple times, both directly and in conjunction with Assumption B.3 to imply inf_t a_{1*}(t/T) > ρ and inf_t b_{1*}(t/T) > ρ. Due to the time varying nature of the coefficients with an AR(1) structure, we could not bound the above KL directly using Lemma 2.9 of Ghosal and Van der Vaart (2017) type results that are used for nonparametric Poisson models. Thus, we use Assumption B.3 to tackle this complicated structure.

References
Poisson QMLE of count time series models
Infection with Middle East respiratory syndrome coronavirus. Canadian Journal of Respiratory Therapy: CJRT = Revue canadienne de la therapie respiratoire: RCTR
Regression splines in the time-dependent coefficient rates model for recurrent event data
Spatial interaction and the statistical analysis of lattice systems
Bayesian varying-coefficient models using adaptive regression splines
Discrete-valued ARMA processes. Statistics & Probability Letters
A linear Poisson autoregressive model: the Poisson AR(p) model
Functional-coefficient regression models for nonlinear time series
Monte Carlo EM estimation for time series models involving counts
Bayesian estimation of the basic reproduction number in stochastic epidemic models
Statistical inference for time-varying ARCH processes
Bayesian quantile regression using random B-spline series prior
Observation-driven models for Poisson counts
A time series method to analyze incidence pattern and estimate reproduction number of COVID-19
Statistical methods with varying coefficient models
Integer-valued GARCH process
Estimation and prediction of time-varying GARCH models through a state-space representation: a computational approach
A unified view on Bayesian varying coefficient models
Normalized least-squares estimation in time-varying ARCH models
Subhashis Ghosal and Aad Van der Vaart. Fundamentals of Nonparametric Bayesian Inference
Smoothing spline ANOVA with component-wise Bayesian "confidence intervals"
Varying-coefficient models
Monotone B-spline smoothing
Functional coefficient regression models for nonlinear time series: a polynomial spline approach
Varying-coefficient models and basis function approximations for the analysis of repeated measurements
Bayesian analysis for emerging infectious diseases
Optimal Gaussian approximation for multiple time series
Simultaneous inference for time-varying models
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
Constrained penalized splines
Bayesian inference in nonparanormal graphical models
MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo
The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks
Nonparametric graphical model for counts
High dimensional single-index Bayesian modeling of brain atrophy
Adaptive Bayesian procedures using random series priors
Bayesian GARMA models for count data
An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in China. medRxiv, 2020
Bayesian time-varying autoregressive model for counts
On Poisson graphical models
Bayesian adaptive smoothing splines using stochastic differential equations
A regression model for time series of counts
A negative binomial integer-valued GARCH model
Modeling time series of counts with COM-Poisson INGARCH models. Mathematical and Computer Modelling
Modeling overdispersed or underdispersed count data with generalized Poisson integer-valued GARCH models
Zero-inflated Poisson and negative binomial integer-valued GARCH models