key: cord-0139161-p7k3p9fq
authors: Kobayashi, Hisashi
title: Stochastic Modeling of an Infectious Disease, Part I: Understand the Negative Binomial Distribution and Predict an Epidemic More Reliably
date: 2020-06-02
journal: nan
DOI: nan
sha: 69a5540f2081c512fee25d58a4993946ff3f9b0d
doc_id: 139161
cord_uid: p7k3p9fq

Why are the epidemic patterns of COVID-19 so different among different cities or countries which are similar in their populations, medical infrastructures, and people's behavior? Why are forecasts or predictions made by so-called experts often grossly wrong, concerning the numbers of people who get infected or die? The purpose of this study is to better understand the stochastic nature of an epidemic disease, and answer the above questions. Much of the work on infectious diseases has been based on"SIR deterministic models,"(Kermack and McKendrick:1927.) We will explore stochastic models that can capture the essence of the seemingly erratic behavior of an infectious disease. A stochastic model, in its formulation, takes into account the random nature of an infectious disease. The stochastic model we study here is based on the"birth-and-death process with immigration"(BDI for short), which was proposed in the study of population growth or extinction of some biological species. The BDI process model ,however, has not been investigated by the epidemiology community. The BDI process is one of a few birth-and-death processes, which we can solve analytically. Its time-dependent probability distribution function is a"negative binomial distribution"with its parameter $r$ less than $1$. The"coefficient of variation"of the process is larger than $sqrt{1/r}>1$. Furthermore, it has a long tail like the zeta distribution. These properties explain why infection patterns exhibit enormously large variations. The number of infected predicted by a deterministic model is much greater than the median of the distribution. This explains why any forecast based on a deterministic model will fail more often than not.

Most of the mathematical models of infectious diseases seem to be based on the Kermack-McKendrick model published in 1927 [2] , which was proposed to explain the rapid rise and fall in the number of infected population observed in epidemics such as the great plague in London where more than 15% of the population died (1665-66); and the cholera outbreak in London caused by contamination in the Thames River (1865), and the plague epidemic in Bombay (1906) [3] . The model consists of a system of three coupled nonlinear ordinary differential equations for the infected population I(t), the susceptible population S(t), and the recovered population R(t). Kermack-McKendrick's SIR model is a deterministic model, which provides the expected values of these processes, which we denote as S(t)(= E[S(t)], I(t)(= E[I(t)] and R(t)(= E[R(t)]). A majority of biological and epidemiological models [4, 5, 6] fall in this class of deterministic models.

Actual observed data of the infected population, for instance, is merely an instance or a sample path of this stochastic process I(t). The process naturally deviates from the expected value I(t) obtained by a deterministic model. Thus, a deterministic model alone fails to provide any quantitative explanation when observed data differ significantly from the expected value.

In a stochastic (or probabilistic) model, on the other hand, the intrinsic stochastic nature of a process is explicitly taken into account in its model formulation. The importance of stochastic processes in relation to problems of population growth was pointed out by W. Feller in 1939 [7] . He considered the birth-and-death process in which the expected birth and death rates (per person per unit time) were constants, say, λ and µ. D. G. Kendall [8] extended Feller's birth-and-death (BD) process by considering the birth and death rates as any specified functions of the time t, λ(t) and µ(t). The BD process is a special class of time-continuous discrete-state Markov process, and has found applications in many scientific and engineering fields, including population biology [9] , teletraffic and queueing theory [10] , [11] , [12] , system modeling [13] [14] , pp. 63-94, [15] , pp. 407-410.

I have done some investigation, with help from Prof. Hideaki Takagi, whose unpublished lecture note at the University of Tsukuba [16] provided me with several references, as to who coined the term "birth-and-death process with immigration," and have found the English statistician M.S. Bartlett (1910 Bartlett ( -2002 in his famous book "An Introduction to Stochastic Processes" [17] (1st edition in 1955) in his discussion of the birth-and-death process in "Section 3.4 Multiplicative chain: subsection 3.4.1 "The effect of immigration,"that he uses the phrase "a birth-death-and-immigration process." But Bartlett's doctoral student, David G. Kendall (1918 Kendall ( -2007 gives a detailed analysis of the BDI process in his 1949 article [18] , which Bartlett refers to in his 1949 article [19] . So my tentative conclusion was that Kendall was the first that worked on the BDI although he did not use the term "birth-death-immigration" or something to that effect.

Another English statistician, Norman T. J. Bailey published in 1964 "The Elements of Stochastic Process with Applications to the Natural Sciences," [20] , and in "Section 8.7: The effect of immigration" (pp. 97-101), he gives a thorough treatment of the BDI process. Linda J. S. Allen, "An Introduction to Stochastic Processes with Applications to Biology," (2nd Edition, 2011) [21] discusses the BDI process in Section 6.4.4: Simple Birth and Death with Immigration (pp. 254-258), but her focus seems to be more on the stable case. Frank P. Kelly [12] gives a brief treatment, providing the steady-state distribution. All other numerous textbooks on random processes make no mention of the BDI process, and even the above handful of authors who might have had epidemiologists in mind among their readership seem to treat the BDI process for its possible application to population biology, and none allude to its use in epidemiology.

1. At any time t, an individual is either susceptible (S ), infected and infectious (I ) or recovered and immune (R).

2. Only susceptible individuals can get infected, remain infectious for some time, and recover and become completely immune.

3. There are no births, deaths, immigration or emigration during the study period. In other words, the community is closed.

Consequently, individuals can only make two types of transitions: (i) from S to I, and (ii) from I to R. Thus, the Kermack-Mckendrick model is often referred to as an SIR epidemic model. A model which assumes no immunity (i.e., a recovered person becomes immediately susceptible) is called an SIS model. If we explicitly consider an exposed state, during which an infected individual is not yet infectious, the model is called an SEIR model. A model in which immunity wanes after some period is called an SIRS model, and so forth. In the remainder of this section we give a brief account of the SIR model in a closed community so that the reader can compare this deterministic model to our stochastic model to be presented in the next section. For details of the Kermack-McKedrick type deterministic models, the readers are referred to abundant books and articles; Anderson and May, [5] , Martcheva [6] to name just a few.

Let S(t), I(t) and R(t), respectively 2 , denote the number of the susceptible, infected and recovered at time t. Since we assume no births, deaths, immigration nor emigration, we have

where N is a constant number, representing the population of the community. From the set of assumptions stated above, the deterministic processes can be defined by the following set of three differential equations:

These differential equations, together with (1) and the initial condition

define the deterministic model. It is easy to see that S(t) is monotone decreasing, and R(t) is monotone increasing. The function I(t) increases or decreases at time t, depending on whether the ratio R t defined by the following expression is greater or smaller than unity.

Its initial value R 0 is referred to as the basic reproduction number , a term having its origin in demography. The ratio R t is called the effective reproduction number. or the real-time reproduction number and is more meaningful than R 0 , which is a static number, in estimating the current epidemic situation and making a decision to control the epidemic.

The term βS(t)I(t) in (2) and (3) comes from the argument that the susceptible must have contact with the infected in order to get infected, and if we assume some sort of uniform mixing, the infections should occur at a rate proportional to S(t)I(t). Consequently, the unit of the parameter β is [person/unit-time/person·person], whereas the other parameter 3 µ has the unit of [person/unittime/person].

When the total population N of (1) is sufficiently large, S(t) ≈ N is much larger than I(t) and can be treated as unchanged, at least in the initial phase of an epidemic. Then by setting

and by substituting this into (3), we have the following ordinary differential equation (ODE):

from which and the initial condition (5), we readily find the solution for I(t):

which is an exponentially growing or decaying function, depending on whether a > 0 or a < 0. When a = 0, it is a constant I 0 for all t ≥ 0. When the infected I(t) grows to the extent that the approximation (7) no longer holds, i.e., the "infinite population" assumption fails, we have to deal with the nonlinear differential equations of (2) and (3). A major advantage of the SIR model is that because of the product term S(t)I(t), the differential equations take into account explicitly the fact that occurrences of infections will gradually decreases towards to zero as the susceptible population becomes extinct towards the end of the infection processes. The main drawback of the SIR model, on the other hand, is its inability to capture any probabilistic fluctuation of the infection process. The SIR model may be an appropriate model in describing the interactions between the susceptible group and infected group in a closed environment, such as a hospital, retirement home, cruise ship, night club, etc. But it is a poor model in describing a major outbreak of an epidemic in a larger environment such as a city or a country, where most infections take place independently and randomly, and the product term βS(t)I(t) does not have any significant meaning. Furthermore, the product term makes the entire SIR model a nonlinear system, and makes the system mathematically intractable, except for a few simple cases, which may not be useful in reality.

As we will show in this article and Part II [1] , the deterministic model, in addition to being unable to describe the stochastic fluctuation of an epidemic pattern, is more likely to grossly overestimate the number of casualties. Thus, the deterministic model is not only limited in its applicability, but can be damaging and harmful in some cases.

In this and following sections we will discuss our stochastic model based on the birth-and-beath with immigration (BDI) process. It is a special case of general birth-and-death (BD) process 4 . Before we present a detailed mathematical analysis of this model, we will show an example of our simulation model based on the BDI process. Figure 1 shows the first 6 out of a total of 12 simulation runs consecutively done in one execution of our simulation program in a MATLAB script. Our simulator is based on the event scheduling approach 5 , where events are arrivals of infected persons from the outside (at rate ν persons/day), occurrences of secondary infections within the community (at rate λ infections/day/infectious person), and recovery/deaths of the infected persons (at rate µ recovery/death/day/infected person). The parameter values of λ = 0.3, µ = 0.1 and ν = 0.2 are assumed. Figure 2 plots the same set of curves in a semilog scale. The exponential growth curves are shown as straight-lines; the initial part of the simulation is more clearly shown in the semilog scale. Figures 3 and 4 are the plots of the remaining six runs.

There are at least two questions concerning these simulation runs.

1. Why are the variations among different simulation runs so large?

2. There are more runs whose plots are below the deterministic model curve. Does the deterministic model tend to overestimate the size of the infected population? If so, why?

The analysis of the BDI process in the following sections should be able to answer these questions.

Let I(t) represent the number of infected persons at time t, and P n (t) be the time-dependent (or transient) probability mass function (PMF) 6 of the process I(t), i.e., P n (t) = Pr[I(t) = n], n = 0, 1, 2, · · · , and t ≥ 0.

We assume that each infected person is infectious, and infects susceptible persons at rate λ [persons/unit-time/person], where the time unit can be arbitrary, e.g., a second, an hour, a day, etc. Let us assume that an infected person recovers, gets removed or dies at rate µ/[unit-time]. Thus, µ −1 [unit-time] is the mean infectious period. The ratio λ/µ is equal to the basic reproduction 4 William (Willy) Feller introduced what is now known as the "birth-and-death" process in his 1939 article [7] published in German regarding the problem of population growth. He used the term "Vermehrung" (reproduction) and "Tod" (death). David. G. Kendall in his 1948 paper [8] referred to Feller's model as a "birth-and-death" process. Feller used this term in his Volume I [9] , whose first edition was in 1950. 5 The time-asynchronous event scheduling approach is a more time efficient and accurate simulation method than a time-synchronous approach. See e.g., [13] pp. 230-234, or [14] , pp. 626-630. 6 We could use perhaps a more common term the probability distribution function but PMF is more explicit that we are dealing with a discrete distribution, not a continuous distribution. number, i.e., the mean number of infections caused by an infected person during the infectious period.

We can formulate an infectious disease as a birth-and-death (BD) process, by defining the birth and death rates both of which are simple linear functions of the state n of the process I(t):

A few remarks are in order. This particular state-dependent BD process is also known as the "birthdeath-immigration (BDI)" process (see e.g., [12] , p. 14 for the steady state distribution), in which the parameters λ, µ and ν represent the birth (i.e., secondary infection), death (i.e., recovery or death) and immigration (i.e., arrival of an infected individual from outside) rates, respectively. A few remarks are in order.

1. In actuality, "recovery," "removal" and "death" are distinctly different matters. In analyzing the infection process, however, these three sources of loss from the susceptible or infected population, are mathematically equivalent in the sense they will not contribute to the infection process in the future. We assume here that those who have recovered from the disease have acquired immunity and will not be susceptible nor infectious.

The assumption that each infected individual recovers (or is removed or dies) at rate µ is equivalent to assuming that the duration S that each sick person remains infectious is exponentially distributed with mean 1/µ, that is:

2. It can be shown mathematically that many of our results to be obtained in terms of the probability mass function (PMF) of I(t), and other related quantities are insensitive to the actual distribution of S . All that matters is that we set

3. In the present paper, we assume that the population is homogeneous, and the susceptible population size remains sufficiently large, thus mathematically treated as "infinite." Furthermore, the parameters λ, µ and ν are assumed to be constant. Many of our results can be extended to the case of multiple types of populations (e.g., clustering of infections): a model with the susceptible population decreases as some of them get infected; and the case where the model parameters' values change (e.g., the situation where the infectious rate λ may change as people's behavior changes), and these generalized models will be discussed in subsequent reports.

We can show (see e.g., [15] , pp.407-408) that the PMF (10) of this BD process should satisfy the following set of linear differential-difference equations, a.k.a. Kolmogorov's forward equation:

with the initial condition

where δ m,n is Kronecker's delta. We transform the above set of infinitely many equations (13) into a single equation by using the probability generating function (PGF) (see e.g., [15] , p. 402) defined by

Multiply the set of equations (13) by z n and sum them from n = 0 to ∞, obtaining the following partial differential equation:

with the boundary condition

Although the process I(t) is the main focus of our analysis, it will be worthwhile to introduce related processes and our assumptions.

Definition 1.

is the cumulative count of external arrivals of infectious individuals from the outside. We assume that A(t) is a Poisson process with rate ν [persons/unit time], 2. The process B(t) is the cumulative count of internally infected individuals. We assume that the birth of such persons occurs at the rate of λ [persons/unit time/infectious person].

3. The process R(t) is the cumulative count of recovered/removed or dead individuals. We assume that the departure of such persons occurs at the rate of µ [persons/unit time/infected person]. Note that all infected persons are infectious persons until their recovery/removal/death.

is the present number of infected persons, i.e.,

The expectation and variance of the above processes will be of our interest, which we denote by

Before we discuss how to find the PGF G(z, t) from the PDE (16), let us derive first an ordinary differential equation for I(t). By dividing both sides of (16) by (z − 1)G(z, t), we will have

By setting z = 1, we find 7

where, on the LHS 8 , we first set z = 1 (which corresponds to differentiation at z = 1), and use L'Hôpital's rule, obtaining 9 E[I(t)] = I(t). The ordinary differential equation (22) can be solved, yielding

If the model parameters are set to new values, say, to λ , µ and ν at some point t = t 1 ≥ 0, then the solution I(t) for t ≥ t 1 is given by

It should be clear that I(t) diverges to infinity, if a > 0 in (23) and converges to ν/|a| in the limit t → ∞ if a < 0. Similarly, in (24), the process converges to ν /|a |, if a < 0. If a = 0, then 10

In Figure 5 we show the case where λ = 0.3, µ = 0.1 and ν = 0.2 and at t 1 = 30, a new parameter λ = 0.06 is set, whereas the original values of µ and ν are retained.

The mean values of other processes can be easily found. A(t) is a Poisson process with rate ν, which implies

Since each person in the infected population I(t) infects others at the rate of λ persons/unit time, the differential of B(t) is given by

From this and (23) we obtain

7 Alternatively, we can obtain this differential equation directly, by multiplying each equation in (13) by n and summing them up from n = 0 to infinity. 8 The abbreviations LHS and RHS mean the left-hand side and right-hand side, respectively. 9 Here we use an important property of PGF, i.e, ∂G(z,t) ∂t = E[I(t)z I(t)−1 ], and by setting z = 1, the RHS becomes E[I(t)]. We changed the order of differentiation w.r.r. to z and t, which can be justified because the function G(z, t) is an analytic function, i.e., it is continuous and differentiable everywhere. 10 The second term becomes 0/0, so we apply L'Hôpital's rule.

For t ≥ t 1 , we find

which takes the same form as that for t ≤ t 1 . It should be worthwhile to note that both (28) and (29) could be derived from the mean value of the identity equation (18) together with the relation

which is evident as shown below. The recovery process R(t) should satisfy the following differential equation, similar to (27):

Thus, it readily follows:

The above expressions for I(t) and other processes can be viewed as our deterministic (or non-probabilistic) model for the dynamics of the BDI process. From these simple expressions, we can extract a few important characteristics concerning the mean values of I(t) and others.

1. a = λ − µ determines the exponential growth or decay rate of B(t), R(t) as well as I(t).

2. ν is merely a linear scaling factor for I(t) and other processes, and so is I 0 , the initial number of the infected. 11 3. If a > 0, then I(t) grows exponentially without bound; if a < 0, it decays exponentially towards ν/|a|. If a = 0, I(t) = I 0 + νt, i.e., I(t) grows linearly.

4. The ratio of the infection rate (or reproduction rate) λ to the recovery or removal rate µ

is called the basic reproduction number in epidemiology (see e.g., [6] , p. 21). The term was originally defined in the context of a deterministic model called the SIS(Susceptible-Infected-Susceptible) epidemic model. It is the average number of persons whom an infectious person infects before his/her recovery, removal, or death. The reproduction number determines whether the infection will grow exponentially, die out, or remain constant, depending on whether R 0 > 1, R 0 < 1, or R 0 = 1, respectively. The exponential parameter a can be expressed in terms of R 0 and µ: 11 Eq.(23) can be rewritten as I(t) = I0 + ν a e at − ν a . So ν a + I0 is the multiplying coefficient of the exponential term e at . 5. The amount of time T that takes for I(t) or other related quantities to double, and thes exponential growth parameter a are related by e aT = 2, or equivalently aT ≈ 0.693.

Note that both the integral and derivatives of the exponential function e at are also ∝ e at . Thus, the above formula for T equally applies, when cumulative numbers or incremental numbers are to be counted for a given a via an observed T .

6. Unless we can expect to increase the value of µ by improving the medical service or producing an effective vaccine to immunize the susceptible population, the only options we have for controlling an infectious disease is to increase µ by removing as many infectious individuals away from susceptible population as possible, and/or to decrease λ by increasing the socalled social distances between the susceptible and the infectious. We will provide an illustrative example in Part II [1] . 

So far we have discussed only the mean values of the random process I(t) and other processes. Before we find the probability mass functions P n (t), n = 0, 1, 2, · · · for any t, we obtain in this section the steady-state distribution lim t→∞ P n (t) = π n , if it exists. We know already that when a > 0, such distribution does not exist. Thus, the steady state distribution can possibly exist, only when a ≤ 0. In order to find it, we set the LHS of the PDE (16) equal to zero, obtaining the following ordinary differential equation for the PGF G(z, ∞):

which readily leads to

Integrating the above and using the boundary condition G(1, t) = 1 for any t, we find

This PGF reminds us of the negative binomial distribution (NBD). This distribution was originally introduced to express the probability of the number of failures n needed to achieve r successes in a sequnce of Bernoulli trials, when the probability of failure is q.

Definition 2 (Negative binomial distribution (NBD)). Negative binomial distribution NB(r, q) is defined by

where the parameter r is a positive real number.

When r is a positive integer, and 0 < q < 1 the above reduces to the classical definition of the (shifted) 12 negative binomial distribution, sometimes called the Pascal distribution, associated with Bernoulli trials. The Gamma function Γ(x) is defined for a positive real number x by 13

The probability generating function of NB(r, q) of (39) is given by

The mean and variance of a RV (random variable) X possessing this distribution can be readily found:

From (38) and the above formula, we readily find the steady state distribution of I(t) for a < 0 is given as follows [12] ,p.14.

4 Time-Dependent Probability Distribution of the Infected Process I(t)

The partial difference equation (PDE) (16) for the PGF G(z, t) is a linear PDE and is sometimes referred to as planar differential equation (see [22] , [14] , pp. 600-605). This type of PDE can be solved by using Lagrange's method with its auxiliary differential equations, which is discussed in Appendix A. 14 12 Some authors define the negative binomial distribution as the distribution of the number of trials, instead of the number of failures, needed to achieve r successes. Under this definition, n = r, r + 1, r + 2, · · · . The probability distribution (39) is then referred to as the shifted negative binomial distribution (see e.g., [15] , pp. 59-62). 13 This definition can be extended for a complex number z, with (z) > 0. 14 It will be worth noting that Ren and Kobayashi [23] , [24] discuss this type of PDE in the analysis of multiple on-off sources in traffic characterization of a data network. 4.1 When the system is initially empty, i.e., I(0) = 0

When the system is initially empty, i.e., I(0) = 0, its PGF can be found by solving the PDE given by (see (A.20) 

If we define a function β(t)

we can write (44) compactly as

which is the PGF of the generalized negative binomial distribution NB(r, β(t)), defined in Definition 2. Thus, we find that the PMF of the BDI process I(t) is given by

When a = 0, the above expression can be simplified. Noting

we find

which lead to

and P n (t) = n + r − 1 n 1 µt + 1 r µt µt + 1 n .

(51)

The mean and variance can be computed from the distribution function obtained above. But we already know the expression for the mean I BDI (t) from (23) , which can be also found from the formula of the NB(r, q, i.e.,

which certainly agrees with (23) . Similarly, the variance can be found from the formula of NB(r, q) as

The reason why this simple innocent-looking equation is put in a box is because this expression for the variance is perhaps the most important feature behind the erratic behavior we observe in the COVID-19, or an infectious disease in general, as we analyze below and demonstrate by presenting simulation results in Part II.

Up to now, we have focused on the case when the system is initially empty. In this section, we wish to study a general case where I(0) = I 0 is an arbitrary non-negative integer. The PGF solution for this general case is, in fact, given in (A. 19) of Appendix A. In this section we will add to the PGF and PMF of this general case a subscript or superscript BDI:I 0 wherever we need to distinguish similar functions of other processes. So we can write

where

where the first given by (55) is the same as (44) discussed in a preceding section, whereas the second one (56) is the PGF of the birth-and-death process, obtained by setting ν = 0 in the PGF (A. 19 ). Thus, we see that the BDI:I 0 process can be expressed as a sum of two statistically independent processes:

I BDI:I 0 (t) = I BDI:I 0 =0 (t) + I BD:I 0 (z, t).

It can be further shown that the the birth-and-death process can be decomposed into two processes: the the pure-birth process and a generalized binomial distributed process, denoted I GB:p(t) (t), where p(t) is given by (87). Thus,

and the PGFs of these processes can be expressed as G BD:I 0 (z, t) = G N B(I 0 ,β(t)) (z, t) G GB:I 0 (z, t)

where

5 Important Properties of the Negative Binomial Distribution (NBD)

Let us further examine properties of a random variable X of NB(r, q) defined by its PMF (39) and PGF (41). Its mean and variance are given in (42). The mode of the this distribution is given by

Recall that the parameter r in our problem is the ratio of ν [person/unit-time] (the rate with which an infected person arrives in the community in question) to λ [person/unit-time/person] (the rate with which an infectious person infects susceptible people. As long as tight security measures are enforced at the boundaries of the community, ν is kept small. Thus, in our problem of practical interest, r < 1, or even r 1. Figure 6 shows the (shifted) negative binomial distribution (NBD) for p = 1 − q = 0.5 and for various values of r = 0.5, 1, 2, 4, 8, 16, 32.

As we can see in Figure 6 , the PMF with r = 0.5 falls off rapidly by around n ≈ 5. For q = 0.5 and r = 0.5, we compute the first several values of P n : P 0 = 0.7071, P 1 = 0.5303, P 2 = 0.3315, P 3 = 0.1934, P 4 = 0.1088, P 5 = 0.0598, · · · , P 10 = 0.00256.

(63)

In order to see more clearly how the PMF P n looks for small r, let us rewrite (39) as follows:

The leading term k(r) does not depend on n. So we can write

In the limit r → 0, the shape of P n will become

Recall the following Maclaurin series expansion

from which we obtain

Thus, we find (66) can be written as

This distribution is called the logarithmic distribution (a.k.a. logarithmic series distribution), which was introduced by R. A. Fisher in 1943 [25] . The PGF of (69) is given, using (67) once again, by

In our case the probability q is given by β(t) of (45):

Thus, for sufficiently large t, the parameter q is very close to one, thus the term q n is a slowly decreasing geometric series. For small r, the coefficient is also slowly decaying. Thus, for small r and q close to 1, the NBD is a very slowly decaying distribution. having a long tail, similar to Zipf 's law or the zeta distribution with small power exponent α (see e.g., [15] , pp. 62-64).

Example 2: Consider the same set of model parameters used in Example 1, i.e., λ = 0.3, µ = 0.1 and ν = 0.2. Then r = ν λ = 2 3 , and a = λ − µ = 0.2. In Figure 7 through Figure 12 , we show the PMF P n (t) (47) of the infected population I(t) at t = 1, 5, 10, 20, 30 and 50. In all these cases, I 0 = 0, i.e., there are no infected individuals at t = 0.

One useful method to see clearly the tail end of the distribution function is to plot the log-survival function, log 10 (1−F n ), where F n is the cumulative distribution function (CDF), i.e., F n = n i=0 P n (t), and its complement 1 − F n is called the survival (or survivor) function. 15 In Figure 13 and Figure 14 , we show the log-survivor functions of the distribution of I(t) at t = 10 and t = 30. We confirm our earlier observation that the NBD behaves similar to a geometric distribution, which is a straight line in the log-survivor plot.

As we observed in Example 5.1, the PMF P n (t) looks similar to a slowly decaying geometric distribution for modest values of t, with a long tail, which gives a large variance. An often used measure of dispersion is the coefficient of variation (CV). From (53) we can derive the following expression:

from which we find the CV at time t of the BDI process is given by

It is evident that β(t) quickly approaches unity from below, as t increases 16

Thus, we have arrive at the following proposition regarding the CV of the BDI process:

Proposition 1. The coefficient of variation of the BDI process rapidly converges to a constant √ r −1 for all t such that (λ − µ)t ≥ 7: So, we have confirmed that the coefficient of variation is bounded from below by √ r −1 =1.225. 15 The term "survival (or survivor) function" is often used in reliability theory. If a continuous variable X represent the "life" of a human, or a product (e.g., electric bulb), with its probability density function (PDF) fX (x) and cumulative distribution function (CDF) FX (x) = t 0 fX (s) ds , the survival function is defined by 1 − FX (x). By plotting log(1 − FX (x) vs. x, we find more clearly, how the remaining life will behave than a regular plot of fX (x) or FX (x). See e.g., [15] pp. 144-146. 16 When t is such at = 7, for instance, will make β(t) within around 2% off from the unity. In our example of a = 0.2[/day], t = 23[days] will make β(t) practically equal to unity.

In the previous section we gave a formal definition of the generalized negative binomial distribution, by allowing the parameter r to be a positive real number. We now want to generalize the binomial distribution B (N, p) , by allowing N to be a real number (positive or negative), and p be any real number, positive or negative. Thus, we cannot assign any probabilistic interpretation.

Definition 3. Consider a generating function of z, which takes the form for real numbers, p and α.

Then its inverse transform

is called a generalized binomial distribution, denoted GB(n, p).

There is no guarantee that each f n non-negative, let alone 0 ≤ f n ≤ 1, However, they add up to unity, because G(1) = 1. The main reason why we wish to generalize the binomial distribution is for the convenience of computing P n (t) for the BDI process and other birth-and-death processes (without immigration) discussed.

Proposition 2 (A generalized binomial distribution associated with a (generalized) negative binomial distribution). For a given (generalized) negative binomial distribution, whose PGF is given by

with the PMF

its associated generalized binomial distribution is given by

where p and q satisfy the relation

Proof. First we compare the generalized binomial coefficients

We also find

Thus, we have shown the equivalence between (79) and (80).

An alternative way to prove this equivalence is to write the PGF (78) using p. defined by (81):

Recall Newton's generalized binomial formula:

Apply this formula to the RHS of (84), by setting a = −r and t = pz 1−p . The coefficient of the z n term is given by f n of (77) Now let us revisit the PGF (44), which we can write as

where

By referring to Proposition 2 we find that p(t) and β(t) are related by

Thus, by applying Newton's formula, we can obtain the following expression for P n (t):

P n (t) = −r n (1 − p(t)) n p(t) −r−n , n = 0, 1, 2, · · · .

Using the binomial coefficient formula and the equivalence (83), i.e.,

we see that P n (t) given by (89) is equivalent to (47).

In this section we will investigate how to represent the BDI process I BDI (t) as a compound Poisson process. The negative binomial distribution is infinitely divisible (see e.g., Feller [9] ), and hence can be represented as a compound Poisson process. First we show this representation by considering a special case of the BDI process where no death occurs.

Let us consider a process which has not been discussed, to the best of our knowledge, in the literature. That is, the pure birth process, accompanied by immigration. Let us call this process a birth-and-immigration process, a BI process for short. The PGF of this process can be readily found by setting µ = 0 in (A. 19) , and after some manipulation, we obtain

where

Not surprisingly, this β 0 (t) can be obtained by setting µ = 0 in β(t) of (45). Thus, the BI process I BI (t) is distributed according to the negative binomial distribution NB(r + I 0 , β 0 (t)) shifted by I 0 . Hence, we find the PMF at time t as given below:

If, I 0 = 0, the PMF reduces to NB(r, β 0 (t)). By noting rλ = ν, we find the first few PMF values as follows: 

where P BI:0 0 (t) represents the idle period of the system which ends upon the Poisson arrival. This initial idle period is the only time when the system is empty. Since there is no death or departure of any kind, the population monotonically increases as the time elapses.

As we defined in Definition 1 of Section 3.2, A(t) is the cumulative count of infected and infectious persons, arriving from the outside up to time t. Let us count them as c = 1, 2, · · · , A(t). Consider a "countable attribute" associated with each arrival and denote it N c . Assume that the N c 's are independent identically distributed (i.i.d.) RVs. Kobayashi and Konheim [26] discuss use of acompound process in the context of data communication system, where each arrival of a data packet carries N c [data units], e.g., [bytes] . Then, the sum

should be a quantity of interest in determining, e.g., the buffer size of a statistical multiplexer needed to keep the probability of buffer overflow below some prescribed level.

In our problem at hand, each infected arrival from the outside becomes an "ancestor" who produces many "descendants" over multiple generations, who are all "infected" persons and constitute the members of the "c clan." Such a process is called a branching process. In our model, each infected individual, whether an ancestor or a descendant, produces a new descendant at rate λ [person/unit-time].

Let A j (t) be the number of arrivals in (0, t] and having j descendants at time t. The case j = 0 does not exist, since there is no death, hence each clan includes at least its ancestor, such that

Branson [27] , who ascribes the original idea to Karin and McGregor [28] , shows that the RVs A j (t)'s are independent Poisson variables with mean m j (t) where

where ξ is a function of λ and t.

The above formula (96) leads to the idea that the probability distribution that each ancestor should produce j descendants (i.e., secondary, tertiary infectees, etc.) whose probability distribution takes the form

This distribution is the logarithmic distribution of (69). The PGF of N c (t) is therefore given by

Note that N c (t) is the number of the family members at time t of the cth clan, with the ancestor itself included. Since we are assuming no death, the sum of these numbers over all A(t) ancestors should be the total population I(t) at time t:

Then, the PGF of I(t) can be expressed as

The PGF of the Poisson arrival process of rate ν is given by

Thus, we find the PGF of the BI process, a compound Poisson process, is

which is the PGF (91), where I 0 = 0. Thus, we have shown that the Poisson process with rate ν ends up with the negative binomial distributed process NB(r, β 0 (t)). It is interesting that the infection process which forms a branching process where an infectious person infects a susceptible person at rate λ [persons/time unit] will result in the negative binomial distribution. The positive feedback loop of creating its descendants exhibits the negative binomial distributed process.

Now we extend our argument of the previous section to the BDI process with the initial condition I 0 = 0, whose PGF is given by (44).

where β(t) is defined by (45):

The first few PMF values can be found from (47) as

The function β(t) is between 0 and 1, and approaches 1 as t → ∞. If ν → 0, then r → 0 as well and B(t) → 1. Since B(t)r does not depend on n, we see again the Fisher series:

Then by replacing β 0 (t) = 1 − e −λt by β(t) = λ(e at −1) λe at −µ the entire mathematical steps of the BI process case carry over to the case of the BDI process, except for the last step (102), where the parameter r needs to be defined as

If a = λ − µ > 0, then for t 1, the following approximation holds

Therefore, it seems clear that we need to to choose a Poisson process other than the one with mean νt. Let ν(t) be the mean of the Poisson process which is to be compounded with N c (t) whose PGF is

Then it should be clear from the above discussion that ν(t) must be given by

Thus,

Thus, we have shown that the BDI process as a compound Poisson process with rate ν(t).

At the beginning of Section 3 we showed the results of twelve simulation runs of the BDI process to provoke the reader's interest. Hopefully you have found an answer to the first question. How about the second question? If you are an avid reader, you should have figured out the answer by now. In Part II of this report under preparation [1] , we will present more simulation results, which should help the reader find a definitive answer to the second question.

In this report, we focused on the analysis of I(t), the infected population. Although we presented the stochastic means of B(t) (the population of the secondary infections) and R(t) (the recovered/dead population), we defer a full analysis of these processes to Part II.

We will also show, both by analysis and simulation, how the infection process can be controlled by changing the values of λ, µ and ν at certain points in time. Increasing the so-called social distance would decrease the value of λ. The value of µ can be increased by improving medical treatments to speed up the recovery process. Another possible option is to reduce the size of the susceptible population significantly by exposing the young and healthy susceptible to the disease so that they become immune to the disease. The quantitative model such as ours should help policy makers and health officials to make judicious choices of these options by assessing the effects and costs of various options available to them. In Part II (or III), we will also discuss how we should estimate the model parameters from empirical data so that the forecast can be made as reliable as possible.

The partial differential equation (PDE) (15) can be solved for PGF G(z, t) by Lagrange's method with auxiliary differential equations. We write (15) in the following form of a planar differential equation (see e.g., [22] , [24] , [23] , [14] , pp. 600-603):

where G denotes G(z, t) defined above, and p, q and r are, in general, functions of t, z, and G.

The solution G = f (t, z) represents a surface in the t − z − G space. The normal vector to the surface at any point (t 0 , z 0 , G 0 ) is perpendicular to the line through this point, whose direction numbers are the values of p, q and r evaluated at this point, which we denote by p 0 , q 0 and G 0 . This line has the equations

Hence, at each point (p, q, G) of the surface, there is a normal vector whose direction numbers, dt, dz and dG, satisfy the following equation:

In our problem at hand, we set where C 1 and C 2 are integration constants. We write the functional relations between C 1 and C 2 as 

Stochastic Modeling of an Infectious Disease: Part II: Validation by Simulation Experiments (under preparation)

A Contribution to the Mathematical Theory of Epidemics

The model of Kermack and McKendrick for the plague epidemic in Bombay and the type reproduction number with seasonality

Population Biology of Infectious Diseases: Part I

Infectious Diseases of Humans: Dynamics and Control

An Introduction to Mathematical Epidemiology

Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in wahrscheinlichkeitstheoritischer Behandlung

The generalized "birth-and-death" process

Introduction to Probability and Its Applications

Introduction to Congestion Theory in Telephone Systems

Modeling and Analysis: An Introduction to System Performance Evaluation Methodology

System Modeling and Analysis: Foundations for System Performance Evaluation

Probability, Random Processes, and Statistical Analysis

Lecture Note: Bith-and-Death Process and Its Application

An Introduction to Stohasctic Processes with Special Reference to Methods and Applications

On some modes of population growth leading to R. A. Feller's logarithmic series distribution

Some Evolutionary Stochstic Processes

The Elements of Stochastic Processes With Applications to the Natural Sciences

An Introduction to Stochastic Processes with Application to Biology

Fundamentals of Queueing Theory

Transient solutions for the buffer behavior in statistical multiplexing

A mathematical theory for transient analysis of computer communication networks

The relation between the number of species and the number of individuals in a random sample of an animal population

Queuing Models for Computer Communications Systems Analysis (Invited Paper)

In homegeneous birth-death and birth-death-immigration processes and the logarithmic series distribution

The number of mutant forms maintained in a population

Acknowledgments I thank Prof. Brian L. Mark of George Mason University for his valuable suggestions and help during the course of this study. He has read this manuscript carefully and made numerous editorial suggestions. Were it not for his patient help, I would have taken a lot more time in debugging my MATLAB simulation programs. Prof. Hideaki Takagi of the University of Tsukuba kindly shared with me his unpublished lecture note on the birth-and-death processes [16] . My excitement of having obtained a closed form solution of the time-dependent PMF of the BDI process was short lived, when he showed me his lecture note and informed me of the existence of the book by Bailey [20] . Dr. Linda Zeger read the first draft of this report and gave me valuable suggestions to improve the presentation.

From the initial condition (17) , we findfrom which we findWe introduce a new variable y by z − 1 µ − λz = y, i.e., z = 1 + µy 1 + λy .(A.14)From the last two equations, we can determine the functional form of f :By substituting the last equation into (A.11), we obtainBy notingandwe finally obtain the PGF of I(t)Note that the PGF is given as a product of two PGFs, the second being the PGF of the birth-and-death process without immigration.If I 0 = 0, i.e., if the system is initially empty, the above PGF reduces to G(z, t) = a λe at − µ − λ(e at − 1) z r , when I(0) = I 0 = 0.(A.20)