key: cord-0036326-o9kb7irj authors: Yan, Ping title: Distribution Theory, Stochastic Processes and Infectious Disease Modelling date: 2008 journal: Mathematical Epidemiology DOI: 10.1007/978-3-540-78911-6_10 sha: fece9870f56ee8381803fa3cdad6f90f545af90f doc_id: 36326 cord_uid: o9kb7irj The occurrence of a major outbreak, the shape of the epidemic curves, as well as the final sizes of outbreaks, are realizations of some stochastic events with some probability distributions. These distributions are manifested through some stochastic mechanisms. This chapter divides a typical outbreak in a closed population into two phases, the initial phase and beyond the initial phase. For the initial phase, this chapter addresses several aspects: the invasion (i.e. the risk of a large outbreak); quantities associated with a small outbreak; and characteristics of a large outbreak. In a large outbreak beyond the initial phase, the focus is on its final size. After a review of distribution theories and stochastic processes, this chapter separately addresses each of these issues by asking questions such as: Are the latent period and/or the infectious period distributions playing any role? What is the role of the contact process for this issue? Is the basic reproduction number R (0) sufficient to address this issue? How many stochastic mechanisms may manifest observations that may resemble a power-law distribution, and how much detail is really needed to address this specific issue? etc. This chapter uses distribution theory and stochastic processes to capture the agent–host–environment interface during an outbreak of an infectious disease. With different phases of an outbreak and special issues in mind, modellers need to choose which detailed aspects of the distributions and the stochastic mechanisms need to be included, and which detailed aspects need to be ignored. With these discussions, this chapter provides some syntheses for the concepts and models discussed in some proceeding chapters, as well as some food for thought for following chapters on case studies. This chapter limits the discussions to human-human transmission through direct contacts involving an agent (e.g. virus, bacteria, etc.) in a closed population. The agent has biological characteristics. The human hosts may differ in susceptibility. The environment is where contacts and transmissions take place. One wishes to control the outbreak aimed at preventing a large outbreak from happening. Should it happen, the number of individuals who are infectious at time t can be approximated by a curve, either symmetrical or slightly negatively skewed, like that illustrated in Fig. 10 .1. 1. Reducing the initial growth of the curve (and delaying the peak) 2. Reducing the peak burden 3. Reducing the final size, defined as the total number of individuals or the proportion of individuals in the population that will be eventually infected by the end of the outbreak. Brauer [1] (Chap. 2 of this book) discussed the initial growth rate and the maximum value of I(t) for the number of infectious individuals at time t in a compartment model. The final size of an outbreak was not only discussed in deterministic compartment models, but also by Allen [2] (Chap. 3 of this book) with respect to stochastic models. There concepts are also embedded in network models introduced by Brauer [3] in Chap. 4 of this book. if X is absolutely continuous; and for the probability mass function (p.m.f.) f X (x) = F X (x) − F X (x + 1) = Pr{X = x} if X is discrete taking values x = 0, 1, 2, · · · . f X (x) satisfies (1) f X (x) > 0; (2) ∞ x=0 f X (x) = 1 for discrete and ∞ 0 f X (x)dx = 1 for continuous random variable X. It can be shown that when X ≥ 0, (10.1) If X is discrete taking values x = 0, 1, 2, · · · , the probability generating function, as previously introduced in Allen [2] and Brauer [3] , is a mathematical tool to study its distribution. It is defined as satisfying G X (0) = Pr{X = 0}, G X (1) = 1, G X (s) > 0, G X (s) > 0. (10.3) If the p.m.f. f (x) = Pr{X = x} is given, G X (s) is uniquely defined through (10.2) . If G X (s) is given, provided that it is a smooth function of s with higher order of derivatives, then X (0), x = 0, 1, 2, · · · (10.4) where G (x) s=0 , so that the p.m.f. f (x) can be uniquely generated through G X (s). Calculations of moments and of some probabilities are very easy. The mean and variance of X are E[X] = G X (1), (10.5) var[X] = G X (1) + G X (1) − (G X (1)) 2 . If X is continuous, the hazard function, h X (x) def. Pr{x≤X x + y|X > x} = exp(−α(x+y)) exp(−αx) = exp (−αy) . For X with exponential distribution, the rate is reciprocal to the mean duration: The exponentially distributed infectious period is the underlying assumption in the continuous time Markov chain model in Allen [2] . In the second case lim x→∞ h X (x) = α, a commonly used model is the gamma distribution with p.d.f. (10.7) and survivor function F where α is a scale parameter. κ is the shape parameter that determines the shape of the hazard function. When κ > 1, the hazard is an increasing function of x and when κ < 1, a decreasing function of x. When κ = 1, it is the exponential distribution. It can be shown that for any κ > 0, h X (x) = fX (x) F X (x) → α as x → ∞. In the third case, the monotone hazard function satisfies lim x→∞ h X (x) = ∞, if increasing, and lim x→∞ h X (x) = 0, if decreasing. A common choice is h X (x) = βα β x β−1 , a power function of x such that when β = 1, h X (x) = α; when β > 1, h X (x) is an increasing function of x and when β < 1, h X (x) is a decreasing function of x. β is the shape parameter that determines the shape of the hazard function. The corresponding p.d.f. and the survivor function are f X (x) = βα (αx) β−1 e −(xα) β and F X (x) = e −(αx) β . α is a scale parameter. This is the Weibull distribution. Of non-monotone hazard functions that initially increase to a maximum value and then decrease with lim In both distributions α is a scale parameter and θ is the shape parameter. Another continuous distribution for X used in this chapter is the Pareto distribution with hazard function h X (x) = ακ 1+αx which is monotonically decreasing lim x→∞ h X (x) = 0. It has p.d.f. f X (x) = κα (1+αx) κ+1 and survivor function F X (x) = 1 (1+αx) κ . For a non-negative continuous random variable X, it is sometimes convenient to work with the Laplace transforms L X (r) = ∞ 0 e −rx f X (x)dx, and its related function L * X (r) = ∞ 0 e −rx F X (x)dx = 1−L(r) r , provided that they exist. As a function of r, the following statements hold: Let us call an infectious contact a contact at which transmission takes place [4] . We use the notation N for a discrete valued random variable, defined as N = the number of infectious contacts made by an infective individual, throughout its entire infectious period, in a wholly susceptible population. The expected value of N is the basic reproduction number R 0 = E [N ] . Uncertainties are captured by var [N ] as well as the probability distribution Pr{N = x}, x = 0, 1, 2, · · · . Some public health questions can be sufficiently addressed by R 0 = E[N ] alone. Some other public health questions can be addressed by the first two moments: E[N ] and var [N ] . For some questions, one needs to know the precise distribution for N. However, there are many other important public health issues that are not addressed by the distribution N. The distribution for N may arise from a combination of the following stochastic mechanisms: 1. The contact network structure and stochastic features that give rise to the contact process 2. The individual (host) properties that determine transmissibility per contact, such as host susceptibility 3. The probability distributions of durations of time-to-events, such as the infectious period. The same probability distribution Pr{N = x} can arise from different stochastic mechanisms. For some public health questions, different stochastic mechanisms may give different answers, even if the probability distribution Pr{N = x} is the same. For some other public health aspects, different distributions for Pr{N = x} may give the same answer as long as certain aspects of the underlying stochastic mechanisms remain the same. Calendar time is denoted by t. At t = 0, there are S 0 initial susceptible individuals and I 0 initial infectious individuals in a closed population. The population size n = S 0 + I 0 . S 0 and I 0 are fixed integers. Under the assumption S0 n ≈ 1 (i.e. S 0 is very large) and I0 n = ε ≈ 0 (i.e. I 0 is very small), if infective cases are removed at the end of their infectious period by recovery with immunity, or by complete isolation through intervention, or death, there may be a proportion of susceptible individuals that eventually escape from infection when the outbreak is over. The following pair of quantities are random: • S ∞ = the total number of susceptible individuals who have escaped from infection in the population • Z = n − S ∞ = the final size, with a probability distribution Pr{Z = z}, z = 1, 2, · · · Diekmann and Heesterbeek [5] describe the distinction of a small outbreak from a large outbreak. Using the random variable Z, Small outbreak. As n → ∞, the infectious agent produces a handful of cases and the outbreak becomes extinct, such that the expected number of cases, ). In other words, although the final size as a proportion (scaled to the population size n) is concentrated at zero, the final outbreak size as absolute numbers can be very large. In this case, the outbreak size is neither small, nor large. A random variable T g is defined as the time to extinction at generation g = 1, 2, 3, · · · , where the event {T g = g} refers to {no infected case at generation g and at least one infected case at generation g−1}. The generation time g is different from the calendar time t. In certain situations, observations arising from an outbreak can be identified by generation time. Allen [2] used p.g.f. to determine the probability of extinction in the context of branching processes. Brauer [3] reviewed the use of p.g.f. to characterize and calculate degree distributions with respect to the spread of diseases over contact networks. In this chapter, p.g.f. will be used to generate the distributions for N which is related to the basic reproduction number R 0 , to derive probability distributions for the final size Pr{Z = z} in small outbreaks, and where possible, to derive the probability Pr{T g = g}. We use the notation X which could be N , Z or T g depending on context. Table 10 .1 lists three commonly referred discrete distributions in this chapter. The shapes of f X (x) for these distributions are compared in Fig. 10 .2. The Poisson distribution is a special case of the negative binomial distribution as φ → 0; the geometric distribution is a special case of the negative binomial distribution as φ = 1. In models involving a sequence of events, one may treat each pair of successive events as an initiating event that leads to a subsequent event over a random duration X ≥ 0. Durations can be categorized by: (1) the natural history of infectiousness of an infected individual (e.g. latent and infectious periods); (2) the natural history of clinical manifestation (e.g. incubation period and duration of illness); and (3) the reaction time of the public health system (e.g. time to detect an infection or to isolate an infectious individual). Models are often developed along the event history based on one or a combination of the three categories. For example, models based on the natural history of infectiousness of an infected host might be a deterministic or stochastic SIR or a Susceptible-Latent-Infective-Removed (SEIR) model, like those discussed in previous chapters. Very often one wants to compare two non-negative random variables X 1 and X 2 , either through some overall summary of their distributions, or through their tail properties. For notational simplicity, we write F 1 (x) = F X1 (x) and F 2 (x) = F X2 (x) hereafter. In general, The reverse is not true for non-exponential distributions. Definition 2. X 2 is longer than X 1 in hazard rate order, denoted by Definition 3. X 2 is longer than X 1 in Laplace transform order, denoted by Note that for the Laplace transforms In general, Another stochastic property used in this chapter is the relationship between the hazard function and the tail properties of randomly distributed durations. Denote X s = (X − s|X > s) as the residual life of a duration X conditioning on X > s. If X stands for the infectious period, then X s stands for the time remaining to be infectious, after s amount of time since the beginning of the infectious period. Thus is the survivor function of X s . Given the hazard function h X (x), it can be shown that F X (s+x) F X (s) = exp − s+x s h X (u)du . This leads to: The distribution is said to be heavy tailed if it satisfies (10.9) . Intuitively, if X ever exceeds a large value, then it is just as likely to exceed any larger value. exp(−αx) = ∞, implying that it has a heavier tail (goes to zero more slowly) than an exponential tail. All heavy tailed distributions are also sub-exponential. Another form of heavy tailed distribution is that of the Pareto form, such that for some θ > 0, A > 0, lim x→∞ F X (x) x θ+1 = A. They are heavy tailed with power-law property ∝ 1 x θ+1 . (sα+xα+1) κ = 1 (heavy and power-law) In a heterogenous population an individual i is associated with a random variable X i following a distribution with p.m.f. or p.d.f. f (x|θ i ) specified up to a parameter θ i . If the heterogeneity can be observed through a vector of covariates z, say, such as gender, birth date, height, etc., a common practice in statistics is to model θ i as a function of z via a generalized linear model If the heterogeneity is not observable, one assumes that θ i varies among individuals as independently and identically distributed (i.i.d.) random variables with expectation E θ [·] such that at the population level, one may model X arising from a distribution given by where U (θ) is a c.d.f. of the mixing distribution. Example 1. Each individual is associated with an infectious period with a constant removal rate λ i , and λ i itself varies across individuals. Then at the individual level, this leads to an exponentially distributed infectious period There is often unobservable heterogeneity with respect to the infectious period, where removal can be due to deaths, recovery, or public health intervention such as isolation. By this example, even at the individual level it is justified to use an exponentially distributed infectious period with a removal rate λ i , observed data from a population often arise as if the infectious period follows a distribution with a power-law heavy tail. Mixed Poisson distributions play a central role in the chapter. It is constructed by f (x|λ) = λ x x! e −λ and an arbitrary mixing distribution U (λ). It is a discrete distribution with p.m.f. and p.g.f. A stochastic process {X(t), t ∈ T } is a collection of random variables. For each t in the index set T, X(t) is a random variable. If t is referred to as time, then X(t) is the state of the process at time t. A realization of {X(t), t ∈ T } is called a sample path. This chapter is restricted to discrete, non-negative sample paths, that is, for given t, X(t) is a non-negative discrete random variable taking values x = 0, 1, 2, · · · . The index t can be either discrete, or continuous. The distribution of X(t) for given t often depends on the past history of the process H t = {X(u), 0 < u ≤ t − }. The conditional probability Pr{X(t) = x|H t } is meaningful. If the conditional distribution of the future state at time t + s, given the present state at time s and all past states, depends only on the present state and is independent from the past, that is, for all continuous time s, t > 0 and non-negative integers i, j An SIR model can be expressed as a bivariate continuous time-homogeneous Markov chain {S(t), I(t)} as described in Allen [2] . At any given time t, both S(t) and I(t) take integer values. Transitions can only occur from {S(t) = s, [2] can be re-written as Assuming the outbreak is originated from a single initial infective, the stochastic means follow the following renewal-type equations: (10.14) where I(t) = M (t) − R(t). The first term e − 1 µ t in (10.14) is the probability that the initial infective is still infectious at time t according to an exponentially distributed infectious period. With respect to the integration for the second term, during the interval (0, t] the initial infective makes many infectious contacts with a constant rate β, so that the expected number of infectious contacts at an infinitesimal interval containing x ∈ (0, t] is βe − 1 µ x . The expected number of individuals who are still infectious at time t evolved from such contacts at x ∈ (0, t] is E[I(t − x)]. It can be shown that [7] d dt The deterministic counterpart of (10.13) is (10.16) The solution of the deterministic equations is not simply the mean of the stochastic process since the covariance term in (10.15) is ignored. Nevertheless, for a broad class of processes, the deterministic solution is a good approximation to the stochastic mean of a major outbreak when n is large. An important consequence of the independent increment and the stationary increment properties is that, if we denote X 1 the time to the first event, and for k ≥ 1, X k the time between the (k − 1)th and the kth events, then X k : k = 1, 2, · · · are i.i.d. exponentially distributed random variables with mean 1 λ . (X 1 , X 2 , · · · , X k ) are called inter-arrival times between events. Let Y 1 = X 1 , Y 2 = X 1 +X 2 , · · · , Y k = X 1 +X 2 +· · ·+X k denote the times that events occur, it can be further shown that, conditioning on K(t) = k, the k arrival times of events (Y 1 , Y 2 , · · · , Y k ) have the same distribution as the order statistics corresponding to k independent uniformly distributed random variables on the interval (0, t). At the individual level, the cumulative number of infectious contacts is the conditional process {K(t)|ξ} and is assumed to follow a Poisson process with intensity rate ξ. The mean and variance are The process as seen at the cohort level is the marginal process {K(t)} and is a mixed Poisson process, of which, for given time t, the probability Pr{K(t) = k} is For given t, one can write the p.g.f. for K(t) as To show it is not a Poisson process, it is only necessary to notice that To prove that {K(t)} does not have independent increment, one only needs to show that Pr{K(s) = k, Unlike the Poisson process where E[X 1 ] = 1 λ , in the mixed Poisson process, E[X 1 ] > 1 λ as explained by Jensen's inequality, Consider a population consisting of individuals or particles able to produce offspring of the same kind. Each individual i is associated with a generation time T G and by the end of T G , it produces a random number N i of new offsprings. We use g = 0, 1, 2, · · · as the discrete time unit to represent generations. Let X 0 be the number of individuals initially present at the zeroth generation. All offsprings of the zeroth generation constitute the first generation and their number is denoted by X 1 . Let X g denote the size of the gth generation. Given X g−1 , if the distribution of X g only depends on X g−1 , then {X g } is a discrete time Markov chain. This process {X g } is called a branching process. Suppose that X 0 = 1, we can calculate In most branching processes, it is assumed that N i is i.i.d. according to a random variable N which does not vary over generations. We say a branching process becomes extinct at generation g, if X g−1 > 0 but X g = 0. In such case, we denote Z = g X g as the final size upon extinction. Branching processes are often used in infectious disease epidemiology to approximate the initial stage of an outbreak, where the depletion of the number of susceptible individuals is negligible. A discrete time branching process that is well studied in the literature is the Galton-Watson process. The CMJ processes have been used to approximate the early phase of the SIR models [9, 10] ), where the generation time T G has the meaning of the infectious period. In an SEIR model, let us denote T E as the latent period during which an infected individual is unable to transmit the disease and T I as the infectious period during which an infected individual is able to transmit the disease, and let T G = T E + T I . If the infectious period degenerates to µ = E[T I ] → 0, such that by the end of the latent period, the infected individual instantly produces N new infections and is removed, then T G = T E and the branching process approximations are Bellman-Harris processes, including special cases (1) the Markov branching process assuming T E is exponentially distributed; (2) the Galton-Watson process assuming T E is non-random. If there is no latent period, T G = T I , the branching process approximations are CMJ processes. Therefore, we get: Combined Bellman-Harris-CMJ processes. T E and T I are independent. T E is random with an arbitrary but specified distribution. Only after T E amount of time has elapsed, an infected individual can produce secondary infections according to a CMJ process. In these branching processes, there is an embedded Galton-Watson branching process to track the generations. The basic reproduction number corresponds to the mean value of this embedded Galton-Watson branching pro- . A schematic illustration of these branching processes is given in Fig. 10 .5. As introduced in Brauer [3], a graph G consists of a set of vertices V = {v 1 , ..., v n } and a set of pairs of distinct vertices called edges. The degree of a vertex v i is the number of vertices adjacent to it which is the number of edges attached to vertex v i . A subgraph G is a graph whose vertices and edges form subsets of the vertices and edges of a given graph G. A component is a connected subgraph. A graph may consist of a number of disjoint components. The size of a graph is n = the total number of vertices. A random graph is obtained by starting with a set of n vertices and adding edges between them at random. The degrees in a random graph are random associated with the degree distribution. In addition, a random graph has several measures on its geometry, such as measures on its connectivity, diameter, sizes of its components, clustering coefficient, etc. All these measures are outcomes of random events. Different random graph models produce different probability distributions on graphs. If new edges are added to vertices according to some stochastic processes over time, then it becomes a random graph process. It has not only a time dimension, but also a spatial dimension. At any snapshot at time t, one observes a realization of a random graph G t . This is analogous to a random variable X versus a stochastic process {X(t)}. One may aggregate, or superimpose, these random graphs per unit of time over a fixed period of time. The result is still a random graph. This gives an analogy to the counting process. Similar to such concepts as independent increment, stationary increment, Markov, etc. that determine the temporal growth of the counting processes (e.g. the distribution of inter-arrival times and arrival times of events in a Poisson process), one may develop concepts to capture the spatial-temporal growth of random graph processes. Individuals are represented by vertices. Contacts are represented by edges. (Social) contacts can be regarded as a random graph process by itself. Contacts made over a fixed period of time is a random graph. An infectious contact is a contact at which a transmission of infection takes place, and hence all infectious contacts during the same period make a subgraph. The geometry of this subgraph is different from the graph that represents social contacts. This subgraph grows along a tree, because if three individuals {a, b, c} are friends forming a triangle relationship, and if individual a infects both individuals {b, c} , then b and c can not infect each other. The social contact random graph may contain triangles and loops, as illustrated by broken lines in Fig. 10 .6. The subgraph of infectious contacts can not. This tree -like subgraph, illustrated in Fig. 10 .6 by edges, may resemble a realization of an embedded Galton-Watson branching process. However, its growth is limited to the social contact network of susceptible individuals. Only when the number of susceptible individuals is very large can the initial growth of the subgraph be approximated by such a branching process. The infectious contacts arise from a combination of two aspects: 1. The environment. The social contact network of individuals in a population as a random graph process, determined by the temporal and spatial network properties, such as whether the network is directed, the neighbourhood structure of the network, clustering, its growth over time (e.g. stationary increment), etc. 2. The hosts. (1) Whether all susceptible individuals are of the same type with equal susceptibility and (2) whether all infectious individuals have equal ability to infect others, so that the probability of transmission per contact, does not vary by contact to contact. If either (1) or (2) is not true, there exists heterogeneity among hosts and this is the intrinsic heterogeneity. Let x denote the time measured along a typical infectious individual, starting from the beginning of the infectious period at x = 0. Following Bartlett [9] , and Mode and Sleeman [10] , we use a counting process over the time This process is illustrated in Fig. 10 .7, which is (c) in Fig. 10 .5. When an individual is still in the latent period, we write x < 0. Given Let time τ denote that time since infection of a typical infectious individual. A generic expression for R 0 is given by the equation [5] where A(τ ) is the probability that at time τ since infection the individual is infectious. In other words, is associated with the property of a counting process that generates new infectious individuals during the infectious period. The product β(τ )A(τ ) is the expected infectivity at time τ after infection. If T E and T I are independent and if at time τ since infection, an individual is infectious, then τ ≥ T E and T I > τ − T E . Therefore, which can be easily proven since We first show that the formulation of R 0 does not depend on any assumption on the latent period, during which transmission does not occur. For an individual with observed latent period Furthermore, if the counting process {K(x)} has stationary increment: β > 0 is a parameter that captures the transmission rate in the agent-hostenvironment interface. It can be further sub-modelled to reflect the agenthost-environment interface. A frequently seen expression is β = λp so that R 0 = λpµ. p is the probability of transmission per contact between an infectious host and a susceptible host. λ is the average number of contacts among hosts per unit of time in the environment. The control measure for β may be further categorized into interventions designed to alter the social-contact network and designed to alter the transmission probability. One may also model β = ab where a measures the infectivity of an infectious individual who comes in contact with a susceptible individual, for whom, the susceptibility is measured by b. For example, to model the way individuals respond to vaccination, an individual may respond to vaccination with a change in their susceptibility to infection and if ever infected, their infectivity may change from what it is without vaccination. For reference, one may consult Becker and Starczak [11] which leads to references to many more publications in this area. Formation of R 0 by a multiplicative relationship is useful for statistical analyses and is thus of much public health interest. It leads to a log-linear model to explore the controlled reproduction number R c : where covariates are associated with intervention measures, such as social distancing, vaccination, treatment of infected individuals with anti-viral drugs, or altering the duration of infectiousness such as rapid isolation of identified infectious individuals. If the probability distribution for N such that R 0 = E[N ] is known, the (10.25) leads to the use of generalized linear regression analyses for exploring the effectiveness of these intervention measures aided by statistical packages. An immediate generalization of (10.24) is when the infectious period can be divided into stages T [10, p. 205] . In this case, the infectious contact process may not have stationary increment over the entire infectious period but has piecewise stationary increment within each stage so that β j (x) = β j . (10.24) becomes In order to derive the distribution for N, denote For fixed x, N (x) is also a random variable with p.g.f. It has been shown that [10, pp. 177-178] the following two recursive formulae hold: (10.28) The mean and variance of N can be evaluated by (10.5), specifically, R 0 = G N (1). It can be also shown that (10.23) can be obtained through integration by parts of (10 The equation (10.28) implies that the probability distribution of N does not depend on the distribution of the latent period T E , but depends on: properties of the contact process and the probability of transmission per contact 2. The infectious period distribution F I (x) = Pr{T I ≤ x} However, differently specified infectious contact processes defined by G K (s, x) and infectious period distributions F I (x) may result in the same probability distribution for N and the same R 0 . We shall see later in this chapter that for some public health applications, such as determining the risk of whether a large outbreak may occur, or the distribution of the final size η (in E [Z] n → η) should a large outbreak occur, knowing R 0 will be sufficient. For some applications, one requires the knowledge of the distribution for N. Once the distribution for N is determined, one can use theories for the embedded Galton-Watson branching process to calculate quantities such as the probability of a small outbreak π, the final size of the small outbreak Pr{Z = z} and the probability of time to extinction Pr{T g = g}. Yet there are also applications where the detailed distribution for N is not sufficient. One also needs to know the underlying stochastic mechanisms such as the property of the counting process {K(x), x ∈ [0, ∞)}, the distribution of the latent period T E and the distribution of the infectious period T I . Competing risk comes from survival analysis on a non-negative random variable X > 0 associated with a "lifetime". In a continuous framework, X is associated with a hazard function h(x) for "failure". In the competing risk model, when failure occurs, it may be one of m distinct causes or types, denoted by J ∈ {1, 2, ..., m}. The overall hazard function is the sum of type- Example 3. In Fig. 10 . . T > 0 is a random variable representing either the infectious individual is removed or the individual produces its kth infectious contact. The probability distribution for T arises from a competing risk framework. The word independence in the study of infectious diseases has several different aspects. 1. The counting process {K(x)} and the infectious period T I are independent. This also implies that in the competing risk framework in Example 3, Y k and T I are independent for all k. 2. Viewing the transmission as random graphs, if one randomly chooses a vertex v to be the initial infective, this vertex makes independent infectious contacts (i.e. adding edges) with other susceptible vertices throughout its infectious period. 3. If one randomly chooses a vertex v to be a susceptible individual, transmission of infection to this individual by any of its neighbouring infectious individuals is independent from potential transmission from other neighbouring infectious individuals. If both 2 and 3 hold true, then we say that infectious contacts between different pairs are mutually independent. The word homogeneity also consists of three aspects. 1. The population is homogeneously mixed. An individual makes contacts with other individuals with the same probability. As the population size → ∞, the contact process itself is a stationary Poisson process with constant intensity λ. Violations to one or a combination of the above assumptions result in heterogeneous transmission. If {K(x)} satisfies stationary increment E[K(x)] = βx so that β(x) = d dx E[K(x)] = β (constant), (10.23) has a simpler expression R 0 = βµ where µ = E[T I ] = ∞ 0 F I (x)dx is the mean infectious period. Since the distribution of N does not depend on the distribution of the latent period, we restrict discussion to the case where the latent period is absent. In the expression G N (s) = ∞ 0 G K (s, x)dF I (x), differently specified infectious contact process {K(x)} defined by G K (s, x) and infectious period distribution F I (x) may result in the same probability distribution for N at the macro level, with very different characteristics at the micro level. Some models are classified below as mixed Poisson processes. The assumptions are: (1) the infectious contact process is a Poisson process with G(s, x) = e βx(s−1) ; (2) contacts between different pairs are mutually independent; (3) the infectious period from all infected individuals are equal to some constant µ with F I (x) given by (10.29) . Under this model, −1) . (10.30) Hence N is also Poisson distributed. This terminology is used in von Bahr and Martin-Löf [12] . The model assumes that the infectious periods from all infected individuals are equal to some constant µ with F I (x) given by (10.29) . β is further sub-modelled as β = λp where p is a random probability with itself following some distribution F p so that β = λE [p] . In this case, it can be shown (later) that Without losing generality, we extend the above by assuming that factors resulting in heterogeneous mixing and/or heterogeneous transmission are not at present understood and the product ξ = λp is a random variable following Provided that the mixing distribution U (ξ) is specified, G N (s) uniquely defines the distribution of N as a mixed Poisson distribution. This terminology is used in Lefèvre and Utev [13] . The assumptions are: (1) the infectious contact process is a Poisson process with G(s, x) = e βx(s−1) ; (2) contacts between different pairs are mutually independent; (3) the infectious periods from all infected individuals are i.i.d. following an arbitrary distribution F I (x). Using (10.28) , One notices the duality between (10.31) and (10.32). As far as the distribution for N is concerned, the generalized epidemic and the randomized epidemic models can be unified as one mixed Poisson model with with a mixing distribution U (ρ). In (10.32) , ρ is re-scaled as ρ = βT I where β is a constant and T I is random such that E[ρ] = βµ. In (10.31), ρ is re-scaled as ρ = ξµ where µ is constant, but ξ is random. It implies that, if one starts with a simple Poisson distribution with p.g.f. e ρ(s−1) and chooses a mixing distribution U (ρ), the resulting distribution obtained by (10.33) can be interpreted as arising either from a generalized epidemic with an appropriately chosen infectious period distribution, or from a randomized epidemic with an appropriately chosen distribution for ξ. This is a special case of the generalized epidemic, by assuming Under this model, (10.32) becomes This p.g.f. gives the geometric distribution for N with R 0 = βµ. This corresponds to the underlying assumption for both the deterministic SIR model and its stochastic counterpart, the bivariate Markov chain SIR model. Bailey [8] gave it the name the general epidemic. A better name for this may be the Kermack-McKendrick epidemic to reflect its early origins [14] . The infectious contact process {K(x)} is a mixed Poisson process under the same assumptions as that in the randomized epidemic model, with Karlis and Xekalaki [15] pointed out that the shape of the p.m.f. of a mixed Poisson distribution exhibits a resemblance to that of the p.d.f. of the mixing distribution u(λ). Lynch [16] proved that mixing carries the form of the mixing distribution over to the resulting mixed distribution in general. Tail properties of some commonly used probability models for continuous non-negative random variables have been summarized in Table 10 .3. We now compare the shape of the distributions for N generated by p.g.f. ρ e ρ(s−1) dU (ρ) with that of their mixing distributions U (ρ). Figures 10.3 and 10.2 illustrate that the shape of the geometric distribution resembles that of the exponential distribution and the shape of the negative binomial distribution resembles that of the gamma distribution. We may re-parametrize the negative-binomial distribution (10.11) as Pr{N = x} = x by letting ζ = 1 1+φµ . The geometric distribution is a special case ς (1 − ς) x . 1. The geometric distribution has exponential tail because for any x, The negative binomial distribution has exponential tail when φ > 1. In fact, from page 210 of Johnson, Kotz and Kemp [17] Pr{N When φ > 1, A distribution with supports 1, 2, · · · given by is the normalizing factor and Li θ (z) = ∞ n=1 z n n θ , was mentioned in Brauer [3] . It has been used in modelling the spread of infectious diseases on scale-free networks by Newman [18] ; Meyers, Pourbohloul et al. [19] and many others. • If θ → 0, (10.35) returns to the (zero-truncated) geometric distribution • If τ → ∞, (10.35) has the limiting distribution as the Zipf distribution When τ < ∞, the tail of this distribution is exponential. As s → ∞, There are many possible stochastic mechanisms for which a distribution like (10.35) may arise. One of them is a generalized epidemic model arising from mixed Poisson distributions p.g.f. ∞ 0 e βx(s−1) dF I (x), where the infectious period W arises from a competing process. For some individuals, the infectious period arises from an exponential distribution with constant hazard h e (x) = γ as the removal rate. For other individuals, the infectious period follows a generally very short but highly skewed distribution, which may be fitted well with a Pareto form, with the hazard function expressed by h p (x) = αθ 1+αx . This scenario may arise if identified infectious individuals are aggressively isolated during an outbreak. The survivor function of the infectious period has a shape of that of a Pareto distribution with exponential cut-off: It has an exponential tail: Although the exact distribution for N , may not be easy to calculate, we may use the results in Karlis and Xekalaki [15] to argue that it carries the shape of the mixing distribution F I (x) which is a continuous analogue of (10.35). The term "scale-free" [20] refers to networks with degrees following a powerlaw distribution with Pr{N = x} ∝ 1 x θ+1 . Scale-free models have gained much attention in infectious disease literature. Liljeros et al. [21] used the power-law distribution to model the web of sexual contacts with implication of studying sexually transmitted infections. A discrete power-law distribution is the Zipf distribution 1 ς(θ+1) x −(θ+1) , a discrete analogue to the continuous Pareto distribution. In a general sense, any distribution such that lim x→∞ F (x) x θ+1 = A, for some θ > 0, A > 0 is a power-law distribution. Romillard and Theodorescu [22] showed that certain Poisson mixtures are power-law. A class of power-law distributions have the form with p.m.f. which is the Waring distribution given by (6.149) and the special case Yule distribution corresponding to (6.139) when α = 1 of Johnson, Kotz and Kemp [17] . The mean and variance may not exist. The first moment exists: Using the Barnes expansion [17, p. 6] , it shows that when x is sufficiently large, the Waring and the Yule distribution follow the power-law property: ∝ 1 x θ+1 where the parameter α plays little role. The Waring (Yule) distribution can be generated as a mixture of the negative binomial distribution of the general form If one takes a one-parameter Beta distribution u(ς) = θς θ−1 as the mixing distribution onto a negative binomial, then which returns to the Waring distribution. The Waring distribution can be justified as arising from a mixed Poisson [23, 24] . It can be also formulated as arising from a infectious contact process that itself arises from a mixed negative binomial (10.37) with Pareto distribution as its mixing distribution. Then one can use the results in Lynch [16] to justify why (10.38) carries the Pareto shape. In fact, if we write ζ = 1 1+φρ , where u 1 (ρ) = θφ (1+ρφ) θ+1 is the Pareto mixing distribution. n is the total number of individuals in the population. The initial phase of an outbreak is the period of time when depletion of susceptible individuals can be ignored, which is t : E[S(t)] n ≈ 1 . During this phase, there are two approximations to describe the disease spread: 1. The branching process approximation: A typical infectious individual is associated with a generation time T G = T E + T I and by its end, it produces a random number of N new infections. The mean number of these new infections is R 0 . If R 0 ≤ 1, with certainty (i.e. with probability one), this branching process will become extinct, resulting in a handful of total infections. The extinction is intrinsic. 2. The exponential growth approximation: If R 0 > 1, the expected number of infectious individuals at time t follows the Malthus' Law E[I(t)] ∝ e rt [25, 26] , characterized by a parameter r. The growth is intrinsic and r is known as the intrinsic growth rate, also known as the Malthusian parameter. There is an Euler-Lotka equation [5] ∞ 0 e −rτ β(τ )A(τ )dτ = 1, (10.39) where β(τ ) and A(τ ) are defined the same way as in (10.21). Kendall [27] considered a continuous random variable Y = Z n defined on (0, 1], where Z = n − S ∞ is the final size of an outbreak. The shape of this distribution may have one of the two shapes: the J-shape and the U-shape. The term J-shape refers to a distribution that is monotonically decreasing so that it has a mode at zero. The distribution is said to have U-shape if it is bimodal. As shown in Fig. 10 .8, the U-shaped distribution can be thought of as a weighted average between a J-shaped distribution with weight 0 < π < 1 and a uni-modal distribution of a bell-shape with weight 1 − π. Näsell [28] describes the threshold as a value at which the distribution of Y makes a transition from J-shape to U-shape. In a population with finite size n, the transition occurs in the vicinity of 1 at the value 1 + K n 1 3 , where K is a constant independent of n, but depends on the number of initially infected individuals I 0 . When n → ∞, a sufficient condition for a J-shaped distribution of Y is R 0 < 1, under which, π = 1 (see Fig. 10 .8). When R 0 > 1, either an outbreak dies out with a handful of cases with probability π < 1, or starts an exponential growth into a large outbreak with probability 1 − π. Metz [29] showed that the outbreak can be either small or large with no middle road in between. A small outbreak corresponds to a J-shaped distribution. The opposite is not true. When n is finite and R 0 lies within the vicinity of 1 of the order of O(n − 1 3 ), then with probability π = 1, Y = Z n will have a J-shaped distribution with E [Z] n → 0 meanwhile E[Z] → ∞. In this case, the outbreak size is neither small, nor large. Martin-Löf [30] shows that Y * = Z n 2 3 has a limit distribution as n → ∞. It may have different shapes, from J-shape to a bimodal U-shape, to uni-modal with mode not at zero, or with a shape that is rather flat. This makes the final size unpredictable. When n → ∞, a necessary condition for a non-zero probability 1 − π > 0 is that R 0 exists and R 0 > 1. It is possible that for some distributions for N , R 0 = E[N ] does not exist. Public health control measures against an outbreak are often centred around the reduction of R 0 to some controlled reproductive number R c by a reduction factor c so that Ideally, one wants R c < 1 so 1−π = 0. If this is not achievable, it is important to investigate what other aspects of the distribution for N such as var [N ] as well as aspects of the underlying stochastic mechanisms that manifest the p.g.f. G N (s) contribute to the reduction of the risk of a large outbreak 1 − π. Using standard results from the (embedded) Galton-Watson branching process, for any R 0 > 0 (if it exists), there is always a probability π so that the branching process will become extinct and produce a small outbreak. π is the smallest root of the Fixed-Point Equation When R 0 = G N (1) ≤ 1, the smallest root is π = 1. In this case, a large outbreak will not occur. When R 0 = G N (1) > 1, there is an unique solution π in the open interval (0, 1) and the risk of a large outbreak is 1 − π. If the outcome happens to be a small outbreak, the observed branching process will be indistinguishable from that as if arising from a different "reproduction number" R * 0 = G N (π) < 1 with its p.g.f. obtained by taking the graph If the outcome happens to be a small outbreak, the observed branching process will be indistinguishable from that as if arising from a different "reproduction number" 1+R0(1−s) , then N has the geometric distribution. One of the stochastic mechanisms for producing the geometric distribution is the Kermack-McKendrick epidemic, with all the underlying assumptions governing the deterministic or stochastic bivariate Markov chain SIR models. Another stochastic mechanism for producing the geometric distribution is the randomized distribution with the infectious contact process arising from a mixed Poisson process with exponential mixing distribution. There are other stochastic mechanisms that also produce N with geometric distribution. However, for studying the initial behaviour using the embedded Galton-Watson branching process, it is only the p.g.f. G N (s) that matters. If R 0 > 1, the risk of a large outbreak is calculated by π = 1 1+R0(1−π) with exact solution π = 1 R0 . The risk of a large outbreak is 1 − 1 R0 . It can be also shown that R * for some φ > 0, then N has the negative binomial distribution. It may arise as a generalized epidemic with gamma distributed infectious period, or as a randomized epidemic with the infectious contact process arising from a mixed Poisson process with gamma mixing distribution. There are also other stochastic mechanisms that can produce N with the negative binomial distribution. In addition, . Corollary 1. If R 0 > 1 and π ∈ (0, 1) is the Fixed-Point as the smallest root of (10.43), then π is an increasing function of φ. The larger the value of φ, the larger the probability that an initial outbreak will die out without evolving into a large outbreak and the smaller the risk. An alternative way to show that π is an increasing function of φ is to use implicit differentiation for (10.43) to directly show that dπ dφ > 0. In fact One only needs to show for any x > 0. Corollary 2. If R 0 > 1, the Poisson epidemic, as φ → 0, gives the smallest probability π and hence the highest risk of a large outbreak, compared with any other model within the negative binomial family of distributions with the same R 0 . This statement can be extended beyond the negative binomial family. For all randomized and generalized epidemics that can be formulated as mixed Poisson distributions, the Poisson epidemic always produces the smallest probability π. Proposition 3. Let π 0 be the smallest root of s = e R0(s−1) corresponding to the Poisson epidemic. Let π ρ be the smallest root of s = e ρ(s−1) dU (ρ) corresponding to an epidemic arising from some stochastic mechanisms such that G N (s) = E ρ e ρ(s−1) can be interpreted as the p.g.f. of a mixed Poisson distribution. If the latter epidemic has finite mean R 0 which equals that of the Poisson epidemic, then π 0 < π ρ. Since from any distribution with p.g.f. given by G N (s), Pr{N = 0} = G N (0) which is the probability that, after introducing the initially infected individuals into a susceptible population, transmission never occurs. This leads to the next Corollary, which is a well-known result that was originally proven by Feller [31] in 1943. It implies that variances imposed by infectious period, heterogeneous contact environment, heterogeneous transmission among individuals or combinations of them, all reduce the risk of an large outbreak. Using the branching process approximation in the early stage of the epidemic and from the theory of branching process (Chap. 8 of [32] ), Pr{T g ≤ g} is uniquely determined by the p.g.f. G N (s) and is calculated by If there is no secondary transmission, the initial infective individuals make the first generation, hence Pr{T g = 1} = Pr{N = 0} = G N (0). Pr{T g ≤ g} is a non-decreasing function of g. Starting from Pr{T g = 1} = G N (0), with the limit For any R 0 > 0, the probability of a small outbreak that becomes extinct in a few generations is π, the smallest root of s = G N (s), s ∈ (0, 1]. The recursive procedure and convergence of (10.45) are illustrated in Fig. 10 .10. is the conditional probability that, given the outbreak being small, it has become extinct after g generations. It can be shown that for a suitable positive constant A, g and 0 < G N (π) < 1. It implies that if extinction is going to occur, the smaller the value of G N (π), the more likely it will happen quickly with very few generations. When N arises from the negative-binomial distribution, (10.45) becomes with the limit lim g→∞ Pr{T g ≤ g} = π. It can be shown that, T g is ranked by stochastic order in the sense of the survivor function F T g (g) = Pr{T g ≥ g} = 1 − Pr{T g ≤ g − 1}, according to the parameter φ. The larger the φ, the shorter the distribution Pr{T g ≥ g}. For N with negative-binomial distribution with mean value R 0 , not only large variance var[N ] = φR 2 0 + R 0 lowers the risk of large outbreaks 1 − π, but also that if extinction is going to occur, it is likely to happen quickly with very few generations. An extreme case is when φ → ∞. It can be shown that lim φ→∞ (1 + φR 0 ) Special Cases as φ → 0 and φ = 1 The special cases are summarized in Table 10 .4. To demonstrate that T g is stochastically longer when φ → 0 compared to when φ = 1, Fig. 10 .11 illustrates the survivor function Pr{T g ≥ g} at R 0 = 0.667. Under the same R 0 , the probability of lasting more than 5 generations before extinction if N is Poisson distributed, is approximately 1.4 times of that if N has a geometric distribution. For the integer-valued random variable Z corresponding to the final size of a small minor outbreak, the probability where π is the probability of a minor outbreak as a solution of (10.41). Let G Z (s) = ∞ z=1 s z Pr{Z = z}. It was shown [10, p. 193 ] that G Z (s) = s · G N (G Z (s)). In general, G Z (s) is not a p.g.f. since G Z (1) = π ≤ 1. Taking derivatives with respect to s, Mean and Variances for Z When R 0 < 1 In this case, π = 1 and G N (1) = G Z (1) = 1. Since G N (1) = R 0 < 1, letting s = 1 in (10.46a) , one gets G Z (1) = 1 + R 0 G Z (1). Hence Therefore, E[Z] = 1 1−R0 is valid for any distribution of N. Similarly, letting s = 1 in (10.46b) and using var Mean and Variances for Z When R 0 > 1 In this case, there exits π ∈ (0, 1) such that π = G N (π). With probability π, the observed branching process will be indistinguishable from that as if arising from a different "reproduction number" R * 0 = G N (π) < 1. (10.47) and (10.48) can be extended as the following conditional expectation and conditional variance: where (10.49a) and (10.49b) are extensions of (10.47) and (10.48) in the sense that they are valid for both R 0 > 1 and R 0 < 1. For the latter, π = 1 and G N (π) = R 0 . We treat G Z (s) = ∞ z=1 s z Pr{Z = z} as if a p.g.f. for both R 0 > 1 and is the conditional distribution of the final size starting from one initial infective, conditioning on the outcome being a small outbreak. For any given p.g.f. G N (·), a closed analytic form of G Z (s) may not always exist if one wants to solve G Z (s) = sG N (G Z (s)). However, G (z) Z (0) can be sometimes solved recursively starting from Pr{Z = 1} = G Z (0). We use the convention that because the event {Z = 1} implies that there is no secondary transmission in the population. There is also a convention that G Z (0) = Pr{Z = 0} = 0 as there must be at least one infective individuals to start an outbreak. The recursive procedure can be demonstrated for Pr{Z = 2} and Pr{Z = 3}. It is the probability that the index case gives transmission to one individual with probability Pr{N = 1} and the second individual does not transmit with probability Pr{N = 0}. With a bit more calculus, G Z (0) can be expressed as It implies that either the index case produces two secondary cases with probability Pr{N = 2} and neither of the secondary cases produces further transmission with probability (Pr{N = 0}) 2 ; or the index case produces one transmission and the secondary case produce one transmission with joint probability (Pr{N = 1}) 2 and the third case does not transmit with probability Pr{N = 0}. When G N (s) has a relatively simple form, one can use G N (s) and the recursive procedure to generate the entire distribution 1 can be written as: . We can calculate recursively where π needs to be numerically calculated from the equation π = When R 0 < 1, the mean and variance with respect to (10.51) is Example 5. When φ → 0 and R 0 < 1, (10.51) is Example 6. When φ = 1 and R 0 < 1, (10.51) is (10.54) This distribution can be found as (6.619) in Mode and Sleeman [10] . Its variance is var When R 0 > 1, to calculate the condition mean E[Z|small outbreak] and the conditional variance var[Z|small outbreak], one uses (10.49a) and (10.49b) for the calculation and substitutes R * In the geometric distribution where φ = 1 and π = 1 R0 , one gets simple expressions for R 0 > 1 [10, p. 196 ] For the assessment of the risk of a large outbreak 1 − π and quantities associated with a small outbreak, it is sufficient to know the distribution for N , without the necessity of knowing the distributions for the latent and infectious periods. On the other hand, should a large outbreak occur, the intrinsic growth rate r, also known as the Malthusian parameter, for E[I(t)] ∝ e rt , depends crucially on the latent and infectious periods distributions. In other words, even detailed information of the distribution for N is not sufficient. One needs to know the stochastic mechanisms that manifest the distribution for N. Yan [34] re-writes the equation (10.39) under the following conditions: 1. The infectious contact process {K(x)} has stationary increment β(x) ≡ β over the infectious period 2. T E and T I are independent 3. The Laplace transforms for the latent period and the infectious period: implying that, the property of the infectious contact process {K(x)} as summarized by the parameter β, the distribution of the latent period as summarized by L E (r) and the distribution of the infectious period as summarized by L * I (r), separately shape the shape of E[I(t)] during the initial phase of an outbreak and a general relationship between the intrinsic growth rate r and the basic reproduction number R 0 . In fact, Since T E and T I are independent, the Laplace transform of the generation time T E + T I is L E+I (r) = L E (r)L I (r). By writing L * E+I (r) = 1−LE (r)LI (r) r and L * E (r) = 1−LE (r) r , one immediately gets βL E (r)L * I (r) = 1 in (10.55). Since R 0 = βµ, one further gets The necessary condition for r > 0 is R 0 > 1. Together with R 0 = βµ, one gets µ > β −1 . In this case, the relationship (10.55) can be illustrated in Recalling the discussions on stochastic ordering and the relationship in (10.8), one can immediately see that when β is fixed, stochastic ordering of the latent and infectious periods determines the intrinsic growth rate r. Therefore we have the following two propositions. When β and the infectious period distribution F I (x) are given, the longer the latent period T E , the smaller the intrinsic growth r. In this statement, the word longer refers to conditions so that T E is larger in Laplace transform order. Figure 10 .12 shows that, when L E (r) is smaller, the root for r in the equation L E (r)L * I (r) = β −1 is smaller. By consequence, the existence of a latent period produces a smaller initial growth rate than that produced by a model in the absence of a latent period. Since the distribution of T E does not affect the basic reproduction number R 0 , we also have: Corollary 4. For fixed infectious period distribution F I (x) and empirically observed r, if one takes a distribution for the latent period which is shorter in Laplace order than it really is, one underestimates R 0 . When β and the latent period distribution F E (x) are given, among the distributions for the infectious period with equal mean value µ = E[T I ], the longer the infectious period T I , the larger is the intrinsic growth rate r. In this statement, the word longer refers to conditions so that T I is larger in Laplace transform order. The expression (10.57b) was originally given by Anderson and Watson [35] and recently re-visited by Wearing et al. [36] It has many special cases as summarized by Table 10 .5. In particular, Cases C1, C4, C5, C7 and C11 are highlighted below Table 10 .5 Special cases with gamma distributed latent and infectious periods Of these expressions, (10.58a) is the most intuitive and appears in ecology textbooks (e.g. [37] ), assuming that the variation in the generation time T G is negligible and only if T G = T E so that transmission only occurs at an instantaneous moment at the end of the latent period. In the absence of a latent period, a non-random infectious period gives (10.58b) which can be found in Anderson and May [38] . The most commonly encountered relationship is (10.58c) which can be derived from the deterministic or stochastic SIR models assuming an exponentially distributed infectious period without a latent period, as in Chap. 2 of this book. If one assumes that both T E = ν and T I = µ are not random, one gets (10.58d). In recent years, (10.58e) is frequently seen in the literature, such as Lipsitch et al. [39] in the application to the SARS epidemic in Singapore; Chowell et al. [40] for influenza, etc. The underlying assumption in (10.58e) is that both the latent and infectious periods are exponentially distributed. For gamma distributed latent period, the Laplace transform is L E (r) = . Both ν and κ 1 rank T E according to Laplace transform order. implying that the longer the mean latent period, the larger the latent period in Laplace transform order. On the other hand, for the same mean latent period ν, implying that the larger the value of the shape parameter κ 1 (hence the smaller the variance var[T E ] = ν 2 κ1 ), the smaller the value of L E (r) and hence the longer the latent period in Laplace transform order. In both cases, if β and L * I (r) are fixed, a longer latent period in Laplace transform order implies a smaller value of r. Two benchmarks are κ 1 = 1 and κ 1 → ∞. Among a subset of the gamma distribution with κ 1 ≥ 1, for the same R 0 , the exponentially distributed latent period gives the largest r. When the latent period distribution degenerates to a fixed point with T E = ν, one gets the smallest r. Conversely, if r is empirically observed, then the exponentially distributed latent period yields the smallest R 0 whereas the fixed latent period yields the largest R 0 . To examine the effects of the infectious period distribution on r with a given latent period distribution F E (x), since R 0 = βµ, we only consider distributions for the infectious period with equal mean value µ = E[T I ]. Under this condition, the longer the infectious period T I in Laplace transform order, the larger is the intrinsic growth rate r. For gamma distributed infectious In this case, since we keep the mean µ = E[T I ] fixed and var[T I ] = µ κ2 , a longer infectious period T I in Laplace transform order implies that the variance is smaller. In other words, under the same R 0 and fixed latent period distribution, a gamma distributed infection period with smaller variance produces larger initial growth rate. Among a subset of the gamma distribution with κ 2 ≥ 1, for the same R 0 , the exponentially distributed infectious period (κ 2 = 1) gives the smallest r and the fixed infectious period (κ 2 → ∞) gives the largest r. Conversely, if r is empirically observed and if the mean infectious period µ is given (along with the given latent period distribution), the larger the value κ 2 (hence the smaller the variance), the smaller the R 0 . This can be also seen by taking the derivative of (10.57b) with respect to κ 2 : Example 7. Let us consider an SIR model without the latent period. Let us assume that the infectious contacts arise from a Poisson process and that the infectious period has mean µ = 4.098. In this example, let R 0 = βµ = 1.386. The infectious period T I has the distribution f I (x) = κ κ x κ−1 µ κ Γ (κ) e − κx µ , µ > 0, κ = 1, 2, 3, · · · , a special case of the gamma distribution (the Erlang distribution) with integer-valued κ. It can be viewed as the distribution of the sum of κ i.i.d. exponential random variables with mean µ κ . A "linear chain trick" is to consider a compartment model as given by Fig. 10 .13 and to use deterministic ordinary differential equations to numerically calculate . Figure 10 .14 shows results with κ = 1, 2, 3, 4. When κ = 1, One can replace the gamma distribution for the latent or the infectious period by other distributions, provided that their Laplace transforms exist. For example, one may consider an inverse-Gaussian distribution for a nonnegative random variable X with p.d.f. , µ,κ > 0. Similar to Table 10 .5, one can make a table of all the special cases; see Table 10 .6. One does not need to assume that the latent and infectious periods arise from the same distribution family. For example, one may take an inverse-Gaussian distributed latent period to combine with a gamma distributed infectious period and derive Table 10 . Table 10 . The latent period does not play any role. Knowledge of the distribution for N is sufficient. One uses the p.g.f. G N (s) to calculate the probability 1−π for the risk of a large outbreak. If the outbreak is small, G N (s) further determines the distribution of the generation time to extinction Pr{T g ≤ g} and final size at extinction Pr{Z = z}. If the distribution of N can be derived from observed data, then details with respect to underlying stochastic mechanisms that give rise to G N (s), such as the property of the infectious contact process {K(x)} and infectious period distribution F I (x), are irrelevant. On the other hand, with knowledge of the property of the infectious contact process {K(x)} and the infectious period distribution F I (x), one can use the CMJ branching process to derive the p.g.f. G N (s) for N. The initial phase of a large outbreak is characterized by the approximation E[I(t)] ∝ e rt . Under the condition that {K(x)} has stationary increment property, r is determined by (10.55) . It shows that β, the distribution of the latent period and the distribution of the infectious period separately shape the shape of E[I(t)] during the initial phase. However, the distribution for N is irrelevant. For a large outbreak, the expected final outbreak size number scales linearly with the size of the susceptible population. In other words, n → η where η is a positive quantity, 0 < η < 1. Therefore, one considers the continuous random variable Y = Z n , distributed over the range (0, 1), as the final size. Theorem 1. When R 0 > 1, conditioning on the outcome being a large outbreak and assuming that lim n→∞ I0 n → ε, let η be the root of the equation (10.60) Then S∞−nη √ n has Gaussian limit distribution N (0, σ 2 ) and the asymptotic variance is given by This central limit theorem has been developed from different mathematical approaches by von Bahr and Martin-Löf [12] , Ludwig [41] , Scalia-Tomba [42] , Martin-Löf [43] and Lefèvre and Picard [44] . The final size proportion Y = Z n converges in distribution to a point mass at η, the root of (10.60). The fluctuations around the limit are Gaussian of order 1 √ n , which become large if the variance var[N ] is large. Contrary to the study of the intrinsic growth rate r at the initial phase, the final size is more robust with respect to the stochastic mechanisms at the micro level. Even the distribution of N does not play a significant role, except for its first two moments: R 0 = E[N ] and var [N ] . In von Bahr and Martin-Löf [12] , Theorem 1 was first proven under a Reed-Frost epidemic model [45] . For such a model, it is assumed that an individual i is infected at time t. A given individual j is contacted by i and if j is susceptible then j becomes infected at time t + 1. Meanwhile, i becomes removed (by immunity or death) and plays no further part in the epidemic process. All contacts are assumed to be independent of each other. At each increment of time from t to t + 1, an individual produces a random number of infectious contacts among the available susceptible individuals following a binomial distribution. When n is large, the binomial distribution is approximated by a Poisson distribution. von Bahr and Martin-Löf [12] then extended their proof to the randomized epidemic model where each infectious individual i has its own transmission probability per contact p i ; the p i are i.i.d. random variables with a given distribution. Earlier in this chapter we made the statement that as far as the distribution for N is concerned, the generalized epidemic and the randomized epidemic models can be unified as one mixed Poisson model with −1) . This corresponds to Ludwig [41] , where it was noted that as far as the distribution of Y = Z n is concerned, the generalized epidemic model with an arbitrary but specified infectious period distribution can be reformulated as a special case of the randomized epidemic model. Subject to the approximation S0 n ≈ 1 and if one re-writes x = 1− S∞ S0 , ε = I0 S0 , the asymptotic mean of a large outbreak (10.60) is in close agreement with that given by Kermack and Mckendrick [14] , of which, given (R 0 , S 0 , I 0 ) , one solves for S ∞ in the equation From a deterministic perspective, Ma and Earn [46] used integro-differential equations to show that the final size calculated by (10.62) is invariant, including the existence of a latent period, arbitrarily distributed infectious period, any number of distinct infectious stages and/or a stage during which infectious individuals are isolated, as well as the existence of super-spreading events. From a probabilistic perspective, let us consider a typical susceptible individual ν s . We define a type-specific hazard function where t j is the time of infection of the individual j, representing the instantaneous risk of infection being transmitted by the jth infectious individual. As illustrated by Example 4, the hazard of a susceptible individual ν s to become infected at time t, denoted by h (s) (t), can be thought as a competing risk problem. Under the independence assumption, Then the probability that a susceptible individual ν s ever gets infected over the course of the epidemic is Under these assumptions, the type-specific hazard is h which is (10.60) and R 0 = β * S 0 µ = βµ. Ludwig [41] used this analogy in a discrete time setting, and provided a rigorous proof that the final size distribution does not depend on the duration of latency in the individuals, or on the time course of infectivity, but only on the "time-integrated infectivity" in discrete time. The "time-integrated infectivity" is analogous to To give a counterexample, if N follows a geometric distribution, with R 0 > 1, the risk of a large outbreak is calculated with π = The statement holds if the outbreak can be described by a Poisson epidemic. In this case, the probability of a large outbreak is 1 − π = e −R0π as derived from (10.42) and it is identical to the final size equation (10.60) as ε → 0. Recall that a Poisson epidemic arises in the following manner. 1. As n → ∞, the social contact network grows in such a way that at any time t, the realization can be regarded as a Bernoulli random graph [47] with Poisson degree distribution with mean λt. If D is the duration of an outbreak, the random graph as observed at D is a large Bernoulli random graph approximated by a Poisson degree distribution with mean R * 0 = λD. 2. The infectious contact subgraph also grows according to a Poisson degree distribution with mean λpt = βt. Every infective has a fixed infectious period µ. By the end of the outbreak, the observed infectious contacts makes a subgraph which is a large Bernoulli random graph with i.i.d. Poisson degree distributions with mean R 0 = βµ. Using Theorem 2.3.2. of Durrett [33] (which can be proven using random walk theory), if R * 0 > 1, with probability one, there is only one giant component with size ∼ η * n where η * is root x of the equation 1 − x = exp (−R * 0 x) in (0, 1). All other components are small in the sense that there is a constant ω such that the largest of the remaining components can not have more than ω log n vertices. The infectious contact subgraph is proportional to the social contact graph with R 0 ∝ R * 0 by a factor pµ D . There is a difference in concept between a random graph and a realization of a random graph. The random graph belongs to a probability space. A realization of a large outbreak takes place only within the giant component of the social contact graph. Since the relative sizes of all other component in the social contact graph → 0 as n → ∞, the size of a realized large outbreak η is thus proportional to η * by the same factor pµ D . Therefore we get the final size equation 1 − η = exp (−R 0 η) . There are two independence properties involved. vertex v i to a susceptible vertex v j is independent from infections to other susceptible vertices derived from v i . A heuristic argument is that, if one randomly chooses vertex v i to be the initial infective, the probability that it will lead to a large outbreak is the same probability that it belongs to the giant component. And hence, "the risk of a large outbreak = its final size". However, if there is a random period of time T , the evolution of the infectious contact subgraph will be very different from the evolution of the subgraph where every vertex has a fixed infectious period, even though the social contact network is the same Bernoulli random graph. If one chooses a vertex v i to be the initial infective, then each of the other susceptible vertices in the graph has a probability . N is associated with a larger variance, with many infectious vertices having few edges due to a shorter infectious period and some infectious vertices having a large number of vertices due to a longer infectious period. Because of the loss of independency and the large variance, the equality between the risk of a large outbreak and the final size of the large outbreak is lost in generalized epidemic models. Under the independence assumption that transmission of infection to individual ν s by individual j is independent from potential transmissions from other infectious individuals, then the hazard function for individual ν s to be infected at time t is h (s) (t) = j h (s) j (t − t j ). There are occasions that this independent competing risk can be violated. Such an occasion can arise in a highly clustered social contact network where a number of infectious individuals are correlated within a social cluster. Another occasion is that susceptible individuals are removed by quarantine if they have been identified as exposed to known infectives. In these circumstances, final size equations (10.60) and (10.62) may be incorrect. If the infectious contact process Let us consider a process without the stationary increment property, with = βe βφx as a function of x. It can be constructed in the following manner. Consider a linear pure birth process, defined by the conditional probability with the Markov property so that Pr{K(x + h) − K(x) = 1|H x }, k = 0, 1, 2, · · · is equal to (10.64) Conditioning on K(x) = k, the instantaneous rate of producing the next infectious contact during [t, t + h) can be considered as an independent competing risk hazard: either from a global environment, with constant rate β 2 , or from a clustered environment with non-constant rate β 1 k and β 1 = β 2 . The hazard of producing an infectious contact at time x is lim h→0 Pr{K(x + h) = k + 1|K(x) = k} h = β 1 k + β 2 = β (kφ + 1) . When φ = β1 β1 → 0, the linear pure birth process reduces to a Poisson process with β = β 2 Pr{K(x + h) = k + 1|K(x) = k} = βh + o(h), k = 0, 1, 2, · · · . (10.65) When φ = 1, the linear pure birth process is the Yule process with Pr{K(x + h) = k + 1|K(x) = k} = β(k + 1)h + o(h), k = 0, 1, 2, · · · . (10.66) According to (10.66), given that an infectious individual has produced k infectious contacts by time x, the hazard for producing the (k + 1) st contact is a increasing function of k. Starting at x = 0, corresponding to the beginning of the infectious period for an infectious individual, the waiting time to producing the first infectious contact is exponentially distributed with mean E[X 1 ] = 1 β1 ; conditioning on the first infectious contact, the waiting time to the second infectious contact is exponentially distributed with mean E[X 2 ] = 1 2β1 ; · · · ; and conditioning on an infectious individual who has produced k infectious contacts, the waiting time to producing the (k + 1) st infectious contact is exponentially distributed with mean E[X k+1 ] = 1 (k+1)β1 . On the surface, it looks as if the more infectious contacts it produces, the more likely it produces more infectious contacts. This infectious contact process may arise from the following hypothesis. A typical infectious individual resides in a highly clustered environment (household, hospital ward, etc.) but meanwhile has contacts with susceptible individuals from outside this environment. The contact network structures in the clustered environment is different from that in the "global" environment. During its infectious period, this individual may infect: 1. Susceptibles from the highly clustered environment such as household members (if such environment is a household) or nurses and other patients (if such environment is a clustered hospital ward), such that this network manifest data as if arising from a preferential attachment model 2. Susceptibles from a "global" environment that can be approximated by a large Bernoulli random graph For outbreaks that mainly spread within and among a highly clustered environment, infected individuals directly and indirectly attributed to the index case during the time x ≤ T I (infectious period) are often recorded as exposed and counted as the next generation of the index case. This artificially changes the order of events and creates artificial "generations" of infectives. However, real-life epidemiologic data may arise in this "artificial" order. Let us modify the definition of {K(x)} so that K(x) is not only the cumulative number of infectious contacts produced directly by an infectious individual, but also includes those infected cases attributed to itself in later generations, counted up to time x when the original index case is still infectious. By time . . . where (10.67) implies that given the index case has produced one infectious case, and at time x both cases are infectious, the instantaneous rate of a new infectious contact attributed to the index case is due to the contribution of the index case itself and of the other infectious individual. {K(x)} is a Yule process which could be manifested by real epidemiologic data, if the index case resides in a highly clustered and connected environment and simultaneously gives exposure (not necessarily transmission) to a large number of people. It is the environment that manifests a large number of infected individuals (i.e. super-spreading events) and data "directly link" them to the first case in the environment. In random graph theory, there is a concept of "preferential attachment" [20] of the random graphs over the space of vertices. The linear pure birth process is an analogue to such a concept over time. There is an ongoing debate whether preferential attachment actually happen in growing networks. Liljeros et al. [21] are convinced of preferential attachment as a mechanism for sexual networks, as "people become more attractive the more partners they get." However, Jones and Handcock [48] are skeptical and argue that networks with infinitely large variances but dramatically different structures can manifest the same marginal degree distribution, whereas these different network structures produce different epidemic behaviour. This debate has a longer history. In 1919, Greenwood and Woods [49] put forward three hypotheses into the occurrence of accidents: 1. Pure chance, which gives rise to the Poisson process (10.65) 2. True contagion, i.e. initially all individuals have the same probability of incurring an accident, but this probability is modified by each accident sustained to give rise to the linear pure birth process 3. Apparent contagion (proness), i.e. individuals have constant but unequal probabilities of having an accident and the resulting process being a mixed Poisson process (10.18) Xekalaki [50] gives a comprehensive survey of the history of these hypotheses and related models. The arguments on occurrence of accidents can be equally applied to disease transmission: whether observed data can distinguish if the underlying stochastic mechanism is arising from a linear pure birth process or a mixed Poisson process with large variance. The two very different stochastic mechanisms produce quantitatively very different epidemic behaviour. Let us start with an infectious contact process {K(x)} with a marginal distribution having the negative binomial form which a parameter 0 < ς < 1. If {K(x)} arises as a linear pure birth process given by (10.64), from page 274 of Bhattachrya and Waymire [51] , the marginal distribution for K(x) is where K(x) has p.g.f. G K (s, x) = ∞ 0 e ξx(s−1) dU (ξ) and ξ follows a gamma distribution. In this case, ς = 1 1+βφx . The link functions ς = 1 1+βφx and ς = e −βφx are two hypotheses for whether {K(x)} has stationary increment or not. However, they are often not identifiable through data. An infectious contact process {K(x)} with marginal distribution (10.68) can be further stopped by a randomly distributed infectious period to generate distributions for N. If all infectious individuals have constant infectious period µ, then N also has a negative binomial distribution: If data are available to suggest a distribution for N which is negative binomial, there may be two further assumptions, either ς = 1 1+βφµ or ς = e −βφµ , associated with two completely different stochastic mechanisms for how {K(x)} arises. In addition, we have also seen a third way to get the negative distribution for N. It corresponds to G(s, x) = e −βx(1−s) with gamma distributed infectious period and E[T I ] = µ. Therefore, there are three stochastic mechanisms that give rise to the identical distribution for N. Observed data can not identify these underlying models. If T I follows an exponential distribution with p.d.f. F I (x) = 1 µ e − x µ and if the infectious contacts follow a linear pure birth process with Pr{K(x) = k} given by (10.69), the resulting distribution for N is Since ς = e −βφx , (10.71) becomes , θ > 0, α = 1 φ > 0. (10.72) where θ = 1 βφµ > 0. It can be regarded as a mixing distribution in We have seen it before as (10.36) , with the power-law property: ∝ 1 j θ+1 . Notice that the same Waring distribution can also arise if {K(x)} has marginal distribution given by (10.70) combined with a Pareto infections period distribution. Therefore, there are multiple stochastic mechanisms that generate the distribution for N with power-law tail behaviour. If {K(x)} arises from a non-stationary increment process described by the linear pure birth process (10.64), the marginal distribution for K(x) is given by a negative-binomial form (10.68) with the link ς = e −φβx . Any further assumption on randomness in φβx, such as a random infectious period, will result in a mixed negative-binomial distribution with some mixing distribution U (ς). It has been shown by Karlis and Xekalaki [52] that Pr{N = j} ∝ j −θ j Compartmental models in epidemiology An introduction to stochastic epidemic models An introduction to networks in epidemic modeling Some problems in the theory of infectious diseases transmission and control Mathematical epidemiology of infectious diseases: model building, analysis and interpretation An inquiry into the nature of frequency distributions representative of multiple happenings with particular reference to occurence of multiple attacks of diseases or of repeated accidents Stochastic models for epidemics The Mathematical Theory of Infectious Diseases and its Applications. Second Edition. The Griffin & Company Ltd Stochastic Population Models in Ecology and Epidemiology Stochastic Processes in Epidemiology, HIV/AIDS, Other Infectious Diseases and Computers The effect of random vaccine response on the vaccination coverage required to prevent epidemics Threshold limit theorems for some epidemic processes Poisson approximation for the final state of a generalized epidemic process Contributions to the mathematical theory of epidemics, part I Mixed Poisson distributions. International Statistical Review Mixtures, generalized convexity and balayages Univariate Discrete Distributions Spread of epidemic disease on networks Network theory and SARS: predicting outbreak diverity Emergence of scaling in random networks Sexual networks: implications for the transmission of sexually transmitted infections Inference based on empirical probability generating function for mixtures of Poisson distributions A unified derivation of some well-known frequency distributions of interest in biometry and statistics The generalized Waring distribution applied to accident theory An Essay on the Principle of Population as It Affects the Future Improvement of Society, with Remarks on the Speculations of Mr. Godwin, M. Condorcet and Other Writers Analysin Infinitorum. Tomus primus. MarcumMichaelem Bousquet and Socios Deterministic and stochastic epidemics in closed populations The threshold concept in stochastic and endemic models The epidemic in a closed population with all susceptibles equally vulnerable; some results from large susceptible populations with small initial infections The final size of a nearly critical epidemic, and the first passage time of a Wienner process to a parabolic barrier An Introduction to Probability Theory and Its Applications A First Course in Stochastic Processes Random Graph Dynamics Separate roles of the latent and infectious periods in shaping the relation between the basic reproduction number and the intrinsic growth rate of infectious disease outbreaks On the spread of a disease with gamma distributed latent and infectious periods Appropriate models from the management of infectious diseases Infectious Diseases of Humans, Dynamics and Control Transmission Dynamics and Control of Severe Acute Respiratory Syndrome Comparative estimation of the reproduction number for pandemic influenza from daily case notification data Final size distributions for epidemics Asymptotic final size distribution for some chain binomial processes Symmetric sampling procedures, general epidemic processes and their threshold limit theorems Collective epidemic processes: a general modelling approach to the final outcome of SIR infectious diseases An examination of the Reed Frost theory of epidemics Generality of the final size formula for an epidemic of a newly invading infectious disease On the evolution of random graphs An assessment of preferential attachment as a mechanism for human sexual network information On the incidence of industrial accidents upon individuals with special reference to multiple accidents The univariate generalized Waring distribution in relation to accident theory: proness, spells or contagion? Stochastic Processes with Applications The polygonal distribution. International Conference on Mathematical and Statistical Modelling in Honour of Enrique Castillo