key: cord-0330862-3dayhwck authors: Vazquez, A. title: Exact solution for the infection dynamics with a gamma distribution of serial intervals date: 2020-10-20 journal: nan DOI: 10.1101/2020.10.17.20214262 sha: be77d13082e4256f46730be859882fe3f4baf816 doc_id: 330862 cord_uid: 3dayhwck The basic reproductive number, the expected number of secondary cases generated by an infected individual, is used to forecast infectious disease outbreaks. Yet, the basic reproductive number does not contain information about the incubation period, which influences the serial interval between infection and transmission. Using gamma distributions of serial intervals, I derive a formula to calculate the population reproductive number as a function of the basic reproductive number and the shape parameter of the serial interval distribution. This formula can be used to estimate the basic reproductive number from the serial interval distribution and the disease doubling time. The basic reproductive number, the expected number of secondary cases generated by an infected individual, is used to forecast infectious disease outbreaks. Yet, the basic reproductive number does not contain information about the incubation period, which influences the serial interval between infection and transmission. Using gamma distributions of serial intervals, I derive a formula to calculate the population reproductive number as a function of the basic reproductive number and the shape parameter of the serial interval distribution. This formula can be used to estimate the basic reproductive number from the serial interval distribution and the disease doubling time. Infectious diseases can spread to a significant fraction of the population causing an epidemic. The chance for that to happen is determined by the infectious dynamics within each individual and the disease transmission between individuals. The within individual dynamics determines the serial interval, the time interval between a primary case being infected to the disease transmission to a secondary case. The transmission between individuals is reflected in the reproductive number, the average number of secondary infectious caused by a primary case. The networks underlying the transmission of infectious diseases have connectivity distributions with heavy tails and exhibit the small-world property, as shown for sexual [1] and proximity [2] contact networks. In these networks the average reproductive number is proportional to the ratio between the second and first moments of the connectivity distribution [3] [4] [5] [6] , which can be very large. This evidence indicates that the topology of modern contact networks enables the occurrence of epidemic outbreaks. Less attention has been given to the interplay between the contact network topology and the distribution of serial intervals. Before entering the main calculations, let's have a look at standard models of infectious disease dynamics. In the susceptible, infected and removed (SIR) model, susceptible individuals get infected at rate β and infected individuals get removed at a rate γ. The basic reproductive number, the average number of new infections generated by a single infected individual, is given by The dynamics of the infected individuals at the population level is given byİ where S and I are the number of susceptible and infected individuals and N is the population size. For short times * avazque1@protonmail.com we can assume that S ≈ N and I N . In this limit we can integrate equation (7) obtaining where is the population reproductive number of the SIR model. According to the SIR model, the basic and population reproductive numbers are identical. In the SEIR model susceptible individuals get infected at rate β without becoming infectious (exposed), exposed individuals become infectious at a rate α and infectious individuals get removed at a rate γ. The infected compartment is now split into exposed, with number E, and infectious, with number I. Note that once an individual become infectious it behaves exactly as an infected individual in the SIR model. Therefore we obtain the same basic reproductive number The dynamics of the exposed and infectious individuals is given byĖ For short times we can assume that S ≈ N and I N . In this limit we can integrate equation (7) using an eigenvalue approach. Focusing on the largest eigenvalue, we obtain where . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.17.20214262 doi: medRxiv preprint is the population reproductive number of the SEIR model. The introduction of the exposed compartment carries as a consequence a discrepancy between the local and population reproductive numbers. The SIR and SEIR models neglect key aspects of real populations. First, there is a variable contact rate across the population, which translates to a variable rate of disease transmission across infectious individuals. Second, while the SEIR model accounts for the incubation period of infectious diseases, it is still too restrictive. Using well stablished mathematics from the theory of age-dependent branching processes, I have previously calculated the expected number of infected individuals of infectious disease outbreaks on heterogeneous populations and any given time interval distribution [7, 8] . However, the applications were limited to an exponential distribution of serial intervals. Here I use this formalism to calculate the population reproductive number for a broader class of serial interval distributions. The branching process approach maps the contact process mediating the disease transmission into a causal tree of disease transmission. The average reproductive number of patient zero, i.e. the expected number of secondary cases causally linked (generated) by patient zero, is given by where β is the average contact rate in the population and γ is the rate of recovery from the infection. For infected cases other than patient zero one needs to take into account that the disease spreading biases the disease transmission to individuals with a higher contact rate. The patient zero can be thought as an individual selected at random from the population. Any other infected individual will not be selected at random, but it will be found with a probability proportional to its contact rate: β/N β , where N is the population size. Once infected, the individual found by contact will engage in new contacts at a rate β. Therefore, the average reproductive number of patients other than patient zero is R 0 gives the average number of infectious at the first generation, those generated by patient zero. R 0 R gives the average number of infections at the second generation and R 0 R d−1 gives the average number of infections at the d generation. The actual time when an infected case at generation d becomes infected equals the sum of d serial intervals and it has a probability density function g d (t), where the symbol denotes convolution. Therefore, the average number of new infected individuals at time t is given by where N 0 is the number of patients zero and D is the maximum generation, when the disease transmission ends. When D is small, due to lockdown for example, the spreading dynamics follows a polynomial or powerlaw growth [7, 9] . Here, I focus on the case D → ∞, or more precisely t D dtg(t)t. In this limit Now we will focus on the shape of the serial interval distribution. In the SIR model an infected individual is infectious right from the time it becomes infected, until the infected individual is removed. Let t R be a specific realization of the removal time. Since removal takes place at a rate γ, t R has the exponential probability density function γe −γt . Assuming that each individual has a time-independent contact rate, the serial intervals will be uniformly distributed between 0 and the recovery time, resulting in the same exponential distribution of serial intervals g SIR (t) = γe −γt (14) In the SEIR model the serial interval can be decomposed into the sum t S = t I + t R , where t I is the time interval from exposed to infectious and t R is the removal time. Since the transition from exposed to infectious takes place at a constant rate α, then t I has the exponential probability density function αe −αt . Therefore There are two key differences between the serial interval distributions for the SIR and SEIR model. First, the mode for g SIR (t) is zero and that for g SEIR (t) is non-zero. Second, around t = 0, g SIR (t) ∼ γ, while g SEIR (t) ∼ . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.17.20214262 doi: medRxiv preprint αγt. Both differences are a consequence of introducing an incubation state between being infected and becoming infectious. The case α = γ in (15) makes difficult the calculation of convolutions. A more suitable functional form is the gamma distribution where s ≥ 1 is a shape parameter quantifying the convexity around t = 0 (Fig. 1) . The case g(t, 1) corresponds with the exponential distribution of the SIR model (14) . For s = 2 we obtain the case α = γ of the SEIR model (15) . More generally, the values of s and γ could be derived from the fitting to empirical data. For example, for COVID-19, the inspection of inferred serial interval distributions suggest γ ≈ 2 [10, 11] and a fitting to a gamma distribution resulted in γ = 2.5 [12] . Besides covering many possible scenarios, the gamma distribution is amenable to convolution. Using the Laplace transform method one obtains The convolution of a gamma distribution is itself a gamma distribution, with the exponent scaled by the order of the convolution (d → ds). Substituting (17) into (13) we obtain Equation (18) provides a series representation for the expected number of new infections. In the following we analyze different cases depending on the value of s. For s = 1 equation (19) is the series expansion of the exponential For s = 2 the series expansion of the hyperbolic sine For s = 4 it decomposes into the series for sinh(x) and sin(x) Using this analytical representations we uncover the impact of the time serial distribution on the infection dynamics (Fig. 2) . The case s = 1 is not representative for x 1. Indeed, f (0, 1) = 1 while f (0, s) = 0 for all s > 1. We also observe that with increasing s the plot of f (x, s) shifts to the right. More generally, when s is a natural number is a subseries of the exponential function. This series has been calculated [13, 14] , resulting in where ω = e 2πi/s . Since (ω n ) < 1 for all n = 1, . . . , s−1, we obtain the asymptotic behaviour for x 1 Based on this asymptotic behaviour, for R 1/s γt 1 the expected number of infected individuals, equation (18), can be approximated by where is the population reproductive number. The equation for the population reproductive number (27) can be used to estimate the basic reproductive number from empirical data for the serial interval distribution and the doubling time. According to equation (26), the disease doubling time is given by . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted October 20, 2020. . https://doi.org/10.1101/2020.10.17.20214262 doi: medRxiv preprint (27) and (28) we can estimate P and R. Equation (27) is definitive proof that the shape of the serial interval distribution determines the relationship between the basic (R) and population (P ) reproductive numbers. For s = 1 we recover the SIR model equation (4), when the individual and population reproductive numbers coincide. For s = 2 we recover the case α = β of the SEIR model in equation (9) . Figure 3 shows the population reproductive number as a function of s for two different local reproductive numbers. When R > 1, P decreases monotonically with increasing s. When R < 1, P increases monotonically with increasing s. In either case, P approaches 1 for large s. Note, however, that the shape parameter s does not change the fact that if R > 1 then I(t) grows exponentially, while I(t) decays exponentially when R < 1. The analysis of a gamma distribution of serial intervals can be extended to calculate the population reproductive number when there are heterogenous mixing patterns between individuals according to certain types [15, 16] . The outcome is similar to equation (27), after replacing R by the largest eigenvalue ρ of the missing matrix of reproductive numbers (equation 12 in Ref [16] ). After making this substitution, we obtain the population reproductive number for the multi-type generalization where Λ i is the largest eigenvalue of the mixing matrix of the ith population stratification (e.g., age, mask use, etc) [16] . In conclusion, the branching process formalism allows for a flexible description of infectious disease outbreaks that can be fully based on empirical distributions. The outcome is two equations: (i) The population reproductive number as a function of the basic reproductive number and the shape parameter of gamma distribution of serial intervals and (ii) the disease doubling time as a function of the population reproductive number and the scale parameter of the gamma distribution of serial intervals. These equations can be used to make estimate the basic reproductive number from empirical data. Causal tree of disease transmission and the spreading of infectious diseases This work was supported by Cancer Research UK C596/A21140.