key: cord-0767398-x30emox6 authors: Vazquez, A. title: Transition to multi-type mixing in d-dimensional spreading dynamics date: 2020-11-28 journal: nan DOI: 10.1101/2020.11.24.20238337 sha: debed1b5af4038b393d8e79859da51ddedf4ae37 doc_id: 767398 cord_uid: x30emox6 The spreading dynamics of infectious diseases is determined by the interplay between geography and population mixing. There is homogeneous mixing at the local level and human mobility between distant populations. Here I model spatial locations as a type and the population mixing by intra- and inter-type mixing patterns. Using the theory of multi-type branching process, I calculate the expected number of new infections as a function of time. In 1-dimension the analysis is reduced to the eigenvalue problem of a tridiagonal Teoplitz matrix. In d-dimensions I take advantage of the graph cartesian product to construct the eigenvalues and eigenvectors from the eigenvalue problem in 1-dimension. Using numerical simulations I uncover a transition from linear to multi-type mixing exponential growth with increasing the population size. Given that most countries are characterized by a network of cities with more than 100,000 habitants, I conclude that the multi-type mixing approximation should be the prevailing scenario. Homogeneous mixing is a cornerstone in the mathematical analysis of infectious disease outbreaks [1, 2] . In a fully mixed population we can define the basic reproductive number R 0 = β/γ, where β is the average rate of disease transmission from infected to susceptible individuals and γ is the rate of recovery from the disease. When R 0 < 1 the infectious disease outbreak dies out, but it grows exponentially in time when R 0 > 1. Human populations are not fully mixed, calling into question the value of R 0 . There are different mixing patterns according to age, immunization status and the adherence to government recommendations. Nevertheless, these heterogeneous mixing patterns can be treated under a generalized mixing approximation. If we stratified individuals into multiple types, then we can model the infectious disease spread as a multi-type spreading process [3] [4] [5] . In multi-type spreading processes R 0 is replaced by the largest eigenvalue ρ of the mixing matrix. When ρ < 1 the infectious disease outbreak dies out, but it grows exponentially in time when ρ > 1. Geographical heterogeneity seems to challenge the mixing hypothesis [6] [7] [8] . When simulated agents are constrained to a ring with nearest neighbours transmissions, the total number of infected cases grows linearly in time [6] . Sub-exponential infectious dynamics has been reported for agent based simulations of in urban settings as well [7] . There are two caveats though. First, the observation of non-exponential dynamics is not sufficient evidence to rule out the mixing hypothesis. I have shown that a combination of a high reproductive number with a truncation of the disease transmission chain yields a power law growth in the number of new infections [9] . The pre-lockdown phase of the COVID-19 outbreak is consistent with this power low growth in the number of new infections [10] [11] [12] . Second, there is no prove that * avazque1@protonmail.com geography cannot be harness under some mixing approximation. Here I demonstrate that geographical heterogeneity can be modelled as a type. Focusing on large cities, countries like Cuba look like a string of cities, with an effective 1-dimensional topology (Fig. 1a) . Other countries, like Germany, are better represented by a two-dimensional mesh, with an effective 2-dimensional topology (Fig. 1b) . At this level of description, we can model the network of cities as a multi-type network, where each city is represented by a type and the mixing pattern between cities accounts for the human mobility between them. In the following I introduce the multi-type mixing approximation for type networks with a d-dimensional topology, derive analytical results and test them with numerical simulations. I investigate the susceptible, infected and removed (SIR) model on the multi-type network of cities. In the SIR model each individual can be in one of three states: susceptible, infected and removed. Infected individuals transmit the disease to susceptible individuals, which in turn can become infected. Eventually, the infected individuals recover or die from the disease, at which point they are removed from the disease transmission chain, the removed state. A. One-type branching process When all individuals are of one-type we have a fully mixed population. We need to make a distinction though, between patient zero and any other case. In more detail, each individual contact other individuals at some rate ζ. Each contact results in disease transmission with probability p and the effective disease transmission rate is denoted by β = ζp. Finally, the infected individuals are removed, due to recovery, isolation or death, at a rate γ. Since patient zero is selected at random its typical disease transmission rate is β , resulting in the reproductive number R 0 is called the basic reproductive number. For infected cases other than patient zero we take into account the disease transmission bias towards individuals with a higher contact rate. Any case other than patient zero is found with a probability proportional to its contact rate: β/N β , where N is the population size. Once infected, the individual found by contact will engage in new contacts at a rate β. Therefore, the average reproductive number for patients other than patient zero is R 0 gives the average number of infectious at the first generation, those generated by patient zero. R 0 R gives the average number of infections at the second generation and R 0 R k−1 gives the average number of infections at the k generation. The actual time when an infected case at generation k becomes infected equals the sum of d generation times and it has a probability density function g k (t), where the symbol denotes convolution (g k = t 0 g (k−1) (τ )g(t − τ )dτ ). Therefore, the average number of new infected individuals at time t is given bẏ where I 0 is the of number of patient zeros. For the standard SIR model the distribution of generating times is the distribution of recovery times. Given that recovery takes place at a constant rate, the distribution of generation times is exponential In this case, equation (3) becomeṡ Noting that the series in (5) is the Taylor series expansion of the exponential, we obtaiṅ When R < 1 the disease dies out, but it grows exponentially when R > 1. R replaces the basic reproductive number R 0 in the context of contact heterogeneity ( β 2 > β 2 ). The multi-type formalism replaces the average reproductive number, a scalar, by a matrix of reproductive numbers. Let n be the number of types, representing cities or subpopulations. To patient zero we assign the reproductive number matrix R 0n,n . The matrix element R 0a,b represents the average number of cases of type b generated by a patient zero of type a. To any other case we assign the reproductive number matrix R n,n . The matrix element R a,b represents the average number of cases of type b generated by a case of type a that is not a patient zero. I have previously calculated the expected number of infected individuals of infectious disease outbreaks on heterogeneous populations with a multi-type structure [4] . Although the calculation is more involved it yields an expression with the same structure as the one-type casė where I 0 is a column vector representing the number of patient zeros by type and J n,1 is a column vector of ones. Or, making use of the exponential of a matrix, where I is the identity matrix. When R is diagonalizable we obtain results that resemble the exponential dynamics of the single-type case. Let P be the transformation matrix diagonalizing R, where λ 1 , . . . , λ n are the eigenvalues of R. Note that some λ n may be equal if some eigenvalues have multiplicity larger than 1. I assume that R 0 has the same form of R . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 28, 2020. ; https://doi.org/10.1101/2020.11.24.20238337 doi: medRxiv preprint where the factor is the ratio between equations (1) and (2). Finally, I will use the standard notation for functions of diagonal matrices Using (9) and (10) we can rewrite (8) aṡ Whether the number of new cases grows or decays is determined by the largest eigenvalue When ρ < 1 the disease dies out, but it grows exponentially when ρ > 1. ρ replaces the basic reproductive number in the context of multi-type mixing. The predictive value of ρ can be extended beyond diagonalizable R. In most realistic scenarios there is a bidirectional path between every pair of cities in the network. In the language of graph theory, the network of cities is strongly connected. In such a case R is irreducible in addition to be non-negative. Using the Perron-Frobenious theorem for irreducible matrices [13] , one can demonstrate that when ρ < 1 the disease dies out, but it grows exponentially when ρ > 1 [3, 4] . The diagonalization procedure can be extended to any generation interval distribution. Re-starting from equation (3), without specifying the form of g(t), we obtaiṅ where For the SIR model we have γe (x−1)γt and we recover equation (13) . I have calculated f (x, t) in terms of analytical functions for the gamma distributions of generation intervals [14] . Here I restrict the analysis to the SIR model. The extensions to other generation interval distributions is obtained after plugging in the specific function f (x, t). Now I analyze 1-dimensional topologies. Countries like Cuba have a 1-dimensional typology (Fig. 1a) . In this case the reproductive number is given by where R ii , R i−1i and R ii+1 are the intra-and inter-city reproductive numbers. To obtain analytical results, I will work with homogeneous cities and exchanges: R ii = a and R i−1i = R ii+1 = b. In this case R is a symmetric tridiagonal Toeplitz matrix. The eigenvalue problem of a tridiagonal Toeplitz matrix can be solved exactly [15] . The eigenvalues are where k = 1, . . . , n. From the latter equation we obtain the largest eigenvalue The intracity reproductive number a plus 2 times the intercity reproductive number b. The cos π n+1 is boundary correction. The type at the far left has no left neighbour and the type at the far right has no right neighbour. The boundary correction is irrelevant when n 1 (cos π n+1 = 1 + O(n −2 )). The value of ρ 1 can be used to determine whether the infectious disease dies out (ρ 1 < 1) or grows (ρ 1 > 1). . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. To characterize the contribution of all eigenvalues I go back to (13) . The diagonalization matrix for the tridiagonal Toeplitz matrix (18) is given by [15] P 1ij = 2 n + 1 sin ijπ n + 1 Since R 1 (a, b) in (18) is symmetric then P −1 = P T . In this case (13) is simplified tȯ We can work directly with this equation. We can arrive to a more explicit expression as well. Substituting the form of P 1 (21) and expanding the matrix sums we obtaiṅ In 2-dimensions we can take advantage of the cartesian graph product and the 1-dimensional solution to calculate the eigenvalues. First, I introduce the definition of cartesian product of weighted graphs. Cartesian product of graphs. Let G and F be a weighted graph with loops, adjacency-weighted matrices A and B and vertex count n and m, respectively. The cartesian product of G and F , denoted by G F , is the graph with adjacency matrix where I n is the identity matrix of size n 2 , ⊕ denotes the Kronecker sum and ⊗ denotes the Kronecker product. The Kronecker product is defined as [17] A The eigenvalues of the Kronecker sum equal the sum of the eigenvalues of the summed matrices [17] . If α k and β l are the eigenvalues of A and B, with associated eigenvectors u (k) and v (l) and transformation matrices P and Q, then are the eigenvalues of A ⊕ B, the associated eigenvectors and the transformation matrix diagonalizing A ⊕ B. We can decompose a 2-dimensional grid as the cartesian graph product between a 1-dimensional graph G 1 (a, b) with loops and another 1-dimensional graph G 1 (0, b) without loops (Fig. 3) , The adjacency matrix of G 2 (a, b) is Substituting (19) into (29) we obtain the eigenvalues of R 2 (a, b), λ n(k−1)+l = a + 2b cos πk n + 1 + cos πl n + 1 (34) . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted November 28, 2020. ; https://doi.org/10.1101/2020.11.24.20238337 doi: medRxiv preprint where k = 1, . . . , n and l = 1, . . . , n. From the latter equation we obtain the largest eigenvalue ρ 2 = a + 4b cos π n + 1 (35) Noting that R 1 (a, b) and R 2 (0, b) are both diagonalized by P 1 , we substitute P = Q = P 1 in (31) to obtain the transformation matrix in 2-dimensions Substituting (36) into (13) and bearing in mind that This equation is the multi-type mixing approximation for the average number of new infections in a 2-dimensional topology. We can iterate the cartesian graph product to generate d-dimensional grids The average number of new infections in these ddimensional grids is given bẏ with eigenvalues where k i = 1, . . . , n and i = 1, . . . , d, and largest eigenvalue ρ d = a + 2db cos π n + 1 (41) Applications for the 3-dimensional case may include the spreading of prion protein aggregates in the human brain. The basic reproductive number and its multi-type extension are defined in a contex where the number of infected individuals is much smaller than the population size. In a finite population the number of infected individuals may reach a significant percent of the population size. In that case the starting equation (3) is no longer valid. To investigate the validity of the analytical results and the population size effects, I will perform numerical simulations of the SIR model in 2-dimensional grids. The cities are assumed of equal population size H. The number of susceptible individuals at city (i, j) is stored in the variable S ij . A list L is created with elements storing the coordinates of all infected individuals. The size of L is the number of individuals in the infected state, I = |L|. γ = 1 and time is measured in units of 1/γ. Initial conditions: At t = 0 all individuals are in the susceptible state except from one individual at city (1, 1) . L is initiated with the coordinates (1, 1) and S ij = H for all (i, j) except for S 1,1 = H − 1. Dynamics: The Gillespie algorithm is used to update the system. At a total rate µ = (a + 4b + γ)I a new event happens. The time interval ∆t to this new even is extracted from the exponential probability density function µe µ∆t . A coordinate is selected at random from L and stored in (i, j). With probability γ/µ the selected coordinate is removed from L. Otherwise, the disease transmission rule is applied as specified below. The time is updated t → t + ∆t. Disease transmission: With probability a/(a + 4b) we set (i , j ) = (i, j), otherwise we set (i , j ) equal to one of the 4 neighbours (i+1, j), (i+1, j), (i, j −1) and (i−1, j) with equal probability. With probability S i ,j /H a susceptible individual in (i , j ) gets infected, the coordinates (i , j ) are added to L and the updates S i ,j → S i ,j − 1 and I → I + 1 are performed. The empty boundary condition S i,−1 = S n+1,j = S i,n+1 = S −1,j = 0 are used. Statistics: The number of new infections is recorded in time bins of size 1 and the average is calculated over multiple realizations. First I illustrate the transition to the multi-type branching approximation with increasing the cities population size. To this end I use the parameter set r = 1, a = 2 and b = 1/4. For H = 100 there is an evident linear increase ofİ as a function of time (Fig. 4 , circles). Furthermore, the numerical results are quite far from the multi-type mixing approximation (Fig. 4, line) . For H = 1, 000 the numerical solution gets closer to the multi-type mixing approximation but it follows a linear growth (Fig. 4, squares) . Yet, the linear range keeps reducing for H = 10, 000 and it is almost absent for H = 100, 000. As H increases we observe that the region characterized by an exponential growth increases. More important, within the exponential growth regime the numerical solution coincides with the multi-type mixing approximation (Fig. 4, line) . Second I report a lattice size effect. Based on equation (35), for a+4b 1 we can find scenarios where the reproductive number is smaller than 1 for L < L c and larger . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 28, 2020. ; https://doi.org/10.1101/2020.11.24.20238337 doi: medRxiv preprint For example, for a = 1.1/2 and b = 1.1/4 the multi-type calculation predicts a transition from decay for L < 4.13 to growth for L > 4.13. Figure 5 reports the average number of new infections as a function of time for different values of city sizes H = 100, 1,000, 10,000 and 100,000 and grid linear size L = 3, 4, 5 and 6. For H = 10, 000 and 100,000 there is confirmation of the multi-type mixing prediction.İ decays for L = 3, 4 (L < 4.13) whileİ growths for L = 5, 6 (L > 4.13). Here again there are city/subpopulation size effects. We do not observe the expectedİ growth for L = 5 when H = 100 or 10,00. This example teach us that some diseases may not generate outbreaks when the network of cities is small, but as they city network grows they generate outbreaks. For the largest city sizes H = 10, 000 and 100,000, I have plotted the analytical solution calculated from equation (39). At the early times that points from the numerical simulations fall in the multi-type mixing line. For longer times there is a deviation from the analytical results due to to the finite population size. I have demonstrated the use of the multi-type network approach and the cartesian product of graphs to calculate the epidemic threshold of spreading dynamics in ddimensional grids. These calculations can be deployed to estimate the epidemic threshold at the country level aggregating cities/regions of the oder of 100,000 habitants or larger. The largest eigenvalue of the reproductive number matrix will provide a good estimate of the epidemic threshold. The analytical formula for the average number of new infections will provide a good approximation to the initial dynamics of the outbreak. The numerical simulations in 2-dimensional grids uncovered a gradual transition from linear to exponential growth with increasing the cities/subpopulation size. In the exponential growth regime, the multi-type mixing approximation coincides with the numerical results. Given that super-spreaders and lockdown can lead to a power low dynamics [9, 12] , I conclude that geographical heterogeneity is not a necessary not a sufficient condition to observe a power law growth in the average number of new infections. Instead, given that in modern times most countries are characterized by a network of cities with more than 100,000 habitants, I expect the multitype mixing approximation to be the prevailing scenario. Infectious Diseases of Humans: Dynamics and Control Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation The Theory of Matrices Linear Algebra and its Applications 197-198 Table of Integrals, Series, and Products Topics in Matrix Analysis