title: Multivariate Generating Functions for Information Spread on Multi-Type Random Graphs
authors: Oz, Yaron; Rubinstein, Ittai; Safra, Muli
date: 2021-06-22

We study the spread of information on multi-type directed random graphs. In such graphs the vertices are partitioned into distinct types (communities) that have different transmission rates among themselves and with other types. We construct multivariate generating functions and use multi-type branching processes to derive an equation for the size of the large out-components in multi-type random graphs with a general class of degree distributions. We use our methods to analyse the spread of epidemics and verify the results with population-based simulations.

Random graphs [1, 2] provide a framework for studying the spread of information in networks and have been successfully applied to a wide range of dynamical systems such as social networks, epidemic spread and the internet [3-6]. In the simplest setup, one considers a complete graph where any two vertices are connected by a link with a prescribed probability p, independently for each pair of vertices. We will refer to such a link between two vertices as a "transmission", meaning that some information passed between the two vertices, and one often studies the fraction of the graph nodes that information originating from some vertex is expected to reach. In the language of epidemic spread, which we will be using, such nodes are called "infected". Above a certain threshold value p = p_c one has a giant connected component (GCC), or a large out-component of infected nodes in the case of a directed graph. This type of graph (also known as an Erdős-Rényi graph) is often generalized in one of two ways. The first is by adding a more general degree distribution [7-15].
Since the existence of each edge in an Erdős-Rényi graph is i.i.d., the in and out degrees of all vertices are distributed according to a Poisson distribution. However, in many real-world applications the degrees of the vertices often exhibit a higher variance, resulting in different statistical behaviours of the resulting graphs. Newman et al. [7] presented a powerful mathematical tool using generating functions that allows one to analyse the spread of information on several simple graph structures with general degree distributions. This tool works by analysing the generating function of the size of connected components, and is used to calculate the size of the giant connected component (GCC) of the graph. In [16] these tools were generalized to study percolations of several other families of graphs. Recently, the COVID-19 pandemic has spurred a great deal of renewed interest in the study of the spread of diseases. In particular, due to the heterogeneous spread of COVID-19, which is often characterized by super-spreading events and super-spreading individuals, much effort has gone into assessing the effects of heterogeneity in epidemiology. Some of these papers focus on the early termination probability of an epidemic (the probability that the disease will disappear without intervention or herd immunity) when it is characterized by heterogeneous spread [9, 17]. Others focus on the effect of heterogeneity on the end of an epidemic when it is already widespread. In [11-13] the authors asked when we would reach herd immunity in the sense that the effective reproductive index R_eff falls below 1. They use a variant of the SIR model to show that for negative binomial distributions, the percentage of the population infected before it reaches herd immunity can be characterized by a simple formula.
In [15] the authors use a graph-based model to estimate the total number of infected individuals until the epidemic is completely eliminated due to herd immunity. These models predict very different outcomes, due to the fact that at the point in time when R_eff = 1, there can still be many infected individuals, causing an "after-burn" effect. These models were compared and their limitations were explored in detail in [10]. Unlike most of the previously mentioned models, Nielsen et al. [14] assume that the heterogeneity is only in the infectiousness (and not in the susceptibility). In such a case, the heterogeneity is not expected to have a direct effect on the portion of the population reached by the disease, but can be used to model policies aimed at reducing the infectiousness of super-spreading individuals and super-spreading events. In most of these examples, the graphs considered have a "single-type" structure, that is, all the vertices are equivalent. This is characterized by the fact that while the in and out degrees of the vertices are not always the same, once these are selected, any outgoing edge from any vertex can be connected to any other vertex with the same probability (weighted by that vertex's in-degree). In other words, for any vertex v with in-degree i, and for any vertex u with out-degree o, the probability that the edge e = (u, v) is in the graph is equal to Pr[e ∈ E] = c · i · o for some constant c. While being a good approximation in diverse cases, many real-life networks are not complete graphs and are better approximated by a multi-type structure. That is, the vertices are partitioned into different types (communities) that have different interactions among themselves and with other types. For instance, in [18], the authors use data on social interactions to estimate the rate at which members of different age groups are likely to infect one another with respiratory-spread diseases.
Indeed, they found that age cohorts have a significant influence on the network of interactions. For example, they found that most members of each age cohort are more likely to interact with members of their own cohort, and that certain pairs of age cohorts interact more often than others. Additionally, some age cohorts are in general more active than others, and as a result we would expect an epidemic to affect them to differing degrees. Effectively modeling this phenomenon could have a huge impact on our understanding of epidemics. Multi-type graphs are most easily analysed when the in and out degrees follow independent Poisson distributions. This model was proposed by Ball and Clancy [19], and the size of the large out-components was derived there. In [20], this formula is presented with additional background (see Chapter 6). In [21] a simple condition for the existence of a giant connected component was proven. However, these models do not take into account the heterogeneity, which was also shown to have a significant impact on the epidemiological spread. One possible combination of these models was considered by Britton et al. [22]. In this model one attributes to each individual a type 1 ≤ i ≤ r and an activity level α ∈ R, and the probability of an infection between any two individuals u, v of types i, j respectively is Pr[(u, v) ∈ E] = c α_u α_v λ_{i,j}, for an interaction matrix λ and some constant c. For the COVID-19 pandemic, it is commonly assumed that the activity levels are distributed according to a Gamma distribution with a small value of k ≈ 0.1 [9-15], and there is experimental evidence supporting this assumption. For instance, Lloyd-Smith et al. estimated k ≈ 0.16 from experimental data on SARS-1 outbreaks [23], and [24] showed that COVID-19 infections have a higher variance (and therefore a lower k). However, the methods used in [22] are only applicable to discrete distributions with a small support, and as a result they cannot be easily used to model continuous activity level distributions such as the Gamma distribution.
Instead, they consider a heuristic distribution where each type has three activity levels, representing 25%, 50% and 25% of the population, with activity levels of 1/2, 1 and 2 respectively. The aim of this paper is to adapt the tools developed in [7] to the multi-type setting considered in [21], by introducing a tool to model and calculate the size of the large out-components of a more general class of multi-type graphs, which includes Poisson graphs with an arbitrary activity level distribution. This will allow us to use the activity-level multi-type model introduced in [22], with activity levels distributed according to a Gamma distribution. Our analysis will follow the same general direction as the one in [7], adapting many of its intermediate results to the multi-type setting as well. In our work we focus on directed graphs, as they are more relevant for most of our use cases. For such graphs one needs to refine the definition of the GCC. One possible approach, as taken in [7], is to look at the greatest strongly connected component, i.e. the bow-tie model. However, keeping with, for instance, the example of the spread of a disease, one is interested in the percentage of the population infected when the disease starts from a small fraction of the population (i.e. in the out-component of those initially infected). We formalise this concept for general graphs and apply our framework in several settings, including a comparison of different activity levels in the multi-type setting considered in [22], and discuss its limitations. The paper is organized as follows: In Section 2, we define the main concepts of our framework and give an overview of the limitations and assumptions of our model. In Section 3, we show how these can be combined to find the size of the large out-components. In Section 4, we consider several examples of multi-type graphs and analyse them using the tools developed in Section 3. In Section 5, we apply our results to modeling the spread of epidemics.
In Section 6, we discuss in detail the limitations of our framework. Section 7 is dedicated to a discussion of the results and their implications. In the appendices we outline in detail some technical aspects. In this section we lay the groundwork for the construction of the generating functions of multi-type directed random graphs. Let G = (V, E) be a random graph representing a population of N = |V| individuals. We partition V into r disjoint sets V = V_1 ∪ · · · ∪ V_r, and for any vertex u of type i (i.e. u ∈ V_i) we define its r-dimensional in/out degree vectors to be

I(u) = (I_1(u), . . . , I_r(u)), where I_j(u) = |{v ∈ V_j : (v, u) ∈ E}|,

and

O(u) = (O_1(u), . . . , O_r(u)), where O_j(u) = |{v ∈ V_j : (u, v) ∈ E}|.

We will say that G is an r-type graph if it is created by a process which, after assigning to each vertex a type (with proportions π_i = n_i / N) and in and out degree vectors, connects any two vertices of the same types with the same probability. Formally, we require that for any u ∈ V_i and v ∈ V_j,

Pr[(u, v) ∈ E] = c_{i,j} O_j(u) I_i(v)

for some constants c_{i,j}. Furthermore, for all i, we will denote by (I_{i,1}, . . . , I_{i,r}), (O_{i,1}, . . . , O_{i,r}) the random variables corresponding to the in and out degree vectors of a random vertex of type i. In other words, (I_{i,1}, . . . , I_{i,r}) = I(u) and (O_{i,1}, . . . , O_{i,r}) = O(u) for a uniformly random u ∈ V_i. Throughout this manuscript, we will assume that the second moments of I_{i,j} and O_{i,j} are finite for all i, j. This condition is assumed both in our definition of sufficient independence (see Section 2.3) and in our analysis (see Section 3.1). This assumption holds for the class of graphs that will be the main focus of our applications. However, it does limit our model in some aspects, and we consider this limitation in depth in Section 6.2. Let v ∈ V be a vertex in G = (V, E). We will define the dth neighbors of v, denoted by N_d(v), to be the set of vertices that can be reached from v in exactly d steps. Furthermore, we will define N_{i,j}(d) to be the random variable obtained by choosing a random vertex v of type i, and counting the number of dth neighbors of type j it has.
Thus,

N_{i,j}(d) = |N_d(v) ∩ V_j| for a uniformly random v ∈ V_i.   (2.5)

Using this definition, we extend the definition of the reproductive index R_eff from the study of simple single-type graphs (on which the spread of information or diseases can be characterised by the SIR equations) to the rate of expansion of general families of random graphs. We will base our definition on an experimental method commonly used to approximate the value of the reproductive index in epidemics (see e.g. [25-28]). This method works by measuring the number of infected individuals as a function of time, I(t), and fitting it to an exponential growth I(t) ≈ c R_eff^{d(t)}, where d(t) = t/τ is assumed to be the depth of the infection graph as a function of time, and τ is the time it takes a newly infected individual to infect others. In general, the expected number of infected nodes at depth d reads:

N_ave(d) = Σ_{i,j} π_i E[N_{i,j}(d)].   (2.6)

Following the methods of [25-28], we fit N_ave(d) to an exponential growth and define:

R_eff = lim_{d→∞} (N_ave(d))^{1/d}.   (2.7)

In order to help clarify these definitions we will consider a simple example. In Section 3.2.1, we will show how the R_eff parameter can be computed for general degree distributions (assuming sufficient independence). Here we will consider a simple case where both the in and out degrees of every vertex are distributed according to a delta distribution. Let G = (V, E) be an r = 2-type graph with an evenly divided population (π_1 = π_2 = 1/2), where all the vertices of type 1 have in-degree (1, 1) and out-degree (1, 2), and all the vertices of type 2 have in-degree (2, 0) and out-degree (1, 0). In other words, every vertex of type 1 has 2 incoming edges (1 from another vertex of type 1 and 1 from a vertex of type 2) and 3 outgoing edges (1 to another vertex of type 1 and 2 to vertices of type 2). Furthermore, every vertex of type 2 has 2 incoming edges (both from vertices of type 1) and 1 outgoing edge (to a vertex of type 1). In Figure 1 we present a visual representation of such a graph with N = 10 vertices. Let v be some vertex.
Denote the number of dth neighbors of type 1, 2 of v by N_1(d), N_2(d) respectively. Because of our assumptions, these can be used to compute the number of (d + 1)th neighbors of v (so long as there are no repeat neighbors - see Section 3.1). This forms a linear recurrence rule for (N_1(d), N_2(d)):

N_1(d + 1) = N_1(d) + N_2(d),  N_2(d + 1) = 2 N_1(d),   (2.8)

the solutions of which can be shown to be of the form N_{i,j}(d) = a_{i,j} λ_1^d + b_{i,j} λ_2^d, where λ_{1,2} = −1, 2 are the roots of f(x) = x² − x − 2 (and also the eigenvalues of the matrix in equation (2.8)), and a_{i,j}, b_{i,j} are constants determined by the initial conditions; averaging yields coefficients a = −1/5 and b = 1/2. Clearly the second term dominates and R_eff = lim_{d→∞} (N_ave(d))^{1/d} = λ_2 = 2. In Section 3.2.1 we will show that in general the value of R_eff is determined by the largest eigenvalue of a "weighted interaction matrix". In this example the interaction matrix would be

M = ( 1 1 )
    ( 2 0 ),

and its largest eigenvalue is indeed 2. Figure 1: An example of a 2-type graph with a delta degree-distribution. This graph has r = 2 types of vertices (the red and blue circles). Each vertex of type 1 (blue) has in-degree (1, 1) and out-degree (1, 2), and each vertex of type 2 (red) has in-degree (2, 0) and out-degree (1, 0). We would like the large out-component of our graph to model the spread of information, such as a disease or some other transmission, assuming that it starts from a small fraction of the graph. In other words, we require this component to include those vertices that are likely to be reached when spanning from a small fraction of the vertices chosen at random. We say that a vertex v is a member of the large out-component of a graph G if the set of vertices spanned by v on the reversed graph (which we call its in-component) is very large. As we will see in the next sections, in random graphs a constant fraction of the vertices will have bounded in-components, while the rest will have very large in-components.
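The delta-degree example above can be checked numerically. The following sketch (assuming numpy is available; it is not part of the paper's own computations) iterates the neighbor-count recurrence (2.8) and compares its growth rate with the largest eigenvalue of the matrix M from the text:

```python
import numpy as np

# Recurrence matrix of the delta-degree example: N1(d+1) = N1(d) + N2(d)
# and N2(d+1) = 2 * N1(d), since type-1 vertices have out-degree (1, 2)
# and type-2 vertices have out-degree (1, 0).
M = np.array([[1.0, 1.0],
              [2.0, 0.0]])

# R_eff is the largest eigenvalue of M (the roots of x^2 - x - 2 are 2, -1).
r_eff = max(np.linalg.eigvals(M).real)
print(round(float(r_eff), 6))  # 2.0

# The growth rate of the d-th neighbor counts converges to the same value.
n = np.array([1.0, 2.0])  # first neighbors of a type-1 vertex
for _ in range(30):
    n = M @ n
ratio = (M @ n).sum() / n.sum()
print(round(float(ratio), 6))  # 2.0
```

The subdominant eigenvalue −1 decays relative to 2^d, so after a few dozen iterations the empirical ratio matches λ_2 to machine precision.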
If we look at the set spanned by a small fraction of the vertices in the graph, any vertex with a bounded in-component is unlikely to be a member of this spanned set, while a vertex with an unbounded in-component is unlikely not to have at least one of these starting vertices in its in-component. Therefore, in order to understand the fraction of the graph "easily" spanned, we will need to consider the fraction of the vertices in the graph with large in-components. A major step in the analysis of [7] is to consider the distribution of the in-degrees of vertices when weighted by their out-degrees, i.e. giving vertices with higher out-degrees a larger weight when averaging. Generalizing this to the multi-type case is not straightforward, since it is not clear how one should take into account the r-dimensional multi-type weight. One possible approach is to take the total out-degree, but this can be unreliable. Consider, for instance, the case where one of the types has only incoming edges and no outgoing ones. Clearly, edges to this type are "dead ends" and should not be taken into account. However, weighting vertices by their total out-degree does take these edges into account, and can lead us to use an incorrect distribution on the vertices. In order to avoid this issue, we will restrict the analysis to cases where it does not matter by which dimension of the out-degree we weight a vertex. We will call such graphs "sufficiently independent" (see the formal definition below). Solving for this regime will still allow us to accomplish our main goal of combining the theory of multi-type graphs presented in [19-21] with the theory of graphs that have more complex degree distributions and correlations between in and out degrees [10-15, 22, 23]. We will do this using a model similar to the one considered by Britton et al. in [22]. As discussed above, the authors attribute to each individual a type and an activity level.
The type of an individual corresponds to their age cohort (controlling which other individuals they are likely to interact with), and the activity level corresponds to their overall propensity to infect and to become infected. We will show that if each individual has a single activity parameter affecting their interaction with all of the other types in the same manner (as is the case in [22]), then the graph is sufficiently independent. This allows us to model the behaviour of a multi-type population with general activity level distributions (whereas the model proposed by Britton et al. only covers discrete activity distributions). Additionally, as will be clear from the definition, in cases where each type interacts with only one other type, the graph is always sufficiently independent. Therefore, bipartite graphs, which are often considered when modeling the spread of sexually transmitted diseases [8], are also covered by this definition. That being said, there may be many physical systems which are not sufficiently independent, and it is possible to generalize our analysis to cover such cases. However, this would result in a rather complex model which is not likely to be applicable for most real-world use cases, where one does not have a sufficiently detailed distribution to begin with. In Section 6.1, we will further explore this limitation by showing an example of a graph for which sufficient independence does not hold, and remark on how the methods shown in this paper might be adapted to that regime. Formally, we will say that a distribution of in/out-degrees, i.e. a distribution of pairs of vectors in N^r, is sufficiently independent if the distribution of the out-degree of a vertex is the same when weighted by any coordinate of its in-degree. Similarly, the in-degrees can be weighted by any of the out-degrees. Consider a random vertex v of type i. Denote by I_{i,j} the distribution of its jth in-degree, i.e.
the number of edges from vertices of type j to v, and by O_{i,j} the distribution of the jth out-degree of v, i.e. the number of edges going from v to vertices of type j. We denote I_i = (I_{i,1}, . . . , I_{i,r}) and O_i = (O_{i,1}, . . . , O_{i,r}). Keeping in mind that I_i and O_i can be correlated, as they are the in and out degrees of the same vertex, our condition for sufficient independence is that for all i, k, l (such that ⟨O_{i,k}⟩, ⟨O_{i,l}⟩ ≠ 0 for equation (2.10), or ⟨I_{i,k}⟩, ⟨I_{i,l}⟩ ≠ 0 for equation (2.11)) and for all x:

E[O_{i,k} · 1{I_i = x}] / ⟨O_{i,k}⟩ = E[O_{i,l} · 1{I_i = x}] / ⟨O_{i,l}⟩,   (2.10)

E[I_{i,k} · 1{O_i = x}] / ⟨I_{i,k}⟩ = E[I_{i,l} · 1{O_i = x}] / ⟨I_{i,l}⟩,   (2.11)

where ⟨· · ·⟩ is the mean value of the distribution. Under this assumption, we can define the in/out degree distribution weighted by the out/in degree distribution. Definition 2. Let G be a sufficiently independent multi-type random graph with in and out degree distributions I_{i,j}, O_{i,j}, respectively. We define its biased in/out degree distributions as:

Pr[BI_i = x] = E[O_{i,k} · 1{I_i = x}] / ⟨O_{i,k}⟩,
Pr[BO_i = x] = E[I_{i,k} · 1{O_i = x}] / ⟨I_{i,k}⟩.

These are well defined even though we did not specify k, precisely because of our independence assumptions (2.10) and (2.11). In order to generalize the results of [7] to the multi-type setting, we will first generalize their analysis of single-variable generating functions to the multivariate regime. Working with multivariate generating functions instead of high-dimensional probability distributions will allow us to complete our analysis with a greater deal of elegance. Consider first a random graph with a large number N of vertices of a single type. Let A be a random variable that corresponds to the distribution of the degrees of these vertices. Define a generating function by [7]:

G_A(z) = Σ_{α≥0} p_α z^α,   (2.14)

where p_α = Pr[A = α] is the probability that a vertex chosen at random will have degree α. The normalization of the sum of the probabilities to one gives G_A(1) = 1. For this reason, (2.14) is also well defined in the range |z| ≤ 1. The generating function encodes the information of the probability distribution.
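As a small numerical illustration of these definitions, the sketch below builds the multivariate generating function of a toy bivariate degree distribution (the pmf values are invented for illustration) and recovers its normalization, first moments and one coefficient by finite differences:

```python
# A toy bivariate degree distribution p[(a1, a2)]; the values are invented.
p = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.3, (1, 2): 0.4}

def G(z1, z2):
    """Multivariate generating function G_A(z) = sum_a p_a * z1^a1 * z2^a2."""
    return sum(q * z1 ** a1 * z2 ** a2 for (a1, a2), q in p.items())

# Normalization: G(1, 1) = 1.
print(round(G(1, 1), 10))  # 1.0

# First moments E[A1], E[A2] from derivatives at z = 1 (central differences).
h = 1e-5
mean_a1 = (G(1 + h, 1) - G(1 - h, 1)) / (2 * h)
mean_a2 = (G(1, 1 + h) - G(1, 1 - h)) / (2 * h)
print(round(mean_a1, 6), round(mean_a2, 6))  # 0.6 1.1

# Coefficients from derivatives at z = 0: here p_(1,0) via a forward difference.
p10 = (G(h, 0) - G(0, 0)) / h
print(round(p10, 6))  # 0.2
```

The recovered moments match the direct calculation E[A1] = 0.2 + 0.4 = 0.6 and E[A2] = 0.3 + 2 · 0.4 = 1.1, mirroring the derivative identities discussed next.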
The probabilities p_α are obtained by appropriate derivatives of (2.14) with respect to z at z = 0:

p_α = (1/α!) (d^α G_A / dz^α)|_{z=0},

while the moments of the distribution are obtained by the derivatives of (2.14) at z = 1:

⟨A^n⟩ = [ (z d/dz)^n G_A(z) ]_{z=1}.

In the following we generalize (2.14) in two ways: first by having multi-type vertices, making z a vector, and second by allowing multiple random variables, making A a vector. Definition 3 (Multivariate Generating Functions). Let A : Ω → N^r be a random variable whose values are non-negative integer-valued vectors of dimension r (where r is the number of types in our graph). Its multivariate generating function is:

G_A(z) = Σ_α p_{α_1...α_r} z_1^{α_1} · · · z_r^{α_r},

where p_{α_1..α_r} = Pr[A = α] is the probability that A has the value α = (α_1, ..., α_r). Let A = (A_1, . . . , A_m) be a vector of such random variables, A_i : Ω → N^r. We define their r → m multivariate generating function as:

G_A(z) = (G_{A_1}(z), . . . , G_{A_m}(z)).

The next ingredient in the analysis is the product of the generating functions of independent variables. Consider the products of the generating functions in the single-type case [7]:

G_A(z) G_B(z) = Σ_n ( Σ_k p^A_k p^B_{n−k} ) z^n = G_{A+B}(z),   (2.20)

where the summation is over all the k-partitions of n. We generalize this in the multi-type case as follows. Lemma 2.1. Let A, B be independent vectors of random variables with r → m multivariate generating functions G_A, G_B. Then

G_{A+B}(z) = G_A(z) ⊙ G_B(z),

where the ⊙ denotes point-wise multiplication. Proof. As in the single-type case (2.20), this lemma can be shown by writing the coefficients of G_A ⊙ G_B directly and seeing that they correspond to a convolution of the coefficients of G_A and G_B. Consider next the composition of generating functions. In the single-type case one has [7]:

G_{Σ_{i=1}^{A} B_i}(z) = G_A(G_B(z)).   (2.21)

Equation (2.21) means that applying the generating function of A to the generating function of B is equivalent to producing A independent copies of the distribution B and summing over them. This can be generalised to the multivariate case in the following manner. Definition 4. Let A, B be vectors of random variables with n → m and m → k multivariate generating functions G_A, G_B respectively. We define their composition:

(G_A ◦ G_B)(z) = G_A(G_B(z)).

Lemma 2.2. Let A, B be independent vectors of random variables (i.e.
A_i and B_i are independent for all i), with r → m and m → k multivariate generating functions G_A, G_B respectively. For each i ∈ {1, . . . , r}, we define the random variable C_i to be the result of the following random process: draw a vector b from A_i, and for each j draw b_j independent copies of B_j; C_i is the sum of all of these copies. Then G_C = G_A ◦ G_B. Proof. For each value of b, the generating function of the conditional distribution is

G_{C_i | A_i = b}(z) = Π_j G_{B_j}(z)^{b_j},

and the final generating function of C_i is

G_{C_i}(z) = Σ_b Pr[A_i = b] Π_j G_{B_j}(z)^{b_j} = G_{A_i}(G_B(z)).

In this section we generalize the analysis of [7] from single-type random graphs to multi-type random graphs [21]. We use the multi-type Galton-Watson branching process and derive the equations for the sizes of the large out-components. We show that this is indeed well defined, since a certain fraction of the vertices in the graph will have a large in-component, meaning that any large enough set will span them with high probability. Following the same steps as the derivation in [7], we will show that if the size of the connected component originating from some vertex v is bounded, then the probability that it contains a cycle is O(N^{−1}). Moreover, we will show that so long as the set of vertices that can be reached from v through d steps (where d is not necessarily O(1)) is only a negligible fraction of the graph, then it is "probably almost a tree" (with probability 1 − o(1), the number of vertices is at least 1 − o(1) times the number of edges). Proving this shows an additive structure to the sizes of the sets spanned by vertices in the graph. Let v be a vertex in the graph, and define the dth neighbors of v, N_d(v), to be the set of vertices that can be reached from v in exactly d steps. If the sub-graph containing them is tree-like, then

|N_{d+1}(v)| = Σ_{u ∈ N_d(v)} |N_1(u)|.

This additive structure in the components will be the basis of most of our analysis. Let v be a vertex in the graph. Consider a BFS (breadth-first search) walk on the vertices spanned by v (if necessary, abort the walk when it reaches depth d + 1), and let u be a vertex in this traversal.
This walk begins at a starting vertex v, and enumerates the vertices spanned by it in order of their distance from v. In the step of the BFS process starting from u, the outgoing edges e = (u, t) from u are enumerated, and for each e, we view its target t to see if it has already been visited in this BFS process. We will prove our claims by bounding the probability that the target of any step is already in the spanned set. By our construction of the graph, the probability of any given vertex w being added to the spanned set is determined only by its type and its in-degree. In other words, any given w ∈ span{v} might be a more likely candidate for being the target of an outgoing edge from u, either due to its high in-degree (w is selected from a distribution that is somewhat biased towards a higher in-degree) or because of its type, but by our construction, it cannot have any other bias towards it. We will show that neither of these can skew the distribution too much. Since a π_j fraction of the population is of type j, conditioning on the fact that w is of type j can only increase the probability of w appearing again by a factor of π_j^{−1}. Furthermore, assuming u is of type i and w is of type j, w is only proportionally more likely to be selected due to its in-degree. In other words:

Pr[t = w] ≤ I_{i,j}(w) / (π_j N E[I_{i,j}]),   (3.2)

where I_{i,j}(w) denotes the in-degree of w from vertices of type i. In principle, it is possible for some vertices to have a very large in/out degree. For instance, a common model for the spread of sexually transmitted diseases assumes a power-law degree distribution with power α ≈ 3.2 (see Section 6.2). In such a case we would expect to see some individuals with out-degrees of the order of N^{1/(α−1)} ≫ 1. However, so long as the second moments of I_{i,j} are finite, this becomes the exception rather than the rule. Denote by S ⊆ span{v} the set of vertices already visited by the BFS. Averaging over w ∈ S of type j, their ith in-degree would be μ_{i,j} = E[I_{i,j}²] / E[I_{i,j}].
Since these are all finite, we can insert them into Equation (3.2) to get:

Pr[t ∈ S] ≤ |S| μ_{i,j} / (π_j N E[I_{i,j}]) = O(|S| / N).

This traversal has enumerated all of the edges spanning from v, and we have shown that on average almost all of them lead to previously unexplored vertices. Therefore, using the Markov inequality, we can conclude that with high probability almost all of them lead to new vertices. In other words, the size of the connected component of a vertex is almost always close to the sum of the sizes of the connected components of its neighbors. Consider an infection originating from a single individual. If, on average, an infected individual tends to infect fewer than one new individual, it is clearly unlikely that this infection will reach a significant portion of the graph (for instance, since the expected number of infected individuals is less than 1, allowing us to bound this probability by Markov's inequality). Even if on average an infected individual tends to infect more than one new individual, the infection can still peter out early on (for instance, if patient zero happens not to be very infectious, even though the average patient is infectious). However, once sufficiently many individuals are infected, it becomes extremely unlikely for the disease not to grow (due to the law of large numbers). This growth will not go completely unchecked, and will end once a large enough percentage of the population has been infected and herd immunity is reached. If the first moment of its biased degree distribution (see Definition 2) is finite, then this happens after some constant fraction of the population has been infected. Our goal in what we call a "Galton-Watson analysis" is to find the probability that a randomly chosen vertex will be among those who span a large portion of the graph. Analysing this probability will help us find the portion of the population expected to be in each of these large out-components.
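The qualitative picture described above, in which a small seed either dies out or spans a fixed fraction of the graph, can be illustrated with a quick single-type simulation. The sketch below uses arbitrary parameters (N, c, seed-set size, none taken from the paper): it builds a directed graph with Poisson(c) out-degrees, runs a BFS from a few random seeds, and compares the reached fraction with the branching-process prediction 1 − q, where q = e^{c(q−1)} is the extinction probability:

```python
import numpy as np
from collections import deque

# Assumed parameters: graph size N, mean out-degree c, 10 random seeds.
rng = np.random.default_rng(0)
N, c = 20000, 2.0

# Directed single-type graph: each vertex gets Poisson(c) out-edges
# to uniformly random targets.
adj = [rng.integers(0, N, size=rng.poisson(c)) for _ in range(N)]

# BFS over the union of the out-components of the seed set.
seen = np.zeros(N, dtype=bool)
queue = deque(int(s) for s in rng.integers(0, N, size=10))
for s in queue:
    seen[s] = True
while queue:
    u = queue.popleft()
    for v in adj[u]:
        if not seen[v]:
            seen[v] = True
            queue.append(int(v))
reached = seen.sum() / N

# Branching-process prediction: fraction reached ~ 1 - q, q = exp(c(q - 1)).
q = 0.0
for _ in range(200):
    q = np.exp(c * (q - 1))
print(round(float(reached), 2), round(float(1 - q), 2))
```

For c = 2 the fixed point gives 1 − q ≈ 0.797, and the simulated reached fraction lands close to it, up to finite-size fluctuations of order N^{−1/2}.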
Naturally, this analysis depends on the distribution of the out-degrees of the vertices, but it can also depend on the joint in and out degree distribution. As we will show, when the in and out degree distributions are sufficiently independent, it suffices to know the out-degree distribution and the weighted out-degree distribution in order to find this probability. Consider the random variables N_{i,j}(d) defined in Section 2.1, which denote the distribution of dth neighbors of type j of a random vertex of type i. Let BN_{i,j}(d) denote the same distribution over vertices weighted by their in-degree. Since the number of dth neighbors of a vertex depends only on the number and type of its out-going edges, these are well-defined for sufficiently independent graphs. For instance,

G_{N(1)} = G_O,  G_{BN(1)} = G_{BO}.

Next we consider the case where d = 2. Neglecting vertices that can be reached from our starting vertex in more than one way (see Section 3.1), the distribution of the number of second neighbors is the distribution of the sum of the numbers of first neighbors of each of the first neighbors of our starting vertex. Since the neighbors are independent of each other, we can apply the composition lemma (2.2). Therefore, the generating function of the distribution of the number of second-degree neighbors of a random vertex reads:

G_{N(2)}(z) = G_O(G_{BO}(z)).   (3.5)

In the general dth neighbors case, we have:

G_{N(d)}(z) = G_O(G_{BN(d−1)}(z))   (3.6)

and:

G_{BN(d)}(z) = G_{BO}(G_{BN(d−1)}(z)).   (3.7)

We will work with the definitions of N_ave(d) (2.6) and R_eff (2.7) presented in Section 2.1. We can now use equation (3.7) to find the value of R_eff for all sufficiently independent graphs. We will do this using a Taylor expansion of G_{N(d)} near 1. Let ε be a small perturbation for the Taylor series: z = 1 − ε. Consider next the Taylor expansion:

G_{BO}(1 − ε) ≈ 1 − M_{BO} ε,

where:

(M_{BO})_{i,j} = ⟨BO_{i,j}⟩

is the weighted interaction matrix, and v_λ is the eigenvector of M_{BO} with eigenvalue λ. We get:

N_ave(d) = Σ_λ c(λ) λ^d,

where c(λ) = ⟨v_λ, 1⟩ ⟨M_O v_λ, π⟩ = Θ(1) as a function of d, so that R_eff is the largest eigenvalue of M_{BO}. In Section 4.1 we will derive G_O and G_{BO} for Poisson distributions.
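The claim that N_ave(d) grows like the largest eigenvalue of the weighted interaction matrix can be checked directly on a small example. In the sketch below the matrix M is an assumed 2-type mean matrix (arbitrary illustrative numbers, not from the paper's examples), and it plays the role of both M_O and M_{BO}, as happens for Poisson degrees:

```python
import numpy as np

# Assumed 2-type mean matrix: M[i][j] = expected number of type-j
# out-edges of a type-i vertex under the biased degree distribution.
M = np.array([[2.0, 0.5],
              [1.0, 1.0]])
pi = np.array([0.5, 0.5])  # type proportions

# N_ave(d) = sum_i pi_i sum_j E[N_{i,j}(d)] = pi^T M^d 1 (when M_O = M_BO).
def n_ave(d):
    return pi @ np.linalg.matrix_power(M, d) @ np.ones(2)

# The growth ratio converges to the largest eigenvalue of M, i.e. to R_eff.
ratio = n_ave(31) / n_ave(30)
r_eff = max(np.linalg.eigvals(M).real)
print(abs(ratio - r_eff) < 1e-9)  # True
```

Here λ_max = (3 + √3)/2 ≈ 2.366, and the empirical growth ratio agrees with it once the subdominant eigenvalue has decayed.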
In this case M_{BO} = M_O, with entries (M_O)_{i,j} = n_j p_{i,j}, so R_eff is the largest eigenvalue of this matrix. Another commonly used notion of the reproductive index is as a threshold indicator for the spread of a disease: when R_eff ≤ 1, the large out-component should be negligible, and when R_eff > 1 the large out-component should be a set fraction of the vertices. This aspect of the reproductive index was considered for multi-type graphs in [21]. In Figure 2 we consider a set of multi-type graphs with values of R_eff near 1. It is clear that in the example computed for this figure, our definition of R_eff also acts as a threshold indicator. We will comment on this in the discussion section. Returning to our main problem, we are interested in the distribution of the sizes of spanned sets in the graph. Let S_i and BS_i be the distributions of the sizes of the sets spanned from uniformly random and weighted vertices of type i, respectively. BS_i is well defined, since the distribution of the size of the set spanned from any vertex depends only on its out-degree, and we know how to weight the out-degree distribution by the in-degree due to the sufficient independence of the graph. Consider the generating functions G_S and G_{BS}, analogous to H_0 and H_1 defined by Newman et al. [7]. Applying the composition lemma once more, we obtain:

G_{S_i}(z) = z_i G_{O_i}(G_{BS}(z)),  G_{BS_i}(z) = z_i G_{BO_i}(G_{BS}(z)).   (3.14)

Given G_O and G_{BO} we could in principle solve (3.14), first for G_{BS} using the second equation and then for G_S using the first equation. However, this is a complicated task, and we will instead follow and generalize the procedure taken in [7]. From (3.14) we have:

u_i = Pr[BS_i < ∞] = G_{BO_i}(u),   (3.15)

and the probability that the connected component originating from a random vertex of type i is not small reads:

Pr[S_i is large] = 1 − G_{O_i}(u).

In the previous subsection we derived an equation that relates the probability that the connected component originating from a random vertex is large to the parameters of the random graph. This is similar to the question of whether or not a Galton-Watson process will terminate.
However, we will be interested in a slightly different question, namely: what is the size of the largest connected component? That is, what is the probability that a random vertex can be reached by a large portion of the graph? The difference between these two questions is unimportant when dealing with undirected graphs, where a vertex can be reached by the giant connected component (GCC) if and only if the GCC can be reached from it. When the graph is directed, the two questions are interchanged by reversing the edges of the graph. Define I_i and BI_i to be the distributions of the in-degrees of a random vertex of type i and of the in-degrees weighted by the out-degree, respectively (these are the equivalents of O and BO in the reversed graph). Since our definition of sufficient independence is symmetric, the reversed graph is also sufficiently independent and these are well defined. Similarly, we define C_i and BC_i to be the distributions of the size of the component that spans a uniformly random and a weighted vertex, respectively. These are the equivalents of S_i and BS_i in the reversed graph, and as in the Galton-Watson analysis we may use their generating functions to find the probabilities that they are finite, leading to equations (3.18). In this section we will consider various degree distributions, show that they are sufficiently independent, and analytically construct the above generating functions. As in the single-type case, the most natural model for multi-type random graphs is the one where each in/out degree is chosen from an independent Poisson distribution, which is why this model has been the focus of much of the previous work on multi-type graphs [19][20][21][22]. In the case of a multivariate Poisson degree distribution, each vertex u of type i has an edge to each vertex v of type j with probability p_{i,j}.
There are n_j vertices of type j and p_{i,j} scales as the inverse of the total population size, thus the distribution of the number of type-j neighbors of any type-i vertex is close to Poisson(n_j p_{i,j}). In the reversed case, the number of type-j vertices of which u is a neighbor is distributed according to Poisson(n_j p_{j,i}). Altogether, this gives us the corresponding generating functions for the degrees and, similarly, for the reversed degrees. Furthermore, since the in- and out-degrees are independent, we have G_BI = G_I. From here, we can apply our main theorem to conclude that at the end of the epidemic, the fractions of the different types of the infected population are determined by the solution u of the corresponding fixed-point equation. In general, if the in/out degree distributions to each type are independent, then the generating function is a product of generating functions (4.5). Thus, for instance, if we consider a 3-type graph whose first type has in-degrees to the three types distributed according to a Poisson distribution with mean λ, a binomial distribution with parameters n, p, and an exponential distribution with parameter κ, respectively, then the multivariate generating function is the corresponding product. Another type of generalisation of the Poisson matrix distribution was introduced in [22]. In this paper the authors partition the population into 6 age cohorts and model the transfer of a disease between these cohorts by the multi-type Poisson distribution with a certain interaction matrix M. Furthermore, they assume each age cohort is itself partitioned into 3 sub-categories based upon their levels of interaction with others. This leads to a multi-type problem with r = 3 × 6 = 18 types, thus requiring the solution of an equation in 18 variables. An easier approach is to consider each age cohort as a single type with a non-Poisson degree distribution.
Specifically, we assume as in [22] that each age cohort is made up of three activity levels: 50% of each age cohort have a normal activity level, 25% have a higher activity level (doubling both their probability of being infected and the expected number of secondary infections they cause), and 25% have a lower activity level (halving both their susceptibility and their infectiousness relative to the normal activity level). Considering each age cohort as a single type, it is easy to see that the degree distribution is the corresponding three-component mixture. Each activity level also scales the out-degrees of the vertices, causing a bias proportional to the activity level, which yields (4.8). In the case considered by [22], which was chosen as a simple example that can be easily computed, it is still possible to use the naive approach; in larger settings, however, this is likely to prove more difficult. On the other hand, even in the more general case where the activity level of an individual of each type is distributed according to some distribution ρ(x), one can still use our method to solve for the end of the disease without requiring an optimization over a large number of inputs, using the identities (4.9) and (4.10). Such graphs are sufficiently independent, since the only dependence between the in-degrees and the out-degrees is through the activity levels, to which every single in/out degree is linearly proportional. In this section we will use our results to improve and generalise the model proposed by Britton et al. [22] of an epidemic whose spread can be temporarily slowed down. We will first provide a brief description of this model and the numerical methods used in [22]. We will then simulate a simpler example, showcasing some of the tools we developed over the last few sections. Next, we will use these tools to reproduce the results of Britton et al. more efficiently and for more general degree distributions.
Finally, we will compare the results of this model with other models when extended to the multi-type setting. The multi-type model of [22] received much attention, and in this subsection we will explore its subtleties. A standard analysis of the spread of an epidemic in a single-type graph might follow one of two approaches: either the model attempts to find the point at which an uncontrolled epidemic will end its spread by finding the expected size of the GCC or of the large out-components of the epidemic's graph [15], or it attempts to find the herd immunity point at which R_eff = 1 [11, 12]. As was shown in [10], finding the point at which R_eff = 1 is akin to finding the outcome of an epidemic with limited intervention. The model proposed in [22] generalizes this approach to multi-type graphs. It is assumed that some temporary counter-measures are taken that reduce the infectiousness by a multiplicative factor of 0 < α ≤ 1. However, when the counter-measures are lifted, the infectiousness returns to its previous state. The goal of this type of counter-measure is to reach herd immunity with only a small infected population. If the value of α were set to 1, the epidemic would proceed as usual, and while we would reach herd immunity in the first outbreak, we would still see a great many cases even after R_eff < 1. If the value of α were set extremely low, then during the first lockdown we would see no outbreak, but that would not prepare us for the lifting of the restrictions. By choosing the value of α somewhere in the middle, one can reduce the total infected population, since the first wave will be greatly diminished by the counter-measures and the second wave will be diminished by the resistant population from the first wave.
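For a single-type graph this trade-off can be sketched numerically. Assuming the standard final-size relation z = 1 − e^(−cz) for an epidemic with effective reproduction number c, choosing α = ln(R_0)/(R_0 − 1) makes the first wave (run at rate αR_0) infect exactly a fraction 1 − 1/R_0 of the population, leaving R_eff = 1 when the restrictions are lifted. The closed form for α is our own illustration, not taken from the paper:

```python
import math

def final_size(c, iters=300):
    """Positive root of the single-type final-size equation z = 1 - exp(-c z)."""
    z = 0.5
    for _ in range(iters):
        z = 1.0 - math.exp(-c * z)
    return z

R0 = 2.0
alpha = math.log(R0) / (R0 - 1.0)   # candidate optimal mitigation factor
z = final_size(alpha * R0)          # first-wave size under counter-measures
post_lift_R_eff = R0 * (1.0 - z)    # R_eff once restrictions are lifted
print(alpha, z, post_lift_R_eff)    # z = 1 - 1/R0 = 0.5, R_eff = 1.0
```

With this choice the first wave consumes exactly the susceptibles needed for herd immunity and no more, which is the optimality criterion discussed next.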
For single-type graphs, it can be easily shown that the optimal value of α is such that after lifting the restrictions the value of R_eff is exactly 1, and that the resulting total infected population is equal to the forecast of the "R_0 model". The authors of [22] assume this also holds for the multi-type scenario and search for this value of α in several different scenarios. In this section we will model the spread of an epidemic in a multi-type society. Details of the computations are provided in Appendix A. We will first apply the tools to a toy model with r = 2 types, and then we will work with the r = 6 age cohorts considered in [22] and consider different activity level distributions. In order to find a solution to equations (3.18), we need to solve a system of nonlinear equations. While in general this may be a difficult problem, in our case we found that continuous optimization techniques tend to converge to the solution. In particular, we use a variant of the gradient descent algorithm detailed in Appendix A. In Figure 2 we test our gradient descent on a set of multi-type graphs. We set the basic interaction matrix and the distribution of activity levels from [22], and multiplied the interaction matrix by a different factor each time to obtain different values of the reproductive index per equation (3.12). This figure shows both the accuracy of our gradient descent and the threshold phenomenon, where R_eff ≤ 1 corresponds to u = 1 being the only root (meaning that there are no large out-components) and R_eff > 1 corresponds to a second root moving further from 1 the higher R_eff becomes (which translates to larger out-components). In Figure 3 we compare the total infected for these age cohorts with several different activity level distributions.
Using equations (3.18), (4.9) and (4.10), we compute the expected fractional size of the large out-components of an infectious disease using the population sizes and the interaction matrix from [22], with varying activity level distributions (the dashed lines). For each pair of R_0 (the initial value of R_eff) and activity distribution, we performed a population-based simulation as detailed in Appendix A.2. In these simulations we produced a large population of individuals divided amongst a set number of types, with in- and out-degrees drawn from the appropriate degree distributions (each pair of types has a basic interaction rate, which is then multiplied by the activity level of the infecting or infected individual and normalised to obtain the correct R_0). In each simulation we designate some small fraction of the population as infected and the rest as susceptible. We then propagate the infection and measure the fraction of the population reached by the disease. As can be seen in Figure 3, the population-based simulations agree almost exactly with the predictions of the multivariate generating function model. We performed these computations for the cases where the activity level distribution is: uniform over the segment (0, 1), a half-Gaussian, a Gamma distribution with shape parameter k = 0.1, and a generalised Pareto distribution with parameter ζ = 0.25. In each case we fixed the value of R_0 to several values in the range (1, 5) (normalising the interaction matrix to produce any specific R_0). As we can see, the generalised Pareto and the Gamma distribution, which are characterised by their higher variance, produce a significantly smaller largest out-component than the uniform and half-Gaussian activity level distributions. This fits with the previous results of [10-12, 15, 22].
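The direction of this effect can be traced to size biasing: weighting individuals by their degree inflates the mean activity by the scale-invariant factor E[a²]/E[a]², which is much larger for the heavy-tailed distributions, so the early phase of the epidemic concentrates on (and removes) the most active individuals. A quick check of this factor from the closed-form moments (our own illustration):

```python
# Scale-invariant size-biasing factor E[a^2] / E[a]^2 for two of the
# activity distributions in Figure 3; normalising a to a given R_0
# rescales a and leaves this ratio unchanged.

# Uniform on (0, L): E[a] = L/2 and E[a^2] = L^2/3.
uniform_factor = (1.0 / 3.0) / (1.0 / 2.0) ** 2      # = 4/3

# Gamma with shape k: E[a] = k*theta and E[a^2] = k*(k+1)*theta^2.
k = 0.1
gamma_factor = (k * (k + 1.0)) / k ** 2              # = (k+1)/k = 11

print(uniform_factor, gamma_factor)
```

The Gamma distribution with k = 0.1 biases its mean by a factor of 11, against 4/3 for the uniform distribution, consistent with the smaller out-components observed for the high-variance distributions.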
Figure 3: The expected fractional size of the large out-components of an infectious disease using the population sizes and the interaction matrix from [22], with varying activity level distributions (the dashed lines), compared to population-based simulations for each of these parameter sets (the dots). The generalised Pareto and the Gamma distribution, which are characterised by their higher variance, produce a significantly smaller largest out-component than the uniform and half-Gaussian activity level distributions. The results of the population-based simulations fit the predictions of the model. In this section we will explore some of the limitations of our model. Its most prominent drawback is that it cannot predict the effects of correlations between the edges of neighboring vertices. For instance, a contact tracing analysis performed on the outbreak of COVID-19 in Italy showed that the majority of infections were between members of the same household [29]. Additionally, many physical models (such as the Ising model) only permit interactions between objects that are close to each other (in some geometrical sense). It is easy to see that information spread through such a graph would not reach exponentially many vertices within a short amount of time. Our model does not cover this type of graph. A concrete example is considered for single-type graphs in Section 6 of [10], where it is shown that a population comprised of close-knit families with weak interactions between families would result in a significantly worse final outcome of an epidemic than the one predicted by models that do not take these local interactions into account. This example holds for the multi-type regime as well. The second limitation of our model is that it works only for graphs with sufficient independence. To help clarify this notion and show its necessity, we will present a simple example without it.
Consider for instance a graph with r = 2 types, each representing π_1 = π_2 = 1/2 of the population. Let every vertex of type 1 have exactly 2 outgoing edges to vertices of type 2 and no incoming edges. Finally, let half the vertices of type 2 have an in-degree of (4, 0) (that is, four incoming edges from type 1 and none from type 2) and an out-degree of (0, 0), and let the rest have an in-degree of (0, 2) and an out-degree of (0, 2). In Figure 4 we show a visual representation of such a graph with a population of N = 16. In this simple example, it is easy to see that one could simply partition the second type into 2 very different types and continue with a simple analysis. However, such partitions are not always possible. For instance, if instead of having a discrete distribution with only two options for the behaviour of type 2 vertices we had set a negative correlation between the first in-degree and the second out-degree, that would already have violated the sufficient independence assumption. Continuing with the simple example, it is now unclear what the biased distribution of I_{2,2} is. If we were to give each vertex a weight proportional to its total in-degree, then the vertices with 4 incoming edges from type 1 would dominate this distribution, and we would estimate that an infected individual of type 2 is expected, on average, to infect 2/3 others. Therefore, given only this piece of information, we would expect the probability of widespread infection among type 2 vertices to be very small. However, this is clearly not true. In this example, type 1 vertices have an in-degree of 0, so they cannot be infected and play no role in the spread of an epidemic. Some type 2 vertices take all of their in-degrees from type 1 vertices and have no out-degrees, so they also take no substantial part in the spread of the infection. But the rest of the type 2 vertices act as a small independent system with R_0 = 2, and are very likely to experience an outbreak.
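The two numbers in this example are easy to verify directly; the snippet below reproduces the misleading naive estimate of 2/3 alongside the true reproduction number of the active subsystem:

```python
# Type-2 vertices in the counterexample: half have total in-degree 4
# (all from type 1) and out-degree 0; half have total in-degree 2
# (all from type 2) and out-degree 2 (to type 2).
groups = [
    # (population fraction, total in-degree, out-degree to type 2)
    (0.5, 4, 0),
    (0.5, 2, 2),
]

# Naive bias: weight every vertex by its total in-degree.
total_weight = sum(f * d_in for f, d_in, _ in groups)
naive_expected_secondary = (
    sum(f * d_in * d_out for f, d_in, d_out in groups) / total_weight
)
print(naive_expected_secondary)  # 2/3, apparently subcritical

# But the second group forms a closed subsystem in which every vertex
# has out-degree 2, i.e. an effective R_0 of 2: supercritical.
subsystem_R0 = groups[1][2]
print(subsystem_R0)  # 2
```

The naive weighting lets the inert (4, 0)-vertices swamp the average, which is exactly the failure mode that the sufficient independence assumption rules out.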
In order to create a more general model, one could define B^i O_{j,k} and B^i I_{j,k} to be the distributions of in- and out-degrees from type j to type k, weighted by their ith out/in degree. This would give the slightly more complex additive structure of (6.1). While this additive structure could also be used to recreate many of our results in the more general setting, it would require us to produce r^3 distributions as input (whereas our construction requires an input of only r^2 distributions, which we set to have a very strong structure anyway). Figure 4: An example of a graph without sufficient independence. This graph contains r = 2 types of vertices (the red and blue circles). Each vertex of type 1 (blue) has in-degree (0, 0) and out-degree (0, 2). Half of the vertices of type 2 (red) have in-degree (4, 0) and out-degree (0, 0), and the other half have in-degree (0, 2) and out-degree (0, 2). Another limitation of our model is that we require the second moments of the degree distribution to be bounded. This limitation (which also holds for most single-type models) has been explored in a number of previous cases. We will focus on the example of sexually transmitted diseases [8, 30]. These authors consider the spread of a disease between two populations (types in our nomenclature), where edges run only between the two types and the degrees are assumed to be distributed according to power-law distributions (with power α commonly assumed to be around 3.2 [31]). Furthermore, since such a disease is often modeled by an undirected graph, the biased in/out distributions in our model are the size-biased versions of the degree distribution; therefore, both BO_{i,j} and BI_{i,j} are distributed according to a power law with power α − 1. Furthermore, since a vertex of any type is only connected to vertices of the one other type (in this model), these distributions are sufficiently independent. The authors of [8] show that when the power law is sub-cubic (i.e.
the tail of the degree distribution decays more slowly than Pr[x] ∝ x^(−3)), no amount of prevention will stop the spread of such a disease. In our formulation this is expected, once one notices that the second moments of I_{i,j} and O_{i,j} are unbounded. For instance, if the power α of O_{i,j} were less than 3, then the power α − 1 of BO_{i,j} would be less than 2, and the average value of BO_{i,j} would be unbounded (which would result in an infinite R_0 if our model were applied to this scenario). Stricter power laws (i.e. α > 3) cause BO_{i,j} to have a bounded expectation, resulting in bounded eigenvalues and a bounded value of R_0, which fits the results of previous models showing that in these cases the disease can be controlled with counter-measures [8]. Previous results have modeled the spread of information in graphs and of diseases in populations for either single-type graphs with general degree distributions [7, [9][10][11][12][13][14][15], or for multi-type graphs limited to Poisson [19][20][21] or Poisson-like [22] degree distributions. In this work we derived a method for computing the size of the large out-components of multi-type graphs with a more complex degree distribution, and presented an example of its application to epidemiological research. In [10], the authors compare two types of models for the spread of a disease. The first is the R_0 type model, which estimates the percentage of the population infected before herd immunity (defined as the point where R_eff = 1). This model is a good estimate of the percentage of the population infected when the spread of the disease is mitigated by reactive counter-measures that limit the peak infected population. However, if no counter-measures are taken whatsoever, then at the precise moment when R_eff reaches 1 there will be many infected individuals.
While new infection chains are not expected to grow, those already infected will continue to infect others, resulting in a slowly diminishing after-burn effect which can almost double the total infected population. The second type of model is the GCC type model, which takes the after-burn effect into account and provides a good estimate of the total infected population when no counter-measures are taken. While both models can be resolved for single-type graphs using the methods shown in [10], and for Poisson multi-type graphs as shown in [22], our solution for the general multi-type scenario resolves only the GCC type model. We expect that methods similar to those shown in [22] can be used to generalise our results to solve this class of problems as well, and we leave this for future work. Another interesting problem is the role of R_0. In our analysis we showed a method by which the concept of a reproductive index (in the sense of the rate at which a disease expands as a function of its generation) can be generalised to classes of random graphs. Furthermore, we gave a formula for R_0 and showed that for Poisson multi-type graphs it is equivalent to the formula given by [21]. However, in single-type graphs the same R_0 also plays another role in the study of diseases: it is a predictor of whether an outbreak will occur or not (there should be an outbreak if and only if R_0 > 1). It can easily be shown that with our definition of R_0 for multi-type graphs, R_0 > 1 is a necessary condition for an outbreak. For instance, we can use the convexity of G_BI to obtain an inequality in terms of the spectral decomposition of 1 − u with respect to the eigenvalues and eigenvectors of M_BO. For this inequality to hold, we must have Re(λ) > 1 for at least one of the eigenvalues of M_BO. However, it still remains to be proven that it is impossible for R_0 to be larger than 1 without having an outbreak.
So far, to the best of our knowledge, this has been proven only for the Poisson distribution in [21]. In all of our models we used the interaction matrix and the relative cohort sizes from [22]. In particular, the age cohorts, which correspond to the age groups 0-5, 6-12, 13-19, 20-39, 40-59 and 60+, were assumed to represent 7.25%, 8.66%, 11.24%, 33.23%, 22.67% and 16.95% of the population, respectively. The basic interaction matrix used was the one given in [22]. We considered 4 basic distributions of activity levels:
• Gamma distribution, with PDF f(x) = x^(k−1) e^(−x/θ) / (Γ(k) θ^k), where k was set to 0.1 (as in [10]) and θ was scaled for the desired R_0.
• Generalised Pareto distribution, with PDF f(x) = (1/σ) (1 + ζ(x − µ)/σ)^(−(1/ζ + 1)), where µ and ζ were set to 0 and 0.25 respectively and σ was scaled for the desired R_0.
• Half-normal distribution, with PDF f(x) = √(2/π) (1/σ) exp(−x²/(2σ²)) for all x > 0, where σ was scaled for the desired R_0.
• Uniform distribution, with PDF f(x) = 1/L for all x in the segment [0, L] and 0 otherwise, where L was scaled for the desired R_0.
Béla Bollobás. Random graphs. Number 73.
Random graphs.
Networks: an introduction.
Piet Van Mieghem, and Alessandro Vespignani.
Epidemic processes in complex networks.
The web as a graph: measurements, models, and methods.
Random graphs with arbitrary degree distributions and their applications.
Spread of epidemic disease on networks.
Superspreaders and high variance infectious diseases.
Heterogeneity and superspreading effect on herd immunity.
Persistent heterogeneity not short-term overdispersion determines herd immunity to COVID-19.
Herd immunity thresholds for SARS-CoV-2 estimated from unfolding epidemics.
Epidemic dynamics in inhomogeneous populations and the role of superspreaders.
COVID-19 superspreading suggests mitigation by social network modulation.
Beyond R_0: heterogeneity in secondary infections and probabilistic epidemic forecasting.
Percolation on sparse networks.
Superspreading events in the transmission dynamics of SARS-CoV-2: opportunities for interventions and control.
Using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents.
The final outcome of an epidemic model with several different types of infective in a large population.
Stochastic epidemic models and their statistical analysis.
Pandemic spread in communities via random graphs.
A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2.
Superspreading and the effect of individual variation on disease emergence.
Full genome viral sequences inform patterns of SARS-CoV-2 spread into and within Israel.
Estimating the unreported number of novel coronavirus (2019-nCoV) cases in China in the first half of January 2020: a data-driven modelling analysis of the early outbreak.
Real-time tentative assessment of the epidemiological characteristics of novel coronavirus infections in Wuhan, China, as at 22.
Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia.
Pattern of early human-to-human transmission of Wuhan 2019 novel coronavirus (2019-nCoV).
Nicolò Navarin, et al.
Suppression of COVID-19 outbreak in the municipality of Vo, Italy.
Likelihood-based inference for stochastic models of sexual network formation.
The web of human sexual contacts.
The work is supported in part by the Israeli Science Foundation Center of Excellence, the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant agreement No. 835152), BSF 2016414, the IBM Einstein Fellowship and the John and Maureen Hendricks Charitable Foundation at the Institute for Advanced Study in Princeton. In this appendix we detail the numerical algorithms used to produce the results for Figures 2 and 3. These include a continuous optimization technique used for finding the solutions to equation (3.18), and a population-based simulation whose results were then compared with the output of the model. Gradient descent is a technique for minimizing functions. Its basic premise is that by moving against the direction of the gradient of a function, one can most rapidly reduce its value. We utilized this premise but changed the heuristic by which the distance travelled along the gradient is chosen. Our goal in this choice was to advance as far as possible against the gradient, without running the risk of overshooting the root. Our strategy was to frame our root-finding problem as the minimization of a suitably chosen objective function f, which can be approximated by a quadratic form near its roots. Let r be the correct solution and define ε = u − r. Assuming that the Jacobian matrix J of G_BI at the root is not equal to the identity, our first-order approximation of f for sufficiently small ε is quadratic in ε. We followed these steps to determine how far to advance along the gradient at each iteration of the descent. Let u_n be the nth step of the gradient descent. We define the direction of the step d_n, and then consider the simpler function g_n(x) = f(u_n + x·d_n).
Assuming that f is indeed close to a quadratic form at u_n, g_n will also be close to a quadratic form g_n(x) ≈ ax² + bx + c (for some a, b, c). We use the first and second order derivatives of g_n at 0 to estimate its minimum x_n = −g′_n(0)/g″_n(0). However, to be on the safe side, we cut the length of the step in half. Putting it all together, this gives us the update formula for each step of the gradient descent. We performed population-based simulations similar to the ones in [10]. In these simulations we modeled a population of N = 10^6 individuals, each with an activity level drawn from the distribution specified for each experiment. These were then normalised to give the desired R_0 using equation (3.12). Each individual also carried a state, which could be either SUSCEPTIBLE, INFECTED or RECOVERED, and a cohort corresponding to an age group (similar to the age cohorts in [22]). The percentage of the population in each age cohort and the interactions between age cohorts were set according to the data in [22]. A small percentage of the population (100 individuals chosen at random, weighted according to their activity levels) was then set to INFECTED. From there we ran an iterative process in which, at each step, an INFECTED individual A would set any SUSCEPTIBLE individual B to INFECTED with probability p = A.infectiousness × B.susceptibility × InteractionMatrix[A.cohort, B.cohort]. At the end of the iteration we would set A to RECOVERED. We repeated this process until the entire population was either SUSCEPTIBLE or RECOVERED, and then counted the SUSCEPTIBLE and RECOVERED individuals.
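A compact sketch of this simulation loop (simplified: activity levels omitted, an illustrative interaction matrix, and a small population; the state names follow the description above):

```python
import random

def simulate(N, interaction, cohort_fracs, n_seeds=10, seed=0):
    """Generation-based SIR simulation: interaction[a][b] / N is the
    per-pair probability that an infected individual of cohort a
    infects a susceptible individual of cohort b."""
    rng = random.Random(seed)
    cohorts = rng.choices(range(len(cohort_fracs)), weights=cohort_fracs, k=N)
    state = ["SUSCEPTIBLE"] * N
    infected = rng.sample(range(N), n_seeds)
    for v in infected:
        state[v] = "INFECTED"
    while infected:
        new_infected = []
        for a in infected:
            for b in range(N):
                if state[b] == "SUSCEPTIBLE" and \
                        rng.random() < interaction[cohorts[a]][cohorts[b]] / N:
                    state[b] = "INFECTED"
                    new_infected.append(b)
            state[a] = "RECOVERED"
        infected = new_infected
    return state.count("RECOVERED"), state.count("SUSCEPTIBLE")

recovered, susceptible = simulate(1000, [[3.0, 1.0], [1.0, 2.0]], [0.5, 0.5])
print(recovered / 1000)  # fraction of the population ever infected
```

The dense O(N²) contact loop is only practical for small N; the simulations in this paper use per-individual degree draws instead, but the bookkeeping of states and cohorts is the same.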