key: cord-0710482-h81qb5yv
authors: Rizi, Abbas K.; Faqeeh, Ali; Badie-Modiri, Arash; Kivela, Mikko
title: Epidemic Spreading and Digital Contact Tracing: Effects of Heterogeneous Mixing and Quarantine Failures
date: 2021-03-23
journal: Physical review. E
DOI: 10.1103/physreve.105.044313
sha: 6c917eeedef5fa82cf14c9409a8419b841f99102
doc_id: 710482
cord_uid: h81qb5yv

Contact tracing via digital tracking applications installed on mobile phones is an important tool for controlling epidemic spreading. Its effectivity can be quantified by modifying the standard methodology for analyzing percolation and connectivity of contact networks. We apply this framework to networks with varying degree distributions, numbers of application users, and probabilities of quarantine failures. Further, we study structured populations with homophily and heterophily and the possibility of degree-targeted application distribution. Our results are based on a combination of explicit simulations and mean-field analysis. They indicate that there can be major differences in the epidemic size and epidemic probabilities which are equivalent in the normal SIR processes. Further, degree heterogeneity is seen to be especially important for the epidemic threshold but not as much for the epidemic size. The probability that tracing leads to quarantines is not as important as the application adoption rate. Finally, both strong homophily and especially heterophily with regard to application adoption can be detrimental. Overall, epidemic dynamics are very sensitive to all of the parameter values we tested out, which makes the problem of estimating the effect of digital contact tracing an inherently multidimensional problem.

Until effective vaccines are widely deployed in a pandemic era, carefully timed non-pharmaceutical interventions [1] such as wearing face masks [2] , school closures, travel restrictions and contact tracing [3] [4] [5] [6] [7] are the best tools we have for curbing the pandemic. Contact tracing is an attempt to discover and isolate asymptomatic or pre-symptomatic (exposed) individuals. In the absence of herd immunity, contact tracing is a potent low-cost intervention method since it puts people into quarantine where and when the disease spreads. Therefore, it can have a significant role in containing a pandemic by relaxing social-distancing interventions [8] , providing an acceptable trade-off between public health and economic objectives [9, 10] , developing sustainable exit strategies [11, 12] , identifying future outbreaks [13] , and reaching the 'source' of infection [14] .

Thanks to the emergence of low-cost wearable health devices [15] [16] [17] [18] [19] [20] [21] [22] and mobile software applications, digital contact tracing can now be deployed with higher precision without the problems of manual contact tracing, such as the tracing being slow and labor-intensive or people's hesitation to give identifying data about their contacts due to blame, fear, confusion, or politics. On the other hand, smartphones and wearable devices also offer continuous access to real-time physiological data, which can be used to tune other non-pharmaceutical or pharmaceutical strategies. Modern apps enable us to monitor COVID-19 symptoms [23] [24] [25] , identify its hotspots [26] , track mosquito-borne diseases such as Malaria, Zika and Dengue [27, 28] , and detect microscopic pathogens.

In both its manual [4, 5] [29] [30] [31] [32] [33] [34] [35] [36] [37] and digital [38] [39] [40] [41] [42] [43] [44] forms, contact tracing has commonly been considered an effective strategy, and different empirical data sets have validated this claim in short-time population-based controlled experiments [38, 40]. It has been estimated that for every percentage point increase in app-users, the number of cases can be reduced by 2.3% (in statistical analysis) [45]. However, such a linear view of the benefits of app usage is likely too simplistic and ignores the complexities of disease spreading, especially in heterogeneous populations [46] [47] [48] [49]. For instance, degree-heterogeneity in the contact network [50] can alter epidemiological properties in the form of variance in final outbreak size [51], vanishing epidemic threshold [49, 52], hierarchical spreading [53], strong finite-size effects [54] and universality classes for critical exponents [55]. Moreover, the existence of super-spreaders dictates the extent to which a virus spreads in a bursty fashion [56] [57] [58], especially when there is high individual-level variation in the number of secondary transmissions [53, 59, 60]. Therefore, to evaluate the effectiveness of contact tracing, degree-heterogeneity and app adoption of super-spreaders [61, 62] should be taken into account. Note that in some parameter settings, contact tracing may not be effective enough [8, 63, 64].

A potentially important factor in the effectiveness of contact tracing apps is how the app-using and non-app-using populations are mixed. Several studies have shown that people with similar features are more likely to be in contact with each other than with people with different features. This phenomenon is known as homophily [65] [66] [67]. It has been reported in app adoption directly [40], and indirectly through correlation between app adoption and other features exhibiting homophily, such as jobs, age, income and nationality [68] [69] [70]. Therefore, the fraction of the population that adopts the app is not the only important factor for reducing the peak and total size of the epidemic: the amount of homophily in app adoption can potentially have a significant role as well.

Since the World Health Organization has declared the COVID-19 outbreak as a Public Health Emergency of International Concern, network scientists have developed different approaches towards analyzing epidemic tracing and mitigation with apps. Using the toolbox of network science, different groups have investigated the effectiveness of contact tracing based on the topology and directionality of contact networks [14, 44, [71] [72] [73] [74] [75] [76] [77] . Recently, a mathematical framework aimed at understanding how homophily in health behavior shapes the dynamics of epidemics has been introduced by Burgio et al. [78] . This study expanded the model of Bianconi et al. [71] and computed the reproduction number and attack rate in a homophilic population using mean-field equations.

Our study investigates the effect of varying app coverage on the epidemic's threshold, probability and expected size in homogeneous and heterogeneous contact networks with and without homophily or heterophily in app adoption. Further, we explore the effect of distributing the apps randomly and preferentially to high-degree nodes [71] in these scenarios. Our main focus is on the epidemic threshold and the final size of the epidemics. Therefore, we assume the dynamics of the epidemic to be governed by the simple SIR model [50]. This model can be easily mapped to a static bond-percolation problem [79, 80] so that the epidemiological properties can be measured based on the topological structure of the underlying network [50, 73] [81] [82] [83] [84]. Note that more complex disease transmission models, such as SEIR models in which there is an infected-but-not-contagious period E, are also covered by this formalism [79, 85]. The difference between the spreading framework with the app and the normal one is that the infection cannot spread further if it passes a link between two app-users (app-adopters). That is, the infection process model needs to include the memory of the type of node it is coming from. We therefore extend the percolation framework by adding memory [86, 87] to it in order to keep track of the infection path. This leads to the observation that the epidemic size is not the same as the epidemic probability, as it would be in this model without the app-users [88].

Our results are largely based on mean-field-type calculations of the percolation problem, which are confirmed by explicit simulations of SIR epidemic process and measurements of component sizes in finite networks. Our findings show that: 1) the number of app-users has a direct effect on the epidemic size and epidemic probability and the difference between these two observables is larger in high-degree targeting strategy; 2) epidemics can be controlled to a much better degree in the high-degree targeting strategy; 3) even though degree-heterogeneity can strongly affect or even eliminate the epidemic threshold, high-degree targeting strategy can compensate this effect and increase the threshold significantly; 4) increasing heterophily from random mixing always increases the outbreak size and lowers the epidemic threshold; 5) increasing homophily does the opposite until an optimum, that is below the maximum homophily case, is reached; and finally 6) the probability of contact tracing succeeding in preventing further infections is not as crucial as the fraction of app-users, but can still have significant effects on the epidemic size and epidemic threshold. The only exception is when the apps are distributed to heterogeneous networks with the high-degree targeting strategy.

We employ an SIR disease model on networks with additional dynamics given by the disease interactions in the presence of the disease tracking application. In the model, without the tracking application, an infected (I) node will eventually infect a neighboring susceptible (S) node with a transmission probability $p$ independently of other infections. The simulations are performed with a model where the infected nodes try to infect their susceptible neighbors with independent Poisson processes with rate $\beta$ and go to the removed state (R) after a fixed time $\tau$. The fixed recovery time ensures that every infected individual, regardless of app adoption, can infect a susceptible neighbor independently with a transmission probability $p = 1 - e^{-\beta\tau}$ [79, 89]. These assumptions allow us to study the SIR processes using component size distributions of undirected networks where part of the links are randomly removed [79, 85] [88] [89] [90]: an epidemic starting from a single node can reach any other node exactly when there is a path of such transmitting links connecting them, i.e., they are in the same component in a network where each potential contact link is kept with probability $p$ and removed with probability $1 - p$. Thus, the epidemic threshold, epidemic probability and epidemic size can be read from percolation simulations [79, 85] [88] [89] [90] (see Section I B). Note that without a fixed recovery time, the presence of spreading paths through neighboring links would not be independent, and this would not be a bond percolation problem in an undirected network where edges are removed independently. However, the epidemic threshold, final epidemic size, and the expected outbreak size below the epidemic threshold would still be correctly predicted by this methodology [88, 89].
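As a small illustration of the relation between the infection rate, the fixed recovery time, and the per-link transmission probability, the following sketch converts between them. The function names are hypothetical and chosen only for this example.

```python
import math


def transmission_probability(beta: float, tau: float) -> float:
    """Per-link transmission probability p = 1 - exp(-beta * tau).

    beta: rate of the Poisson infection process on a link.
    tau:  fixed time a node stays infectious before being removed.
    """
    if beta < 0 or tau < 0:
        raise ValueError("beta and tau must be non-negative")
    return 1.0 - math.exp(-beta * tau)


def rate_for_probability(p: float, tau: float) -> float:
    """Inverse map: the rate beta that yields transmission probability p."""
    if not 0.0 <= p < 1.0 or tau <= 0:
        raise ValueError("need 0 <= p < 1 and tau > 0")
    return -math.log(1.0 - p) / tau
```

For instance, with $\tau = 1$ a rate of $\beta = \ln 2$ gives $p = 0.5$, so half of the contact links would be kept in the corresponding bond-percolation picture.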

We model the effect of applications on the disease spreading as follows: if an app-user infects another app-user, that second node will get infected but will quarantine themselves with probability $p_{app}$. The quarantined user will have no further connections that would spread the infection they received from the other app-user. A substantial deviation from a realistic spreading case in our model is that the quarantine does not prevent the disease from spreading to the quarantined node through a third node. That is, we only model the primary infection path from the other app-user causing the alarm but do not stop the possible concurrent secondary infection paths from a third node. Strictly speaking, this simplification in the modeling returns a lower bound on the effectiveness of the app-based contact tracing, but given that our contact network models are sparse random graphs (see Section I C) that do not contain local loops, the difference can only be observed if a large enough fraction of the population is infected at the same time. Critically, this does not affect the epidemic threshold but could have implications for parameter regions where the epidemic size is large, depending on the quarantine durations.

The SIR spreading process can be mapped to a slightly more complicated percolation problem in the presence of apps [44, 71] . To model app-user quarantines, one needs to delete the links between two app-users with the probability of successful quarantine due to the app, p app . This ensures that we ignore the infection paths through two app-users when one of them is successfully quarantined. However, removing these links also removes the second app-user from the component, even though they are infected. To correct this, we need first to find the network components and then extend them by including all app-users outside of the component connected to another app-user (and considering the probability p that the link is kept). See Fig. 1 for an illustration of this process, which leads us to two definitions of components: normal and extended. 

In the SIR model without apps, the component size distribution can be used to describe the late stages of the epidemics approximately. Given an initially infected node, the size of the component it belongs to determines the size of the outbreak. In an infinitely large population, we say that an outbreak is an epidemic if it spans a nonzero fraction of the population. The relationship between percolation and the final epidemic size is straightforward if the population is large enough that it can be approximated with an infinite undirected transmission network [79, 88]. In this case, the percolation threshold gives the epidemic threshold and below it, an outbreak always spans only a zero fraction of the population because all the components are of finite size. Above the percolation threshold, there is a single giant component that spans a fraction $s_{\max} = S_{\max}/N$ of the nodes. This is equivalent to both the size of the epidemic, given that there is one, and the probability that there is an epidemic starting from a single initially infected node [79, 88]; $s_{\max}$ is the fraction of nodes that can be reached from the giant component (epidemic size) and the probability that a randomly chosen node belongs to the giant component (probability of the epidemic). The expected epidemic size, as a fraction of eventually infected nodes, is in this case given as $s_{\max}^2$. When we introduce apps to the spreading process, the equivalence of the epidemic size and epidemic probability breaks down. Both the normal component and the extended component become important. The normal component size $s_{\max}$ still gives the probability that there is an epidemic, as is the case without the apps. However, the epidemic size, given that there is one, is now given by the extended component size $s'_{\max}$. The expected epidemic size is then given by $s_{\max} s'_{\max}$.

Similar relationships hold for finite-size systems. For example, the expected size of the epidemic from a single source becomes

$$\langle S \rangle = \sum_c \frac{S_c}{N}\, S'_c , \qquad (1)$$

where $S_c$ is the normal size and $S'_c$ is the extended size of the component $c$, and $N$ is the total number of nodes. In this formula, $S_c/N$ gives the probability that the initially infected node is in the component $c$ and $S'_c$ gives the size of the epidemic if a node in component $c$ is chosen.
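The finite-size expectation described above, a sum over components of the probability of seeding a component times its extended size, can be sketched in a few lines. The function name and the input layout (a list of `(normal_size, extended_size)` pairs) are assumptions made for this example; $N$ is taken to be the sum of the normal sizes, since the normal components partition the nodes.

```python
def expected_epidemic_size(components):
    """Expected outbreak size from a uniformly random single seed.

    components: iterable of (S_c, S_ext_c) pairs, where S_c is the normal
    size and S_ext_c the extended size of component c.
    Returns sum_c (S_c / N) * S_ext_c in units of nodes.
    """
    components = list(components)
    n_nodes = sum(s for s, _ in components)  # normal components partition the nodes
    if n_nodes == 0:
        raise ValueError("no nodes given")
    return sum(s * s_ext for s, s_ext in components) / n_nodes
```

Without apps the two sizes coincide and the expression reduces to the usual $\sum_c S_c^2 / N$.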

We aim to study how the network topology and the amount and distribution of app-users over the network affect the epidemics. We study networks with degree distribution $P(k)$ and average degree $\langle k \rangle$ such that each node is an app-user with probability $\pi_a$ and not an app-user with probability $1 - \pi_a$. We distribute the app-users with one of two strategies: 1) uniformly at random or 2) by distributing the apps in the order of node degree such that the high-degree nodes get the apps first.

We use three different models to generate the network topology. We use i) Poisson (ER) random graphs [91] to model homogeneous contact patterns and ii) scale-free networks generated with the Chung-Lu model (CL) [92, 93] to model heterogeneous networks. In the generation of CL networks, the expected degree of each node is drawn from a continuous power-law distribution $P(k) \propto k^{-3}$ such that the minimum expected degree is set to a value that gives us the expected average degree $\langle k \rangle$ of our choice. Given a sequence of expected degrees $W = \{w_1, w_2, \ldots, w_n\}$, the Miller algorithm [94] assigns a link between node $u$ and node $v$ with probability $p_{uv} \propto w_u w_v$. This algorithm returns a network without multiple links and with almost the same power-law degree distribution.
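To make the Chung-Lu construction concrete, here is a minimal sketch. It is not the efficient algorithm of Ref. [94]: it loops over all node pairs, which is quadratic in the number of nodes and only suitable for small networks. The normalization $p_{uv} = \min(1, w_u w_v / \sum_i w_i)$ is the standard Chung-Lu choice and is an assumption here, as are the function names. For a continuous density proportional to $w^{-3}$ on $w \ge w_{\min}$ the mean is $2 w_{\min}$, which fixes the minimum expected degree.

```python
import random


def power_law_weights(n, mean_degree, rng, exponent=3.0):
    """Draw n expected degrees from a density ~ w**(-exponent), w >= w_min.

    The mean of this density is w_min * (exponent - 1) / (exponent - 2),
    so w_min is chosen to give the requested mean degree (exponent > 2).
    """
    w_min = mean_degree * (exponent - 2.0) / (exponent - 1.0)
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1].
    return [w_min * (1.0 - rng.random()) ** (-1.0 / (exponent - 1.0))
            for _ in range(n)]


def chung_lu_edges(weights, rng):
    """Naive O(n^2) Chung-Lu sampler: no self-links, no multi-links."""
    total = sum(weights)
    edges = []
    for u in range(len(weights)):
        for v in range(u + 1, len(weights)):
            if rng.random() < min(1.0, weights[u] * weights[v] / total):
                edges.append((u, v))
    return edges
```

A call such as `chung_lu_edges(power_law_weights(1000, 10, random.Random(1)), random.Random(2))` returns an edge list whose average degree fluctuates around 10; because of the heavy tail, single realizations can deviate noticeably.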

We model homophily (and heterophily) with regard to app usage with iii) a modular network model (MN) introduced in Refs. [95, 96] with two groups of nodes: app-users and non-app-users. This model starts with a degree sequence produced either by the ER or CL models and connects the nodes depending on which groups they belong to with probabilities $\pi_{aa}$ (app-user to app-user), $\pi_{an}$ (app-user to non-app-user), $\pi_{na}$ (non-app-user to app-user), and $\pi_{nn}$ (non-app-user to non-app-user). We only need to fix one of these probabilities, $\pi_{aa}$, and the other types of links are formed with probabilities

$$\pi_{an} = 1 - \pi_{aa}, \qquad \pi_{na} = \frac{\pi_a}{1 - \pi_a}\,(1 - \pi_{aa}), \qquad \pi_{nn} = 1 - \pi_{na},$$

where $\pi_a$ is the probability that a person is an app-user and the expression for $\pi_{na}$ comes from the balance between the number of links from app-users to non-app-users and from non-app-users to app-users, that is, $\pi_a N \pi_{an} \langle k \rangle = (1 - \pi_a) N \pi_{na} \langle k \rangle$. The numerical simulations of the MN work by randomly choosing a group for half-edges with the given probabilities and matching them to each other uniformly at random. This can lead to self-links and multi-links, which are discarded after the randomization procedure.
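The link-balance argument above fixes all four mixing probabilities once $\pi_a$ and $\pi_{aa}$ are chosen. The sketch below computes them and rejects parameter pairs for which the balance would require $\pi_{na} > 1$. It assumes, as in the balance equation, that app-users and non-app-users have the same average degree (random app distribution); the function name and the returned dictionary layout are illustrative.

```python
def mixing_probabilities(pi_a: float, pi_aa: float) -> dict:
    """Mixing probabilities of the two-group modular network.

    pi_a:  fraction of app-users (0 < pi_a < 1).
    pi_aa: probability that a link from an app-user leads to an app-user.
    Assumes equal mean degree in both groups (random app distribution).
    """
    if not 0.0 < pi_a < 1.0:
        raise ValueError("pi_a must lie strictly between 0 and 1")
    if not 0.0 <= pi_aa <= 1.0:
        raise ValueError("pi_aa must lie in [0, 1]")
    pi_an = 1.0 - pi_aa
    # Balance of app-to-non-app and non-app-to-app link ends.
    pi_na = pi_a * pi_an / (1.0 - pi_a)
    if pi_na > 1.0:
        raise ValueError("network not realizable for this (pi_aa, pi_a)")
    return {"aa": pi_aa, "an": pi_an, "na": pi_na, "nn": 1.0 - pi_na}
```

In the neutral case `mixing_probabilities(0.2, 0.2)` returns `na` equal to 0.2 up to rounding, while a strongly heterophilic choice such as `mixing_probabilities(0.8, 0.1)` raises an error because the few non-app-users cannot absorb all the required link ends.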

While there is no correlation between the app adoption status of connected nodes in the homogeneous (ER) or heterogeneous (CL) networks above, in the third model (MN) the existence of homophily or heterophily in the network structure is determined by comparing $\pi_{aa}$ to its value for the neutral case with no homophily or heterophily. In the absence of homophily or heterophily, $\pi_{aa} = \eta_a$, where $\eta_a$ is the ratio of the number of links that emerge from app-users to the total degree; this is because if the nodes were connected purely at random, the probability that a link from an app-user connects it to another app-user equals the ratio of the number of stubs that app-users have to the total number of stubs, i.e., $\eta_a$. In the case of a random selection of app-users, $\eta_a = \pi_a$, since both app-users and non-app-users have on average the same number of stubs and the fraction of stubs that app-users have equals the fraction of app-users in the system, i.e., $\pi_a$. In a high-degree targeting strategy, the number of stubs that app-users have on average is larger than that of non-app-users. In that case, $\eta_a$ can be calculated from the degree distribution (see Sec. II A). When $\pi_{aa} > \eta_a$, app-users are more likely to be connected to each other than in a network in which they are placed uniformly at random (homophilic network). On the other hand, when $\pi_{aa} < \eta_a$, nodes are more likely to be connected to nodes of the other type (heterophilic network). In the heterophilic regime, for some pairs of ($\pi_{aa}$, $\pi_a$), networks are not realizable because of the constraints explained in Sec. II A. The white region in Fig. 6 shows the part of the $\pi_{aa}$-$\pi_a$ plane in which networks cannot be created.

The epidemics are studied here with various methods of approximation. We employ analytical computations based on mean-field-type approximations to efficiently analyze our models' wide parameter space and provide explicit formulas for our main observable quantities. Here an approximation based on branching processes [97] can be used to determine the critical point. Following Ref. [44] , a more detailed calculation based on percolation arguments will give us the component sizes which can be related to the final epidemic size and epidemic probability. Simulations of the network connectivity then complement these mean-field approximations. Finally, we explore the accuracy of the mean-field approximations via explicit simulations of the SIR model.

To study the behavior of the epidemic dynamics, we form consistency equations for the giant component size. In Ref. [44] the governing equations for the size of the epidemic and the transition point were obtained for the case of random networks in the absence of homophily. Here we derive the analytical results for the more general case of the spectrum of heterophilic to homophilic networks, a special case of which is the non-homophilic networks of Ref. [44]. We consider that app-users and non-app-users might be connected together with a pattern different from pure random chance using the modular network model (MN). We aim to write the self-consistent equations for the probability $u_n$ that following a link to a non-app-user does not lead to the giant component and the probability $u_a$ that following a link to an app-user does not lead to the giant component. Using these probabilities, the relative size of the giant component $s$ and the relative size of the extended giant component $s'$ can be obtained, where $s$ is, in fact, the fraction of nodes infected through non-app-users, while $s'$ also includes individuals who caught the infection through an app-user before they could quarantine themselves (see Sec. II C 1).

We need to know the probability $u_n$ ($u_a$) that a randomly chosen link leading to a non-app-user (app-user) is not in the giant component. The probability that a node is not connected to the giant component via a particular neighboring non-app-user (app-user) is equal to the probability that this neighbor is not connected to the giant component via any of its other neighbors. A non-app-user is connected to another non-app-user with probability $\pi_{nn} = 1 - \pi_{na}$ and to an app-user with probability $\pi_{na}$. So, a link leading out from a non-app-user does not lead to the giant component if it leads to another non-app-user that is not in the giant component (which happens with probability $(1 - \pi_{na})u_n$) or an app-user that is not in the giant component (which happens with probability $\pi_{na}u_a$). That is, the total probability for following a link out from a non-app-user not leading to the giant component is $u_{n\to} = (1 - \pi_{na})u_n + \pi_{na}u_a$. Since the degree of neighboring nodes is distributed according to the excess degree distribution $q_k$, the probability that a non-app-user that is encountered by following a link to it is not connected to the giant component via any of its $k$ other neighbors is $\sum_k q_k u_{n\to}^k$. This probability is, by definition, $u_n$, leading to the self-consistent equation below for $u_n$:

$$u_n = \sum_k q_k u_{n\to}^k = g_1\big((1 - \pi_{na})u_n + \pi_{na}u_a\big), \qquad (2)$$

where $g_1$ is the generating function for the excess degree distribution [50]. To find $u_a$, we can use the same treatment, except that we should consider how app-app connections depend on the probability of success in contact tracing [44]. If $p_{app}$ is the probability that the apps work as expected, then $1 - p_{app}$ is the probability that the app-user does not effectively quarantine after having been in contact with an infectious app-user. Therefore, $u_a$ can be expressed as the self-consistent equation below:

$$u_a = g_1\big((1 - \pi_{aa})u_n + \pi_{aa}(p_{app} + (1 - p_{app})u_a)\big). \qquad (3)$$

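Note that $\pi_{na}$ is determined by the free parameters $\pi_a$ and $\pi_{aa}$, as we already showed that $\pi_{na} = \frac{\pi_a}{1 - \pi_a}(1 - \pi_{aa})$. Given $u_n$ and $u_a$, the average probability that a node belongs to the giant component, or equivalently the fraction of the network occupied by the giant component, is now given by:

$$s = 1 - (1 - \pi_a)\, g_0\big((1 - \pi_{na})u_n + \pi_{na}u_a\big) - \pi_a\, g_0\big((1 - \pi_{aa})u_n + \pi_{aa}(p_{app} + (1 - p_{app})u_a)\big), \qquad (4)$$

where $g_0$ is the generating function for the degree distribution. We can approximate $s'$ by writing:

$$s' = 1 - (1 - \pi_a)\, g_0\big((1 - \pi_{na})u_n + \pi_{na}u_a\big) - \pi_a\, g_0\big((1 - \pi_{aa})u_n + \pi_{aa}u_a\big), \qquad (5)$$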

where, in contrast to Eq. 4, the third term is not a function of $p_{app}$. The reason is that Eq. 4 assumes that if the app works (which happens with probability $p_{app}$), then the probability that a link connected to an app-user does not lead to the giant component is 1 (while if the app does not work it is $u_a$). However, whether the app works or not, the probability that an app-user does not get infected from another app-user is $u_a$: when the app works and the second app-user is infected, she quarantines herself and does not infect any other node.

When the transmission probability $p$ is less than 1 (in the above equations it was assumed that the links transmit with probability 1), Eqs. 2 and 3 change to:

$$u_n = 1 - p + p\, g_1\big((1 - \pi_{na})u_n + \pi_{na}u_a\big), \qquad (6)$$

$$u_a = 1 - p + p\, g_1\big((1 - \pi_{aa})u_n + \pi_{aa}(p_{app} + (1 - p_{app})u_a)\big). \qquad (7)$$

When the fraction $\pi_a$ of nodes selected to adopt the app are all the highest-degree nodes in the network, these nodes all have a degree higher than $k_a - 1$, such that they include some of the degree-$k_a$ nodes and all nodes with degree larger than $k_a$. Then for the fraction $\eta_a$ of the links protruding from the app-users (which are the top $\pi_a$ fraction of nodes) we can write:

where $r^*$ is the fraction of degree-$k_a$ nodes that are app-users, and in Eq. 9 we absorbed $r^*$ into $p_k$ so that $p_{k_{a,\mathrm{right}}} = r^* p_{k_a}$ represents the fraction of nodes in the network that have degree $k_a$ and are app-users (so in Eq. 9, $k_{a,\mathrm{right}}$ takes the value $k_a$).
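The quantities in this construction can be computed directly from a degree distribution. The sketch below hands out apps from the highest degree downwards until a fraction $\pi_a$ of nodes is covered, and returns the threshold degree $k_a$, the partially covered fraction $r^*$ of degree-$k_a$ nodes, and the stub fraction $\eta_a$. It is an illustration built from the verbal definitions above rather than from the paper's own formulas; the function name and the dictionary input are assumptions.

```python
def targeted_link_fraction(degree_dist, pi_a):
    """High-degree targeting: return (k_a, r_star, eta_a).

    degree_dist: dict mapping degree k to probability p_k (summing to 1).
    pi_a: fraction of nodes that receive the app, highest degrees first.
    eta_a is the share of all link ends (stubs) held by app-users.
    """
    mean_k = sum(k * p for k, p in degree_dist.items())
    remaining = pi_a          # node fraction still to be given the app
    app_stubs = 0.0           # sum of k * (covered fraction of degree k)
    k_a, r_star = None, 0.0
    for k in sorted(degree_dist, reverse=True):
        p_k = degree_dist[k]
        if p_k <= 0:
            continue
        take = min(p_k, remaining)
        if take <= 0:
            break
        app_stubs += k * take
        k_a, r_star = k, take / p_k   # lowest degree reached so far
        remaining -= take
    return k_a, r_star, app_stubs / mean_k
```

With `degree_dist = {1: 0.5, 2: 0.3, 4: 0.2}` and `pi_a = 0.35`, all degree-4 nodes and half of the degree-2 nodes get the app, and $\eta_a \approx 0.58$ exceeds $\pi_a$, as expected when app-users hold more stubs than average.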

Then for a network with homo/heterophily:

and

A special case of which are networks with neutral (nonexisting) homophily, where π aa is obtained to be equal to η a and accordingly π na = η a , therefore,

and

These results predict the behavior of the epidemic dynamics in the thermodynamic limit. Therefore they describe the dynamics very well when the network size is large enough.
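A numerical sketch of how such consistency equations can be solved is given below for the simplest setting: a Poisson (ER) network, for which $g_0(x) = g_1(x) = e^{\langle k \rangle (x - 1)}$, with randomly distributed apps. The iteration starts from $u_n = u_a = 0$, which for these monotone maps converges to the smallest solution in $[0, 1]$. The expressions used for $s$ and $s'$ follow the verbal description in the text (the app-user term of $s'$ does not depend on $p_{app}$) and should be read as an assumption of this example, as should the function name.

```python
import math


def solve_giant_components(mean_k, pi_a, pi_aa, p_app, p=1.0,
                           tol=1e-12, max_iter=100000):
    """Return (s, s_ext): normal and extended giant component fractions.

    Poisson degree distribution with mean mean_k, random app distribution.
    """
    pi_na = pi_a * (1.0 - pi_aa) / (1.0 - pi_a)
    if not 0.0 <= pi_na <= 1.0:
        raise ValueError("network not realizable for this (pi_aa, pi_a)")

    def g(x):  # generating function; g_0 = g_1 for Poisson degrees
        return math.exp(mean_k * (x - 1.0))

    def arguments(u_n, u_a):
        arg_n = (1.0 - pi_na) * u_n + pi_na * u_a
        arg_a = (1.0 - pi_aa) * u_n + pi_aa * (p_app + (1.0 - p_app) * u_a)
        return arg_n, arg_a

    u_n = u_a = 0.0
    for _ in range(max_iter):
        arg_n, arg_a = arguments(u_n, u_a)
        new_n = 1.0 - p + p * g(arg_n)
        new_a = 1.0 - p + p * g(arg_a)
        converged = abs(new_n - u_n) + abs(new_a - u_a) < tol
        u_n, u_a = new_n, new_a
        if converged:
            break

    arg_n, arg_a = arguments(u_n, u_a)
    arg_ext = (1.0 - pi_aa) * u_n + pi_aa * u_a  # no p_app in the extended size
    s = 1.0 - (1.0 - pi_a) * g(arg_n) - pi_a * g(arg_a)
    s_ext = 1.0 - (1.0 - pi_a) * g(arg_n) - pi_a * g(arg_ext)
    return s, s_ext
```

With `p_app = 0` the two sizes coincide and reduce to the ordinary ER giant component; with `p_app = 1` the normal size falls below the extended size, which is the gap between epidemic probability and epidemic size discussed in the text.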

B. Mean-field approximation for the branching process

An alternative to writing the consistency equations for the giant component size is to assume that a branching process governs the epidemic dynamics. Then, a straightforward way of finding the epidemic threshold in the SIR model is to find the critical point of a branching process, where the branching factor is given by the expected excess degree $\langle q \rangle$. In the epidemic setting, the branching factor $\tilde{k}_e = p\langle q \rangle$ gives the expected number of people one infected person infects during the epidemic process. Note that the branching factor has been used as the definition of the basic reproduction number $R_0$ [88], but it is different from the basic reproduction number when the latter is defined in networks as $R_0 = (\beta/\gamma)\langle k \rangle$ [80]. In the SIR model with the app, we need to duplicate the populations so that we separately track the ones without the app ($S_n$, $I_n$ and $R_n$) and with the app ($S_a$, $I_a$ and $R_a$).

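Given that the apps are uniformly distributed to a fraction $\pi_a$ of the nodes and $\tilde{k}_e$ is the branching factor, we can write a mean-field approximation based on the branching process as follows:

$$I_n^{t+1} = \tilde{k}_e\big(\pi_{nn} I_n^t + \pi_{an} I_a^t\big), \qquad I_a^{t+1} = \tilde{k}_e\big(\pi_{na} I_n^t + \pi_{aa}(1 - p_{app}) I_a^t\big).$$

By defining $a = \pi_{nn}\tilde{k}_e$, $b = \pi_{an}\tilde{k}_e$, $c = \pi_{na}\tilde{k}_e$ and $d = \pi_{aa}\tilde{k}_e(1 - p_{app})$, the difference equations can be written as:

$$X_{t+1} = A X_t, \qquad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

where $X_t = (I_n^t, I_a^t)^{T}$. The steady state $X_{t+1} = X_t$ is possible if all the eigenvalues $\lambda$ of the transition matrix $A$ (whether real or complex) have an absolute value which is less than 1;

$$\lambda_{\pm} = \frac{(a + d) \pm \sqrt{(a - d)^2 + 4bc}}{2}, \qquad |\lambda_{\pm}| < 1.$$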

Without contact tracing, there is a chance of an epidemic when the initial reproductive number is $\tilde{k}_e > 1$. In the case of app adoption, the critical value of app-users $\pi_a^c$ that is needed for reducing the reproductive number can be derived by setting $\lambda = 1$, which leads to:

When apps work perfectly, the epidemic threshold is given by:

For each value of $\pi_a$ there is a non-trivial optimum value $\pi_{aa}^{opt}$ that leads to the largest epidemic threshold in terms of the branching factor, which is:

The critical app adoption can be also calculated as:

In the absence of homo/heterophily, $\pi_{aa} = \pi_a$, and Eq. 20 gives the same result as Ref. [44], such that:

Vazquez [97] also provides a clear way of combining different intervention strategies and shows how our specific results about application homophily are affected by other interventions.
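The threshold condition of the two-type branching process can also be evaluated numerically. The sketch below builds the $2 \times 2$ matrix from the coefficients $a$, $b$, $c$, $d$ defined above, arranged as rows $(a, b)$ and $(c, d)$ acting on $(I_n, I_a)$, which is an assumption about the layout, and uses the closed-form eigenvalues of a $2 \times 2$ matrix. Because every entry is proportional to the branching factor, the critical branching factor is the inverse of the spectral radius at unit branching factor. The function names are illustrative.

```python
import math


def spectral_radius(k_e, pi_a, pi_aa, p_app):
    """Largest |eigenvalue| of the mean-field matrix [[a, b], [c, d]]."""
    pi_an = 1.0 - pi_aa
    pi_na = pi_a * pi_an / (1.0 - pi_a)
    pi_nn = 1.0 - pi_na
    if pi_nn < 0.0:
        raise ValueError("network not realizable for this (pi_aa, pi_a)")
    a = pi_nn * k_e
    b = pi_an * k_e
    c = pi_na * k_e
    d = pi_aa * k_e * (1.0 - p_app)
    # b * c >= 0, so the discriminant is non-negative and eigenvalues are real.
    disc = math.sqrt((a - d) ** 2 + 4.0 * b * c)
    return max(abs((a + d + disc) / 2.0), abs((a + d - disc) / 2.0))


def critical_branching_factor(pi_a, pi_aa, p_app):
    """Branching factor at which the spectral radius equals 1."""
    return 1.0 / spectral_radius(1.0, pi_a, pi_aa, p_app)
```

As a check, with non-functioning apps (`p_app = 0`) and neutral mixing the critical branching factor is 1, the ordinary SIR threshold. With perfect apps, neutral mixing and half of the population using the app, the same calculation gives $\sqrt{5} - 1 \approx 1.24$.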

Next, we describe how to extract the giant component in simulated networks and how these simulation results can be used to find the critical points of the disease spreading process. The component sizes can also be used to find the epidemic size distributions as described in Section I B.

In each simulation run, we simulate one network structure $G$ and distribute the apps to the nodes according to one of the models described in Section I C. From the original network $G$, we keep each link with probability $p = 1 - e^{-\beta\tau}$, which is the probability of infection going through a link without apps. We also remove each of the remaining links between two app-users with probability $p_{app}$ and call the resulting network $G_a$. The components of graph $G_a$ are the normal components.

The extended components can be obtained by going through every normal component and extending it. For every app-user $\alpha$ in the component $C$, we go through its neighbors $n_\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ in the original network $G$. If $\alpha_i$ is an app-user and not in the component, $\alpha_i \notin C$, we add it to the component extension $C'$ with probability $p$. The total set of infected nodes, if starting from a node in $C$, will be $C \cup C'$. As these are disjoint sets, we can compute the sizes as $S'_C = |C| + |C'|$ and $S_C = |C|$.
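The two-step procedure described above, percolating the network to get normal components and then extending each component by app-user neighbors, can be sketched with the standard library alone. The function name, argument layout, and the use of a depth-first search are choices made for this example; `app_users` is assumed to be a `set` of node indices and nodes are assumed to be labeled `0 .. n-1`.

```python
import random
from collections import defaultdict


def normal_and_extended_components(n, edges, app_users, p, p_app, rng):
    """Return a list of (C, C_ext) pairs of node sets.

    C is a normal component of the percolated network G_a; C_ext holds the
    app-users outside C that are reached from an app-user inside C.
    """
    adj_full = defaultdict(set)   # original network G
    adj_kept = defaultdict(set)   # percolated network G_a
    for u, v in edges:
        adj_full[u].add(v)
        adj_full[v].add(u)
        if rng.random() >= p:
            continue              # link does not transmit
        if u in app_users and v in app_users and rng.random() < p_app:
            continue              # successful quarantine cuts the app-app link
        adj_kept[u].add(v)
        adj_kept[v].add(u)

    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:              # depth-first search in G_a
            x = stack.pop()
            for y in adj_kept[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        ext = set()
        for a in comp & app_users:
            for b in adj_full[a]:
                if (b in app_users and b not in comp and b not in ext
                        and rng.random() < p):
                    ext.add(b)
        result.append((comp, ext))
    return result
```

On the path `0-1-2-3` with app-users `{1, 2}`, `p = 1` and `p_app = 1`, the app-app link is cut, giving normal components `{0, 1}` and `{2, 3}`, each of extended size 3. This illustrates why the extended sizes can sum to more than the number of nodes.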

In numerical simulations of finite size systems, we can use the peak of a susceptibility measure to find the critical transition point. Theoretically, susceptibility [84] is a measure of fluctuation in the component sizes, which is singular at the epidemic threshold (the critical point). In network percolation studies, it is defined as the expected growth in the size of the giant component when a random link is added to the network. Therefore, susceptibility in an ordinary percolation problem can be written as:

where $S_c$ is the size of the component $c$ and $c_{\max} = \arg\max_c S_c$ is the largest component.

Here, we are dealing with two types of components, and as is shown in Fig. 2D, the sum of the extended component sizes divided by the network size, $S'/N$, can be larger than one. Susceptibility should be a monotonically decreasing function in the supercritical regime. However, plugging the extended component sizes into Eq. 25 results in a growth in the tail of susceptibility, turning it into a non-monotonic function in the supercritical regime. Therefore, this formulation of susceptibility is not suitable in the current case, since the maximum of Eq. 25 could lead to estimates of critical points that are very far from the actual one. Instead, we can use the expected growth in the extended giant component, which can be computed as:

where $S_c$ and $S'_c$ are the size and the extended size of the component $c$, and $c_{\max} = \arg\max_c S'_c$ is the largest component measured in the extended size.
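One way to turn the "expected growth" idea into a computation is sketched below. It assumes the form $\chi = \frac{1}{N}\sum_{c \ne c_{\max}} S_c S'_c$: a random node lands in component $c$ with probability $S_c/N$, and attaching it to the giant component adds $S'_c$ nodes. This normalization is an assumption of the example, not a formula confirmed by the text; with identical normal and extended sizes it reduces to the common percolation form $\frac{1}{N}\sum_{c \ne c_{\max}} S_c^2$.

```python
def susceptibility(sizes, ext_sizes=None):
    """Susceptibility from component sizes, excluding the largest component.

    sizes:     normal component sizes S_c (they sum to the node count N).
    ext_sizes: extended sizes S'_c; defaults to the normal sizes, which
               gives the ordinary percolation susceptibility.
    The largest component is selected by its extended size.
    """
    if ext_sizes is None:
        ext_sizes = sizes
    if len(sizes) != len(ext_sizes) or not sizes:
        raise ValueError("need two non-empty lists of equal length")
    n_nodes = sum(sizes)
    c_max = max(range(len(sizes)), key=lambda c: ext_sizes[c])
    return sum(s * e for c, (s, e) in enumerate(zip(sizes, ext_sizes))
               if c != c_max) / n_nodes
```

Scanning this quantity over the transmission probability and locating its peak gives a finite-size estimate of the critical point.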

Finally, we will perform explicit simulations of the spreading processes to confirm the theoretical results we arrived at via the approximations we presented above. The effect of tracking applications can be integrated into compartment model simulation by introducing separate susceptible and infected compartments for people with and without the app. The interactions between people with no app installed is similar to those of the normal SIR process, namely, susceptible individuals with no app (S n ) can become infected (I n ) by being in contact with infected people that either do not have the app installed (I n ) or have it installed (I a ). However, if a susceptible individual with the app (S a ) comes into contact with an infected individual with app (I a ), they will become infected but they will also receive infection notification from the app which means they will be quarantined (I q ). Quarantined individuals cannot infect anyone else. Eventually, all the infected individuals will move to the recovered compartment after a constant predetermined amount of time (1/γ) has passed from the beginning of their infection. The recovered compartment is divided into three compartments R n , R a , and R q to track which infected compartment the node is originating from.

The set of all reactions can be written as follows:
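Read directly from the compartments described above, the reactions can be sketched as follows. Splitting the app-to-app transmission by the quarantine probability $p_{app}$ is an assumption consistent with the model definition; the recovery arrows denote a deterministic transition after a time $1/\gamma$ rather than a rate.

```latex
\begin{align*}
S_n + I_n &\xrightarrow{\;\beta\;} I_n + I_n, &
S_n + I_a &\xrightarrow{\;\beta\;} I_n + I_a, \\
S_a + I_n &\xrightarrow{\;\beta\;} I_a + I_n, \\
S_a + I_a &\xrightarrow{\;\beta\, p_{app}\;} I_q + I_a, &
S_a + I_a &\xrightarrow{\;\beta\,(1 - p_{app})\;} I_a + I_a, \\
I_n &\xrightarrow{\;1/\gamma\;} R_n, &
I_a &\xrightarrow{\;1/\gamma\;} R_a, \qquad
I_q \xrightarrow{\;1/\gamma\;} R_q .
\end{align*}
```

Quarantined individuals ($I_q$) appear on the left-hand side only of a recovery step, reflecting that they cannot infect anyone else.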

Note that while edge reactions are governed by Poisson processes happening at a constant rate $\beta$, unlike in most common SIR models, node reactions are governed by a constant cutoff time $1/\gamma$ and happen exactly $1/\gamma$ units of time after the infection of the node. As interactions in the simulation are bound to take place over edges of a static network, with nodes belonging to each of the compartments, as shown in Sec. III, the results are similar to a component size simulation (described in Sec. II C) on a network with effective connectivity of $\tilde{k}_e = \langle k \rangle (1 - e^{-\beta/\gamma})$. As only the ratio between $\beta$ and $\gamma$ enters as a parameter in the model, we set the value of $\gamma$ to 1.

Each simulation starts from a single infected node and runs in discrete time steps of $10^{-4}$ units until no further reaction is possible. The final number of nodes that end up in $R_q$, $R_a$, and $R_n$ determines the total size of the infection, which corresponds to the extended component size $S$ of the component that the initial seed node belongs to. The final combined size of the $R_n$ and $R_a$ compartments, however, represents the size $S_n$ of the component that the seed node (index case) would belong to had we removed the app-app links. By adding the $I_a$ and $I_q$ compartments to the normal SIR process, and linking them to the state of the source of infection and the internal state of each node, we include information about the history of the spreading agent more than one step back in the simulation of the spreading process.
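A discrete-time simulation of this kind can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for the results: the function name, the adjacency-dictionary input, and the default step `dt = 1e-2` (coarser than the $10^{-4}$ used in the study, for speed) are all choices made here, quarantine is assumed to be perfect, and when several infectors reach the same node in one step the first one encountered decides its compartment.

```python
import math
import random


def simulate_app_sir(adj, has_app, seed, beta, gamma=1.0, dt=1e-2, rng=None):
    """Run one outbreak; adj maps node -> list of neighbours, has_app maps node -> bool.
    Returns the final counts of the R_n, R_a and R_q compartments."""
    rng = rng or random.Random()
    p_edge = 1.0 - math.exp(-beta * dt)            # per-step transmission probability
    steps_infectious = round(1.0 / (gamma * dt))   # fixed infectious period 1/gamma
    state = {v: "S" for v in adj}
    state[seed] = "Ia" if has_app[seed] else "In"
    remaining = {seed: steps_infectious}           # steps left until recovery
    while remaining:
        new_infections = {}
        for u in remaining:
            if state[u] == "Iq":
                continue                           # quarantined nodes do not transmit
            for v in adj[u]:
                if state[v] == "S" and v not in new_infections and rng.random() < p_edge:
                    if has_app[v]:
                        # an app user infected by an app user is notified and quarantined
                        new_infections[v] = "Iq" if state[u] == "Ia" else "Ia"
                    else:
                        new_infections[v] = "In"
        for u in list(remaining):                  # recovery after the fixed delay
            remaining[u] -= 1
            if remaining[u] == 0:
                state[u] = "R" + state[u][1]       # In -> Rn, Ia -> Ra, Iq -> Rq
                del remaining[u]
        for v, s in new_infections.items():
            state[v] = s
            remaining[v] = steps_infectious
    return {c: sum(1 for s in state.values() if s == c) for c in ("Rn", "Ra", "Rq")}
```

In this sketch the extended component size of the seed is `Rn + Ra + Rq`, and the normal component size is `Rn + Ra`.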

We next illustrate, using the theory and simulations introduced in Sec. II, how the various parameters affect the epidemic sizes and epidemic probabilities. The simulation studies are done on networks of $10^4$ nodes and averaged over 10 realizations. We use two network topologies: homogeneous networks (Erdős-Rényi networks) with expected degree $k = 10$, and random networks with an expected degree sequence drawn from a power-law degree distribution $p(k) \propto k^{-3}$, with the minimum degree cutoff adjusted such that the average degree is 10 [94].
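One way to obtain such an expected degree sequence is sketched below. It assumes a continuous power law (Pareto) approximation, for which the mean is $k_{\min}(\alpha-1)/(\alpha-2)$ when $\alpha > 2$; the exact discretization and cutoff procedure used in the study may differ, and the function names are illustrative.

```python
import random


def min_cutoff_for_mean(mean_degree: float, alpha: float = 3.0) -> float:
    """Minimum cutoff k_min giving the requested mean for a continuous power law
    p(k) ~ k^(-alpha), k >= k_min (the mean is finite only for alpha > 2)."""
    if alpha <= 2.0:
        raise ValueError("the mean diverges for alpha <= 2")
    return mean_degree * (alpha - 2.0) / (alpha - 1.0)


def sample_expected_degrees(n, mean_degree=10.0, alpha=3.0, rng=None):
    """Draw n expected degrees by inverse-transform sampling of the Pareto law."""
    rng = rng or random.Random()
    k_min = min_cutoff_for_mean(mean_degree, alpha)
    # 1 - rng.random() lies in (0, 1], so the power below is always finite
    return [k_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]
```

With $\alpha = 3$ and a target mean of 10, the cutoff is $k_{\min} = 5$. Because the variance diverges at this exponent, the sample mean of a finite sequence converges slowly to the target. A weight list like this can be passed to a generator for graphs with given expected degrees, such as NetworkX's `expected_degree_graph`.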

The difference between the epidemic probability (normal component size) and the epidemic size (extended component size), as given by Eqs. (4) and (5), is a phenomenon specific to epidemics in the presence of app adopters. Breaking the equivalence of these two measures can have practical consequences, as illustrated in Fig. 2A. The difference between the two grows with the fraction of app-users $\pi_a$. For example, when $\pi_a = 0.8$ and the epidemic probability (the normal component size) is $s_{\max} \approx 0.5$, the epidemic size (the extended component size) reaches $s_{\max} \approx 0.8$. This is also reflected in the expected epidemic sizes (see Fig. 2B and Eq. (1)). Although the two component definitions differ from each other, they still display the transition at the same point, and this point can be measured numerically using the susceptibilities defined in Eqs. (25)-(26) (see Fig. 2C).

Unlike the normal component size, the extended component size is not a conserved quantity: the extended component sizes $S$ do not necessarily sum to the number of nodes $N$. Instead, the sum of component sizes can be significantly larger than the number of nodes (see Fig. 2D), and the maximum value it can reach grows with the fraction of application users $\pi_a$. The deviation from $S/N = 1$ reaches its maximum for disease parameters above the threshold values, but when the disease reaches a large enough part of the population, the fraction $S/N$ starts to decay, reaching $S/N = 1$ when everybody belongs to the normal giant component.
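The non-conservation can be seen on a toy example. The sketch below assumes perfect quarantine and takes the extended component to be a normal component (computed with app-app links removed) plus every app user reachable from it over a single app-app link; the function name and the edge-list input are illustrative.

```python
def component_sizes(nodes, edges, app_users):
    """Return {root: (normal_size, extended_size)} for every normal component.
    Normal components ignore app-app links; extended components add the app
    users that would be infected and quarantined over such links."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    app_app = []
    for u, v in edges:
        if u in app_users and v in app_users:
            app_app.append((u, v))          # link is cut by tracing
        else:
            parent[find(u)] = find(v)       # link stays in the normal component

    members = {}
    for v in nodes:
        members.setdefault(find(v), set()).add(v)

    extended = {root: set(m) for root, m in members.items()}
    for u, v in app_app:
        extended[find(u)].add(v)            # v is infected by u, then quarantined
        extended[find(v)].add(u)            # and vice versa
    return {root: (len(members[root]), len(extended[root])) for root in members}
```

On the path 0-1-2-3 with app users {1, 2}, the normal components {0, 1} and {2, 3} have sizes summing to $N = 4$, whereas the extended components {0, 1, 2} and {1, 2, 3} have sizes summing to 6.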

The assumption in Section III A is that i) the apps work perfectly and ii) an app-user always self-isolates before having a chance to spread the infection, meaning that there are no quarantine failures, $p_{\mathrm{app}} = 1$. It is of practical significance to investigate the effects of quarantine failures [45] on the epidemic threshold and epidemic size. Fig. 3 shows that in the absence of major quarantine failures, epidemic tracing and mitigation with apps can still be a valid strategy if the app adoption level in a society is high enough. The app adoption rate $\pi_a$ has a larger effect than the rate at which the apps function, but both need to be relatively high for the apps to have a significant impact.

Even above the epidemic threshold, the apps can be useful. Especially when the application adoption $\pi_a$ is high, the outbreak size (Fig. 3B-C) and the epidemic probability (Fig. 3D) both remain small even if the quarantines are very unreliable. Again, app adoption and quarantine reliability are both essential, with the app adoption rate being the more important of the two. Note that $\bar{k}_e = 1.8$ is chosen as an illustrative example of a parameter region with interesting behavior in the various component sizes: it is large enough that, without any intervention, there is wide epidemic spreading, but small enough that the spread can be controlled without extreme measures.

Real networks are degree-heterogeneous, and this heterogeneity has a strong effect on the final outbreak size and the epidemic threshold. Fig. 4 shows the expected epidemic sizes with two different strategies in app adoption, random and high-degree targeting, for different fractions of app-users $\pi_a$ in the network. In homogeneous networks (Fig. 4A), contact tracing decreases the expected epidemic size and pushes the epidemic threshold forward. These effects can be further amplified by shifting to high-degree targeting in app adoption. With 80% of the population being app-users, the epidemic threshold can move from $\bar{k}_e = 1$ to $\bar{k}_e = 4$, which means that at that point the expected epidemic size is zero, while without contact tracing it would be almost 1. Note that in homogeneous networks, the effective average degree of the contact network, $\bar{k}_e$, corresponds well to the reproduction number of the infection.

In networks with degree heterogeneity, the epidemic threshold vanishes in normal SIR processes. This effect persists in contact-traced epidemics if we distribute the apps uniformly at random. However, Fig. 4B makes clear that contact tracing can significantly reduce the expected epidemic size even when the apps are randomly distributed and the epidemic threshold remains unchanged. With the high-degree targeting strategy, it is possible to move the epidemic threshold. Comparing the expected epidemic sizes at different values of $\bar{k}_e < 3$ shows that in real-world situations, app adoption by superspreaders is of significant importance. Since hubs become the app-users, this strategy has drastic effects on the size and threshold of the epidemic: the threshold gets pushed from somewhere near zero to a value $\bar{k}_e > 5$ with the app adoption rate $\pi_a = 0.8$. Therefore, the reproduction number can be controlled much more effectively with the high-degree targeting strategy.
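The two adoption strategies can be sketched as follows, assuming that high-degree targeting means giving the app to the top fraction $\pi_a$ of nodes ranked by degree; the function name and the strategy labels are illustrative and the study's exact procedure may differ.

```python
import random


def assign_apps(degrees, pi_a, strategy="random", rng=None):
    """Select the set of app users; degrees maps node -> degree."""
    nodes = list(degrees)
    n_apps = round(pi_a * len(nodes))
    if strategy == "random":
        rng = rng or random.Random()
        return set(rng.sample(nodes, n_apps))      # uniformly random adoption
    if strategy == "high_degree":
        ranked = sorted(nodes, key=lambda v: degrees[v], reverse=True)
        return set(ranked[:n_apps])                # hubs adopt first
    raise ValueError(f"unknown strategy: {strategy}")
```

In a heterogeneous network, the high-degree variant places the apps on the hubs, so a disproportionate share of edges become app-app links that tracing can cut.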

In previous sections, we assumed that app-users are distributed with random mixing patterns: whether one of a node's connections is an app-user has no effect on the probability of that node being an app adopter. Next, we explore how homophily/heterophily in app usage affects epidemics, using the modular network model (MN). A Swiss experiment has reported that while only a small fraction $\pi_a = 0.2$ of people used the app, the connectivity among them was high enough that $\pi_{aa} = 0.7$ [40]. Fig. 5 illustrates that increasing heterophily leads to a lower epidemic threshold and a larger epidemic size for a fixed $\bar{k}_e$. Increasing homophily from random mixing is initially preferable, but the optimum lies between random mixing and full homophily. For the expected epidemic size, strong heterophily is especially detrimental (see Fig. 5A for the homogeneous network with random app adoption and Fig. 5C for the high-degree targeting strategy). The optimum value for heterophily/homophily is evident for the epidemic thresholds in Fig. 5B and Fig. 5D for the random and high-degree targeting strategies, respectively. Fig. 6B gives a clearer picture of the existence of an optimum value for the epidemic threshold in the case of homophily. According to Eq. 21, for each fraction of app-users $\pi_a$ in the network, the epidemic threshold $\bar{k}_c(\pi_a, \pi_{aa})$ can be maximised by controlling the homophily in app adoption $\pi_{aa}$. The pattern in Fig. 6B is very similar to the convex pattern in Fig. 5B, even though they are calculated using different approximations and approaches (see Sec. II A and Sec. II B).

FIG. 5. The effect of homophily/heterophily in app adoption in homogeneous networks as described in Sec. III D. The homophily (heterophily) region is below (above) the diagonal $\pi_a = \pi_{aa}$. Expected epidemic size at $\bar{k}_e = 1.8$ for (A) random app adoption and (C) the high-degree targeting strategy. The epidemic threshold for (B) random app adoption and (D) the high-degree targeting strategy. Thresholds are from the theoretical results given by Eq. 21, and expected epidemic sizes are from percolation simulations. The empty white region marks parameter combinations for which such a homophilic/heterophilic population is impossible.

Another view on the effect of homophily and heterophily is given by finding the critical fraction of app-users $\pi_a^c$ that is needed to go beyond the epidemic threshold as a function of $\pi_{aa}$ and $\bar{k}_e$. Fig. 6A depicts this relationship based on Eq. 23 and shows that $\pi_a^c$ is not a monotonic function of $\pi_{aa}$: there is an optimal value of $\pi_{aa}$ that gives the lowest fraction of apps needed to stop the epidemic. Note that in a network without homophily or heterophily, $\pi_a^c$ increases monotonically as a function of the effective connectivity $\bar{k}_e$ (see the inset of Fig. 6A).

... for each $\pi_a$, which is given by Eq. 22. The pattern here is consistent with another approximation shown in Fig. 5B, while the epidemic threshold values are slightly different due to different levels of approximation. Note that here we display the epidemic threshold for all values of $\pi_{aa}$ and $\pi_a$ such that $0 \le \pi_{na} \le 1$, so networks with some of these parameters can be created in practice [95].

In this article, we have developed two flexible analytic approximations to SIR epidemics in the presence of contact tracing apps. First, we use a branching process to derive explicit analytical solutions for the epidemic thresholds. Second, we expand the framework of using self-consistent equations to analyze digital contact tracing [44], which is an alternative to other approaches [71]. In contrast to conventional SIR spreading, a full picture of late-stage epidemics in the presence of digital contact tracing is not given by a single observable (the component size) but requires two variables (the normal and extended component sizes). These correspond to the probability of the epidemic and the epidemic size, which are equivalent in the SIR process. Here we see that the two quantities can be significantly different if the number of application users is high.

Our numerical results illustrate that the effects of digital contact tracing can be very sensitive to the network structure, to how applications are distributed among the population, and to how well the tracing works. Realistic estimates of the effects of digital contact tracing can only be achieved if one can choose correct parameter ranges in a high-dimensional parameter space. In this study, we had six such parameters: the shape of the degree distribution, the average degree, the amount of heterophily/homophily, the application prevalence, the quarantine probability, and the targeting strategy. While we were able to establish and confirm basic laws governing individual parameters and some combinations of parameters, exploring such a parameter space fully for possible compound effects is out of reach for simulations. However, these effects can be largely revealed by inspecting the analytic equations we derived.

There are several open questions for which this study and other studies only hint at the answers. There are types of network structures that we ignore here. For example, heterophily and homophily could be built into the network in slightly different ways. A case study using a realistic agent-based model [69] has recently considered, among many other modeling choices aimed at precise calibration on the French population, the contributions of individuals of different ages. One could also develop a more realistic version of our stylized model to systematically analyze the effects of homophily caused by an age-based contact structure and different scenarios of app adoption within that structure. The age-based approach would also allow one to estimate the benefits of applications relative to the risk groups in this model.

Overall the problem of digital contact tracing offers not only a practical problem to solve but also an interesting theoretical puzzle because it introduces memory to the epidemic process. This memory is limited to one step within the tracing model we use here, but one could also use multi-step tracing, where also the second neighbors of infected nodes are quarantined in the case that the first neighbors have already passed on the infection. Further, here we ignore effects such as quarantines that do not directly stop the infection from one application user to another from spreading further. However, in the case of a strong group structure in the network, there could be situations where a non-application user A infects application user B, who alerts another application user C, who actually gets infected by A and stops the spreading because of the quarantine. Analyzing such more complicated phenomena can provide challenges for network scientists for years to come.

The simulations presented above were performed using computer resources within the Aalto University School of Science "Science-IT" project. AF acknowledges funding by Science Foundation Ireland Grant No. 16 

The heterogeneity in the number of contacts could also be modeled with other distributions, for example, the negative binomial distribution. This would have the advantage of having a non-divergent second moment supported by empirical evidence. However, we aimed to illustrate the effects of degree heterogeneity and not perform a systematic analysis. We already have many different random network models and combinations of parameters related to the app distribution, how well it works, and disease parameters. The equations we give make it possible for one to do such analysis if needed. Therefore we limited our main discussion to the differences observed in a power-law network with exponent −3 compared to the results for homogeneous networks. However, to satisfy the curiosity of the reader interested in extreme heterogeneity, we have now added Fig. 7 showing the expected epidemic size for exponent −2.5 as suggested by the referee.

Regarding quarantine failures, Figs. 8 and 9 show, as Fig. 3 did, that contact tracing can yield very good results in terms of raising the epidemic threshold and reducing the expected epidemic size if quarantines succeed at least 50% of the time and half of the people use the apps. This effect is more prominent with the high-degree targeting strategy, especially in heterogeneous networks, as shown in Figs. 8d and 9d. Figs. 10 and 11 show that there is an optimum value for homophily in app adoption, as was shown in Fig. 5 and Fig. 6. The only exception is when we follow a high-degree targeting strategy in heterogeneous networks. In this case, we can see the hub effect on the epidemic threshold and size.

FIG. 7. The expected epidemic size computed with the theoretical results introduced in Sec. II A for heterogeneous networks with degree distribution $P(k) \propto k^{-3}$ (solid lines) compared with ones with $P(k) \propto k^{-2.5}$ (dotted lines) as a function of the effective connectivity $\bar{k}_e$ when apps are distributed uniformly at random. Results are normalised to the network size $N$ and shown for $\pi_a \in \{0, 0.2, 0.4, 0.6, 0.8\}$ with different colors. Note that by lowering the exponent, epidemic thresholds get closer to zero and the expected epidemic sizes decrease since there are more low-degree nodes in the network. Therefore, while lowering the exponent adds more degree heterogeneity to the network, the physics of the phenomena does not change.

FIG. 8. The epidemic threshold as a function of quarantine probability $p_{\mathrm{app}}$ and app adoption rate $\pi_a$. The effect of quarantine failures in homogeneous networks with (a) random app adoption and (b) the high-degree targeting strategy, and in heterogeneous networks with a power-law degree distribution with (c) random app adoption and (d) the high-degree targeting strategy. All threshold values larger than 5 are shown with the same color.

FIG. 9. Expected epidemic size in the case of quarantine failures. Expected epidemic size at $\bar{k}_e = 1.8$ for homogeneous networks with (a) random app adoption and (b) the high-degree targeting strategy, and for heterogeneous networks with a power-law degree distribution with (c) random app adoption and (d) the high-degree targeting strategy. In (b) and (d) the pattern is different due to the effects of hubs. With a high-degree targeting strategy, quarantine failures are more significant since the infected nodes are highly influential on the spreading dynamics.

FIG. 10. The effect of homophily/heterophily in app adoption on the expected epidemic size. Expected epidemic size at $\bar{k}_e = 1.8$ from percolation simulations for homogeneous networks with (a) random app adoption and (b) the high-degree targeting strategy, and for heterogeneous networks with a power-law degree distribution with (c) random app adoption and (d) the high-degree targeting strategy. The empty white region marks parameter combinations for which such a homophilic/heterophilic population is impossible.

FIG. 11. The effect of homophily/heterophily in app adoption on the epidemic threshold and the optimum pattern for homophily. Epidemic thresholds for homogeneous networks with (a) random app adoption and (b) the high-degree targeting strategy, and for heterogeneous networks with a power-law degree distribution with (c) random app adoption and (d) the high-degree targeting strategy. The empty white region marks parameter combinations for which such a homophilic/heterophilic population is impossible.

Non-pharmaceutical interventions during the covid-19 pandemic: A review

Face masks effectively limit the probability of sars-cov-2 transmission

Modeling infectious diseases in humans and animals

Selective epidemiologic control in smallpox eradication

Contact tracing performance during the ebola epidemic in liberia

Contact tracing and the control of human immunodeficiency virus infection

Contact investigation for tuberculosis: a systematic review and meta-analysis

Modelling the impact of testing, contact tracing and household quarantine on second waves of covid-19

The impact of covid-19 and strategies for mitigation and suppression in low-and middle-income countries

Projecting the transmission dynamics of sars-cov-2 through the postpandemic period

Beyond covid-19: network science and sustainable exit strategies

Preparing for a responsible lockdown exit strategy

An early warning approach to monitor covid-19 activity with multiple digital traces in near real time

The effectiveness of backward contact tracing in networks

Assessment of physiological signs associated with covid-19 measured using wearable devices

Wearable sensors for covid-19: a call to action to harness our digital infrastructure for remote patient monitoring and virtual assessments

Prevalence of asymptomatic sars-cov-2 infection: a narrative review

The impact of biosensing in a pandemic outbreak: Covid-19

Wearable sensor data and self-reported symptoms for covid-19 detection

Pre-symptomatic detection of covid-19 from smartwatch data

High-resolution measurements of face-to-face contact patterns in a primary school

Rapid isothermal amplification and portable detection system for sars-cov-2

Direct detection of sars-cov-2 using crispr-cas13a and a mobile phone

Loss of smell and taste in combination with other symptoms is a strong predictor of covid-19 infection

Detecting covid-19 infection hotspots in england using large-scale self-reported data from a mobile application: a prospective, observational study

Hands-free smartphone-based diagnostics for simultaneous detection of zika, chikungunya, and dengue at point-of-care

Citizen science and smartphone e-entomology enables low-cost upscaling of mosquito surveillance

Controlling emerging infectious diseases like sars

Case isolation and contact tracing can prevent the spread of smallpox

Contact tracing and disease control

Infectious disease control using contact tracing in random and scale-free networks

Partner notification to prevent pelvic inflammatory disease in women: cost-effectiveness of two strategies

Factors that make an infectious disease outbreak controllable

Severijnen, Modeling prevention strategies for gonorrhea and chlamydia using stochastic network simulations

The effectiveness of contact tracing in emerging epidemics

Contact tracing and epidemics control in social networks

A population-based controlled experiment assessing the epidemiological impact of digital contact tracing

Automated and partly automated contact tracing: a systematic review to inform the control of covid-19

Early evidence of effectiveness of digital contact tracing for sars-cov-2 in switzerland

Contact tracing for covid-19-a digital inoculation against future pandemics

Digital proximity tracing on empirical contact networks for pandemic control

Examining sars-cov-2 interventions in residential colleges using an empirical network

Effect of manual and digital contact tracing on covid-19 outbreaks: a study on empirical contact data

The epidemiological impact of the nhs covid-19 app

Evolution and structure of the Internet: A statistical physics approach

Evolution of networks: From biological nets to the Internet and WWW

Statistical mechanics of complex networks

Epidemic spreading in scale-free networks

Beyond R0: heterogeneity in secondary infections and probabilistic epidemic forecasting

Absence of epidemic threshold in scale-free networks with degree correlations

Velocity and hierarchical spread of epidemic outbreaks in scale-free networks

Epidemic dynamics in finite size scale-free networks

Critical percolation on scale-free random graphs: new universality class for the configuration model

A statistical network analysis of the hiv/aids epidemics in cuba

Superspreading and the effect of individual variation on disease emergence

Why do some covid-19 patients infect many others, whereas most don't spread the virus at all

Why your friends have more friends than you do

Threshold effects for two pathogens spreading on a network

Secondary attack rate and superspreading events for sars-cov-2

Superspreading sars events, beijing

Quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing

The fallibility of contact-tracing apps

Birds of a feather: Homophily in social networks

Modelling the influence of human behaviour on the spread of infectious diseases: a review

An experimental study of homophily in the adoption of health behavior

Drivers of acceptance of covid-19 proximity tracing apps in switzerland: panel survey analysis

Anatomy of digital contact tracing: role of age, transmission setting, adoption and case detection

Tracking and promoting the usage of a covid-19 contact tracing app

Message-passing approach to epidemic tracing and mitigation with apps

Contact tracing in configuration models

The role of directionality, heterogeneity and correlations in epidemic risk and spread

Tracking the outbreak and far beyond: How are public authorities using mobile apps to control covid-19 pandemic, Multidisciplinary Perspectives of Communication in a Pandemic Context

Modeling the combined effect of digital exposure notification and non-pharmaceutical interventions on the covid-19 epidemic in washington state

Optimising the mitigation of epidemic spreading through targeted adoption of contact tracing apps

Epidemiological changes on the isle of wight after the launch of the nhs test and trace programme: a preliminary analysis

Homophily in the adoption of digital proximity tracing apps shapes the evolution of epidemics

Spread of epidemic disease on networks

Epidemic processes in complex networks

Error and attack tolerance of complex networks

Resilience of the internet to random breakdowns

Dynamical processes on complex networks

Critical phenomena in complex networks

Epidemic percolation networks, epidemic outcomes, and interventions, Interdisciplinary perspectives on infectious diseases

Random walk with memory on complex networks

Effects of memory on information spreading in complex networks

Epidemic size and probability in populations with heterogeneous infectivity and susceptibility

Second look at the spread of epidemics on networks

On the critical behavior of the general epidemic process and dynamical percolation

Random graphs

The average distances in random graphs with given expected degrees

Connected components in random graphs with given expected degree sequences

Efficient generation of networks with given expected degrees

Cascades on correlated and modular random networks

Gleeson, Dynamics on modular networks with heterogeneous correlations

Multitype branching and graph product theory of infectious disease outbreaks