key: cord-0477772-mp0irx0m
authors: Kong, Quyu; Garcia-Herranz, Manuel; Dotu, Ivan; Cebrian, Manuel
title: Contact Tracing: Computational Bounds, Limitations and Implications
date: 2021-02-26
journal: nan
DOI: nan
sha: 9e81b2a5837b36bfd23963e69e139bfd07d99da3
doc_id: 477772
cord_uid: mp0irx0m

Contact tracing has been extensively studied from different perspectives in recent years. However, there is no clear indication of why this intervention has proven effective in some epidemics (SARS) and mostly ineffective in some others (COVID-19). Here, we perform an exhaustive evaluation of random testing and contact tracing on novel superspreading random networks to try to identify which epidemics are more containable with such measures. We also explore the suitability of positive rates as a proxy of the actual infection statuses of the population. Moreover, we propose novel ideal strategies to explore the potential limits of both testing and tracing strategies. Our study counsels caution, both at assuming epidemic containment and at inferring the actual epidemic progress, with current testing or tracing strategies. However, it also brings a ray of light for the future, with the promise of the potential of novel testing strategies that can achieve great effectiveness.

The COVID-19 pandemic has posed significant threats to public health globally since 2020. It spreads rapidly around the world and it is hard to control due to a large proportion of pre-symptomatic and asymptomatic cases [40] . Although preventive measures, such as social distancing [50] and lockdowns [19, 51] , have shown to be effective in slowing down the disease spreading, they come at the cost of economic downturn and the risk of a second wave after lifting restrictions. In order to respond to such time-critical emergencies while avoiding strict measures, testing of potential infections via contact data and quarantining of positive cases have been proposed as alternative measures [2, 38] . However, limited testing/tracing resources have imposed challenges on the mass deployment of such measures.

In this work, we tackle an extensive evaluation of different random testing and contact tracing strategies to ascertain their efficiency. To answer this, we first build a simulation tool involving two major components -an extended stochastic epidemic model simulator with tracing and quarantining actions, and a contact network simulator that takes both 0 and the dispersion parameter into account. These allow us to identify the characteristics of each strategy for a large set of parameter combinations. Moreover, by introducing two ideal (despite unreal) strategies, we can propose a * Work done during an internship at the Max Planck Institute for Human Development † Corresponding authors classification of an epidemic (along with the initial time point and testing resources) with levels: 1) can be contained with classic contact tracing (either forward, backward or both), 2) can be contained with contact tracing with priorities, 3) can be contained with smart testing strategies and 4) cannot be contained with testing.

Overall, we find that: our superspreading networks better characterize the actual dispersion effect; backward contact tracing is slightly better at finding positive cases for epidemics with small dispersion parameter with a small number of tests; and that there is both a gap between classic contact tracing and ideal contact tracing and between it and ideal testing, that make exploring smart contact tracing and testing strategies worth it. We also show the limitations of contact tracing as a proxy for the actual epidemic status and propose random testing as a better estimate for such inference. Finally, our models assume full knowledge of the contact network, akin to using a perfect digital tracking app; other modeling decisions attempt at conveying optimal computational conditions for contact tracing and testing.

Compartmental epidemic models, such as the Susceptible-Infected-Recovered (SIR) model [26] , are classical models used for gaining insights into disease transmission dynamics. Such models define several infection states for individuals in a population, and the dynamics of individuals moving between states can be modeled deterministically or stochastically [3] .

Many works have explored epidemic simulations over random networks [28, 43, 44] . The key issues to consider in this context are the stochasticity of epidemic diffusions and the network architecture.

Stochastic simulations of epidemics are conducted with the Gillespie algorithm [28] . The algorithm samples, at each step, from the total rates of all possible events, the next event time and an event type-e.g., an infection event or a recovery event. Every simulation continues until no infectious individuals can be found in the population.

Keeling and Eames [24] provide a review of properties of epidemics over a variety of common network structures, including Erdős-Rényi random networks, lattices, small-world networks, scale-free networks, and spatial networks. Others [27, 39] [2, 36, 56] seek to construct networks from real-world contact data. In this work, we employ random networks with controllable epidemic parameters to analyze the limits of contact tracing and random testing strategies.

The number of secondary infections of individuals has been shown to be heterogeneous, and numerous "superspreading events" have been reported [53] . Lloyd-Smith et al. [37] proposed to model this phenomenon with a negative-binomial distribution parameterized by the basic reproduction number 0 and a dispersion parameter . The dispersion parameter controls the superspreading effect of a given disease, e.g., = 0.16 was estimated for SARS [37] where lower values indicate a stronger superspreading effect. Specifically, modeling efforts on the superspreading effect of the ongoing COVID-19 pandemic have been focused on estimating the dispersion parameter [12] , understanding the causes [46] and evaluating control strategies [23] .

As a tool for containing infectious epidemics, contact tracing is the process of identifying individuals who have been in contact with known infected cases. This process usually involves intensive manual efforts from specialists in interviewing infected patients and reconstructing their potential contacts [18] . Prior work has targeted improving the efficiency and accuracy of this process through communication traces [16] or dedicated digital apps [1, 5] .

Effective contact tracing is considered to be critical in controlling an ongoing epidemic [48, 49] , which successfully contained SARS [35] . However, this tool has experienced failures before, such as the British foot-and-mouth epidemic [17] . Therefore, a rich set of literature has studied the effectiveness of applying contact tracing in controlling epidemics [7, 22, 29] and, especially, in the ongoing COVID-19 pandemic [2, 9, 18, 25, 32] .

The simplest random networks, the Erdős-Rényi (ER) networks [15] , have been shown to reduce the epidemic models to their classical fully mixed variants [24] . Therefore, in the context of epidemics, they can be fully characterized given the number of nodes and the 0 , , and parameters. In the same spirit, we have developed a new method to generate random networks taking into account the superspreading effect. Given the number of nodes , the epidemic model parameters and the dispersion parameter , we can fully define the node degree distributions of such networks, which we call superspreading random networks.

We show that the node degree distribution of a superspreading random network is defined as

where (· | , 0 ( + ) ) is a negative binomial distribution parameterized by the dispersion parameter and mean 0 ( + ) . We then apply the configuration model [45] to generate random networks. The detailed derivation is explained in Methods. Given the assumption of infection independence (see Methods), we seek to ascertain that desired dispersion parameters are achieved and maintained throughout the exponential phase of the epidemic. Fig. 1a shows that this is the case up to = 1 regardless of the recovery and infection rates and that it degrades at lower infection rates and higher recovery rates. In contrast, in ER networks, dispersion parameters cannot be controlled. As shown in Fig. 1b , it is always above 1, and, in general, it increases with higher infection rates and lower recovery rates. Both Figs. 1a and 1b show results for the first 100 infected nodes with 10 initial infected nodes. Similar results for higher numbers of infected nodes are shown in Supplementary Fig. 1 . As it can be seen, using ER networks for an epidemic in which is known to be low (such as COVID-19 or SARS) would lead to a misrepresentation of the distribution of the infections.

Prior works [37, 46] typically model heterogeneous secondary infection numbers with variances in individual infectiousness, i.e., posing a Gamma distribution on infection rates. Supplementary Fig. 2 shows that, for low 0 values and especially for low recovery rates, the difference between the intended dispersion parameter and the achieved dispersion parameter in the first 100 infected nodes is significant and much larger in this case than for the superspreading network.

Finally, we show in Fig. 1c the node degree distribution of superspreading networks for different values of the dispersion parameters, 0 , infection rates, and recovery rates. Other network measures, such as clustering coefficients, are shown in Supplementary Fig. 3 .

See Methods for a complete description of the computational setup used throughout this section.

Positive rates from testing are widely adopted for estimating underlying infection statuses in populations [55] . Therefore, in this part, we evaluate the correlation between positive rates (see Methods) and actual infected numbers on each day. Fig. 2 depicts the averaged daily correlations for SIR model (A similar study is depicted in Supplementary Fig. 4 for the SEIR model).

First, we can see that, in general, for small 0 , small , large and a small number of tests, the correlations are relatively poor, so positive rates from all strategies are misrepresenting the actual infection. These correlations become better as the 0 , and the number of tests increase.

For RT (Fig. 2a) , we observe close proportions between daily confirmed rates and actual infection rates in general. We can conclude that random testing, for a sufficiently large 0 and number of tests, is a good estimator of the actual epidemic progress. Note that correlations always improve as the number of daily tests increases, even though the testing is also affecting the epidemic since positives are being quarantined. For a scenario in which testing does not affect the epidemic ( = 0) see Supplementary Fig. 5a for SIR or Supplementary Fig. 5b for SEIR.

Figs. 2b and 2c show the correlations for forward and backward contact tracing, respectively. We can see that in both cases, positive rates often overestimate the actual infections, especially for low and values. Moreover, BCT overestimates more than FCT, especially when the number of tests is low and the epidemic has a low dispersion parameter. This also means that BCT is better at finding positive contacts under these conditions. Also, we see instances in which CT daily positive rates increase when more daily tests are deployed, whereas actual infections in the population decrease. For example, when 0 = 10, = 0.05, = 0.5 in Fig. 2b, 1 , 000 daily tests lead to higher proportions of positive cases than 100 daily tests, which results in a false estimation of the epidemic.

In conclusion, positive rates from contact tracing can be misleading and should be considered with caution. However, positive rates from random testing with a sufficiently large number of daily tests show more promise as a proxy for the actual epidemic progress (A similar trend can be seen in Supplementary Fig. 4 for the SEIR model).

Here, in order to classify different epidemics, we introduce two novel strategies that represent both the ceiling of contact tracing and the ceiling of testing. Fig. 3 shows a cartoon representation of the effect of these strategies on a toy infection network and their differences with both random testing and contact tracing. We call these ideal strategies oracles, and they are defined in Methods.

When comparing average total infections, Fig. 4a shows that, for the networks simulated with = 0.1, around 20% of the population may be infected in the worst scenario without any intervention and less than 10% of population when 0 < 2.5. On the other hand, when intervention operations are applied, epidemics with higher and 0 are more difficult to contain. Only GOT contains epidemics within a small number of daily tests, which becomes more difficult as 0 grows. COT performs slightly better than the other contact tracing strategies (forward and backward). However, BCT does not show obvious advantages over FCT. In general, contact tracing does not seem to have a considerable impact in the course of the epidemic, and much less so random testing. Also, note that all epidemics result in much larger infected populations when simulations occur over an ER network. For SEIR models in Supplementary Fig. 6 , it can also be found that incubation periods (when individuals are not contagious but they are detectable) result in more containable epidemics when compared to the same parameter sets in SIR models.

The nature of the superspreading events when is low renders a lower portion of the population infected at the end of the epidemic (Figure 3a ). However, since superspreading events will have more impact in denser communities (connected components in the network), especially in those where initially infected individuals were found, we explore the average percentage of the infected population in the top 5 largest communities in the network. Figure 3b shows that this has no effect for large values or in ER networks but shows noticeable differences, especially for = 0.1. Note that, for example, 0 = 2.5 and = 0.05 with no interventions reaches only 10 to 15 percent of infected individuals at the end of the epidemic, but around 80 percent within the densest communities.

From Fig. 4c and Supplementary Fig. 6c , we see that high and small lead to longer epidemic diffusion times, peaking at 0 = 2.5.

Here we see an opposite trend to Fig. 4a with longer times for smaller 0 and lower values. Also, contact tracing seems to have a greater impact on times than on the total infected population. When comparing SEIR models to SIR models, longer days are observed overall due to the incubation periods.

Finally, Fig. 4 helps us classify parameter sets in terms of how containable they are. Thus, we can see that epidemics with 0 = 1 can be contained even without any interventions; however, contact tracing does have an impact in the densest communities and in the final time of the epidemic. For = 0.1 tracing interventions are capable of partially containing the epidemics with a sufficient number of resources. For larger values and as 0 increases, epidemics are not containable with RT or any CT strategies. Only the global oracle, equipped with large numbers of daily tests, can have an impact. Lastly, for extreme 0 values (10 and beyond) and large values (above 1 and for ER networks), there is no intervention that can contain the epidemic.

Many countries have established daily risk level thresholds that are accompanied by different measures. For illustration purposes, we choose the levels defined by the Spanish government (see Methods). Fig. 5 depicts the maximum threat level achieved during an epidemic for each of the epidemic parameter values, the number of daily tests, and different contact tracing/testing strategies. Fig. 5a shows the actual levels achieved. As it can be seen, for almost all cases, the maximum threat level is reached at some point in time during the epidemic (except for low 0 values). Fig. 5b shows the maximum threat level when considering all positive tests as the actual total positives in the population. This inference consistently underestimates the maximum threat level. On the other hand, Fig. 5c shows the maximum threat levels if those are inferred from the positive rates found during testing. As expected from our correlations section results, this scenario consistently overestimates threat levels (except RT).

Finally, we constructed a mixed strategy in which 100 daily tests were devoted to random testing regardless of the strategy (for 10 daily tests we devoted 5 to random testing and for 100 daily tests 50 were devoted to random testing). Fig. 5d shows the maximum threat levels reached with this new strategy when only positive rates from random testing are used. Although far from perfect, we can see that this strategy shows maximum threat levels very similar to the actual ones shown in Fig. 5a .

Note that this mixed strategy could potentially result in different infection levels, since not all tests are devoted to the given strategy. However, Supplementary Fig. 7 shows that the differences between this strategy and the non-mixed ones (compare with Fig. 4 ) are negligible.

As an example and proof of concept, we seek to quantify the effect of intervention strategies on several known diseases with their reported epidemic parameters, including Measles [6, 47, 54] , H1N1 [11, 20, 31] , Ebola [4, 34] , SARS [10] and COVID-19 [2, 12] . Their detailed parameters are listed in Supplementary Table 1. In our simulations, we fix the population number to = 100, 000 and 0 = 10 initial infections. Again, we vary the number of daily tests from 0 to 10, 000. Supplementary Fig. 8 shows the same results for the cumulative cases.

In Figs. 6a to 6d, overall, the operations with oracle strategies contain epidemics better, while random testing requires much more daily tests to control the epidemic spreading. In particular, both contact tracing operations (forward and backward) lead to similar average daily infection curves. When comparing different diseases, SARS (Fig. 6c) is the most containable epidemic, while Measles cannot be controlled due to its extreme 0 value and its infection processes in all simulations finished shortly compared to other epidemics as shown in Fig. 6e . These two are the extreme points where an epidemic can be controlled with CT and where it cannot be controlled even with GOT, respectively. For the rest, H1N1 (Fig. 6c ) can be contained with relatively few resources, and Ebola (Fig. 6b) and Covid-19 might need one order of magnitude more to reach containment. Note that SEIR models are easier to contain under the assumption that E individuals are not contagious, but they will be tested positive. Finally, note that the global oracle can control all epidemics with few resources, except for Measles.

Supplementary Fig. 9 and Supplementary Fig. 10 show the same results when we disregard the dispersion parameter and run the simulations over ER networks, both for the epidemic curve and cumulative cases, respectively. It can be seen that final infections are much higher than those found in superspreading networks with low dispersion parameters.

Of relevance, with the parameters for COVID-19 from [2, 12] , only around 10% of the population is infected. A reasonable amount of daily tests can reduce this proportion to around 2 − 5%. This is mainly due to the low dispersion parameter and the static nature of our network simulations (and no re-introductions). We believe this is in accordance with what most countries are suffering nowadays in terms of infected population per wave [30, 33, 52] .

Finally, Supplementary Fig. 11 shows the daily threat levels for each of the epidemics. Note that the maximum threat level is reached at some point during the epidemic in all cases (even when the final proportion of infected individuals is low -see Supplementary Fig. 8 ). As expected, inferring threat levels from positives found in tests underestimates the actual infections, whereas inferring threat levels from positive rates from all tests overestimate them. In general, inferring threat levels from positives found only in random testing (mixed strategy) is a much closer estimate to the actual levels.

In this paper, we study the computational bounds of contact tracing and random testing. We introduce random networks that take into account the superspreading effect with controllable dispersion parameters for the first time. We first find that backward contact tracing is slightly better than forward contact tracing for low dispersion parameters and a small limit of daily tests. We then find the limitation of contact tracing as a means to describing the actual epidemic status. Afterward, we provide a classification of epidemics in terms of how containable they are, in which we find that there is a gap between classic contact tracing and optimal contact tracing, and between this an optimal testing. The implications are exploring both smart contact tracing and smart testing techniques is worth it. We also see that the length of the epidemic can be misleading and that contact tracing also has an impact on it.

Finally, we want to address the fact that other recent works on contact tracing models [2, 18, 32] for COVID-19 show more optimistic results than what transpires from our study regarding the impact of contact tracing. We believe three main factors contribute to this difference: (1) The superspreading effects controlled by the dispersion parameter are not explicitly considered in these works, while our study is done over networks where we control the dispersion parameter for the simulations. (2) Testing and/or tracing resources are unlimited for these references, while we have analyzed various levels of available testing or tracing resources up to 10% of the total population of daily tests. (3) Lastly, most of these papers ( [2, 18] ) consider between 35% and 70% of infected individuals as triggers of contact tracing, while for this work, only hospitalized individuals (which is 5% of infected individuals throughout the main text) and positives found via random testing or contact tracing are considered.

For illustration purposes, we turn our focus to the exhaustive work carried out in [2] to try to ascertain objectively whether these three issues can explain the differences observed. First, we see that, when we run the epidemic diffusion process as stated in [2] (see Methods) over the network provided in https://github.com/ aaleta/NHB_COVID, the resulting dispersion parameter is = 2.5, which is higher than the estimated value of 0.1 ( [12] ). Second, again with the same parameters from [2] , Supplementary Fig. 12a shows the number of tests performed each day. It can be seen that the testing peak goes over 500 tests (for 0 = 1), which represents over 5% of the total population of daily tests to achieve a final 91% of the infected population. Finally, Supplementary Fig. 12b brings [2] framework closer to our own by limiting the number of daily tests and setting contact tracing trigger cases to hospitalizations (with a probability of 2%). Here we can see how the impact of contact tracing is drastically reduced.

In conclusion, our study provides a cautionary tale of contact tracing. It also suggests a separation between tracing as a means to containing an epidemic and testing as a means of inferring the progress of the epidemic. Moreover, it highlights the need for more intelligent testing strategies in order to contain most of the possible (future) epidemics. 4 

Bartlett and Plank [8] reveal the analytical connection between the epidemic parameters and network parameters for Erdős-Rényi random networks [14] , i.e.,

where a network is parameterized by the average node degree . When → ∞, the node degrees of these networks are Poisson distributed with a mean value 0 ( + ) .

In this section, we describe the algorithm for simulating random contact networks that lead to negative binomial distributed secondary infections with given parameters.

To generate a random network, we first seek to derive its node degree distribution. We provide here some definitions including several probability generating functions (PGFs) following [44] :

• Given a random network with its degree distribution P( = ) = , its PGF is 0 ( ) = ∞ =1 and the average degree

• The excess edge of a vertex is defined as the number of remaining edges connected to the vertex when follow a random edge to that vertex. This corresponds the remaining neighbor counts when disease diffuses to a new individual. The probability that a vertex at the end of a random edge has excess degree − 1 is given in [44] as < > . Therefore, the PGF for the excess degree of a vertex is

• More importantly, given the probability that a infected individual infects his/her neighbor , the PGF of the number of infected neighbors is

Similarly, the PGF for the excess occupied degree is ′ 1 ( ) = 1 (1 + ( − 1) ) We are then able to derive the degree distribution. The assumption of a negative binomial (NB) distributed secondary infections indicates that ′ 1 ( ) is also the PGF of an NB distribution. Given the average secondary infection 0 and the dispersion parameter , we have

where the second step is due to the binomial theorem. With a change of variable, we get

where 1 ( ) is simply another NB with mean 0 and dispersion parameter . We denote its probability mass function as −1 which, as mentioned above, is equal to < > = −1 . Due to ∞ =1 = 1, we are able to compute the average node degree

which can be easily solved numerically. We then need to define the infection probability given the infection rate and recovery rate. For an infected individual, assuming his/her recovery follows a rate and he/she is infecting a neighbor with a rate , we can easily derive the probability of the neighbor being infected as + . If we further introduce a relaxed assumption that all neighbors are infected i.i.d., we then have = + .

Given the derived degree distributions, we can then simulate a random network by applying a configuration model [45] . We note that any self-loops or parallel edges are removed from the generated networks.

Contact networks simulated via the method described in Methods is based on an assumption that one infectious individual infects his/her neighbors independently. However, this relaxed assumption may lead to errors between chosen dispersion parameters and their empirical values in simulations. Here we explore such discrepancies via simulations.

Nelakonda and Rhomberg [42] show that clustering coefficients of the configuration model are defined as:

where is the number of nodes. This indicates that the clustering coefficients are 0 in the limit of large networks.

Here we detail the main concepts of our experimental setup. As it will be seen, all modeling decisions are geared towards a more favorable scenario for the impact of contact tracing rather than towards a more realistic one. Therefore, the following results are best-case scenarios for all strategies. The main modeling considerations are as follows:

• Compartmental models. We simulate outbreaks with two epidemic models, SIR and SEIR. The SIR model assigns three possible infection statuses to individual nodes, susceptible ( ), infected ( ), and recovered ( ), whereas the SEIR model further introduces the exposed ( ) for modeling incubation periods of diseases. While most nodes are in the susceptible status ( ), 0 number of nodes are initialized as infectious nodes ( ) at = 0, who spread the disease to their neighbor nodes at a rate . The dynamics of infectious periods (i.e., from to ) and incubation periods (i.e., from to ) of individual nodes are defined by rates and , respectively.

Besides, we introduce a hospitalized ( ) compartment in Experimental setup section to model the possibility of infected individuals disclosing their statuses via hospitalization. The rate of to is , and it can be directly calculated from the probability of hospitalization . Moreover, the basic reproduction number, 0 , is an important epidemic quantity that defines the average secondary infections caused by a single infected individual. Throughout the paper, results for SIR models are shown in the main text figures, while those for SEIR are depicted in supplementary figures. The main decision about the SEIR model is that individuals who are in incubation periods cannot infect but can be detected as positive in a test. Both models are extended with a Hospitalization rate, where some infected individuals become hospitalized. Hospitalized individuals are considered as automatic positives who trigger contact tracing.

• Tracing and testing setups. In our experiments, we explore the limits of contact tracing and random testing in containing epidemics. Given a fixed number of tests per day, these strategies are applied daily in epidemic simulations, and different individuals are then tested for their infection statuses. We assume the test results are provided immediately without false positives and false negatives. We disregard the number of tracers and assume instead the limiting resources are the daily tests. Calculating the number of tracers to daily tests is not discussed in this work. • Random testing (RT). This operation devotes resources to randomized testing within a population. Random tests can be performed in real-life scenarios by contacting random people and asking them to take a test or by announcing voluntary tests at specific locations. Here, we assume all infection statuses will receive an equal probability of selection for tests and also that recovered individuals can be chosen for such random tests (only individuals that have tested positive and still have not recovered are not selected). These two decisions together allow us for a unique implementation that accounts for both real-life scenarios at the same time. • Contact tracing (CT). During simulations, we maintain a queue of individuals to be tested and traced. Neighbors of positive cases, found by contact tracing, random testing, or hospitalization, are recorded in the queue. The probability of discovering a contact of an infected node is denoted as and is set to 1 (akin to using a perfect contact tracing app [1] ). We note that, when the queue becomes empty, all remaining testing resources are devoted to random testing, so that all tests are performed every day. Two possible strategies arise when individuals in the queue are prioritized differently: Forward contact tracing (FCT) orders the tests for individuals according to the times when they are added to the queue. On the other hand, backward contact tracing (BCT) prioritizes the tests to neighbors of positive cases, up to one hop in networks. This latter approach has been proposed for epidemics with low dispersion parameter [13, 21, 41] , aiming to find sources of the infections.

• Quarantine. We assume a scenario where nodes that have tested positive are quarantined-i.e., removed from the contact network-with a probability that we denote as . This probability is set to 1. • Parameter exploration. We quantify the effect of different intervention operations on a wide range of possible epidemics by varying all epidemic parameters. We also explore the space of initial infected (with no re-introductions) and the number of daily tests. A complete 

Since we cannot select (for testing) individuals that have already tested positive but are not yet recovered (which we call below), the positive rates are calculated as follows:

The contact tracing oracle (CTO) prioritizes individuals in the testing queue so that the actual infected ones are visited (tested) first. This provides an upper bound on the best contact tracing strategy.

The global oracle testing (GOT) assumes the availability of information about all newly infected individuals who are then tested and quarantined. Thus, the contact tracing queue is filled only with infected individuals. This assumption leads to an upper bound for future smart testing strategies.

We note that, although these two oracle scenarios are unrealistic in practice, they provide ideal upper-bounds for potential new strategies that might arise in the future.

We use the Spanish Government threat levels, which are based on the total infection number, in the last 14 days, per 100, 000 individuals. These levels are the following: (1) less than 25 infected, (2) between 25 and 50, (3) between 50 and 150, (4) between 150 and 250 and (5) over 250 infected individuals per 100, 000 individuals. In practice, these levels also take into account positive rates and ICU occupancy, however, we ignore these here for the sake of simplicity.

We apply the network provided in https://github.com/aaleta/NHB_ COVID by Aleta et al. [2] . The provided network is unweighted and has 10, 000 nodes and an average node degree of 10. In terms of the simulations, we implement the epidemic parameter sets following the statistics detailed in [2] where a 50% probability of the discovery of symptomatic infections is applied and 40% of contacts are successfully traced. The only two differences with [2] are: testing and quarantining are performed immediately (whenever tests are available) and no household contacts are automatically quarantined (since these are unknown in the network provided).

The simulation algorithm was implemented in Python based on the sample code provided by Kiss et al. [28] 1 . We extended the simulation with multiple containment strategies described in this paper. We use the R programming language for batch running the simulations by invoking the Python modules and for analyzing and plotting the simulation results. Code availability The source code is available at https://github. com/qykong/testing-strategies. The simulation code and all parameters used for related works can be found in the repository. ) shows a network where some nodes are shaded indicating that they are infected, and some of those include a cross symbol, indicating they are hospitalized, i.e., they are triggers of contact tracing. This network is a representation of a snapshot of the status of an epidemic in a given moment in time. (b) shows the four strategies considered for the toy network in (a): Random testing -nodes are selected at random, the resulting list of nodes to test contains 1 infected individual that will test positive and be quarantined. Contact tracing -contacts of all hospitalized nodes are included in the list in no particular order, we see that 3 infected individuals are in this list, although some non-infected nodes will be tested before them. Oracle tracing -this is an ideal contact tracing strategy, where we assume there is an oracle to tell us beforehand who in the testing queue is infected and we prioritize testing these nodes. Global oracle -in this ideal strategy we assume an oracle who tells us the actual infection status of the whole network so we only add infected nodes to the testing queue. Note that the differences between forward and backward contact tracing cannot be shown in a single day snapshot of the epidemic. Note that the number of positive nodes found for each strategy is dependent on the number of tests available per day. In this example, if the test limit were 6, the positives found would be 1,3,3 and 5, respectively; if the limit were 5, the positives found would be 1,2,3 and 5, respectively. are discovered (Our setup). Following [2] , only 40% of the contacts are discovered during contact tracing. The forward tracing strategy is considered in both figures, backward contact tracing shows identical results. 27 Supplementary Tables Table 1: Parameter used for 

A survey of covid-19 contact tracing apps

Modelling the impact of testing, contact tracing and household quarantine on second waves of COVID-19

An Introduction to Stochastic Epidemic Models

Ebola superspreading. The Lancet Infectious Diseases

Digital contact tracing technologies in epidemics: a rapid review

Measles elimination efforts and 2008-2011 outbreak

Contact tracing to control infectious disease: when enough is enough

Epidemic dynamics on random and scale-free networks

The past, present and future of digital contact tracing

SARS outbreaks in Ontario, Hong Kong and Singapore: the role of diagnosis and isolation as a control mechanism

A new approach to characterising infectious disease transmission dynamics from sentinel surveillance: application to the Italian 2009-2010 A/H1N1 influenza pandemic

Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China

Implication of backward contact tracing in the presence of overdispersed transmission in COVID-19 outbreak

On random graphs I

On the evolution of random graphs

Epidemic contact tracing via communication traces

Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain

Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing

Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe

Estimation of the Basic Reproduction Number of Novel Influenza A (H1N1) pdm09 in Elementary Schools Using the SIR Model

Gonorrhea transmission dynamics and control

Contact tracing and epidemics control in social networks

Chopping the tail: how preventing superspreading can help to maintain COVID-19 control

Networks and epidemic models

The Efficacy of Contact Tracing for the Containment of the 2019 Novel Coronavirus

A contribution to the mathematical theory of epidemics

Disease contact tracing in random and clustered networks

Mathematics of epidemics on networks

The effectiveness of contact tracing in emerging epidemics

Report 41: The 2020 SARS-CoV-2 epidemic in England: key epidemiological drivers and impact of interventions

Epidemiological and clinical characteristics of influenza A (H1N1) v infection in children: the first 45 cases in Cyprus

Impact of delays on effectiveness of contact tracing strategies for COVID-19: a modelling study

The first wave of COVID-19 in Israel-Initial analysis of publicly available data

Statistical inference in a stochastic epidemic SEIR model with control intervention: Ebola as a case study

Transmission dynamics and control of severe acute respiratory syndrome

Measurability of the epidemic reproduction number in data-driven contact networks

Superspreading and the effect of individual variation on disease emergence

Quantifying the Effects of Contact Tracing, Testing, and Containment

Percolation and epidemics in random clustered networks

Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship

Contact tracing in stochastic and deterministic epidemic models

Lecture Notes: Social Networks: Models, Algorithms, and Applications

Spread of epidemic disease on networks

The structure and function of complex networks

COVID-19 superspreading suggests mitigation by social network modulation

Assessing the transmission dynamics of measles in Japan

Contact tracing in the context of COVID-19

Contact tracing during coronavirus disease outbreak

The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study

When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes

Final round of coronavirus study confirms that 5.2of Spanish population has antibodies

Superspreading sars events

Theoretical examination of the pulse vaccination policy in the SIR epidemic model

Substantial underestimation of SARS-CoV-2 infection in the United States

Modeling Epidemics Spreading on Social Contact Networks