Correlation Improves Group Testing
Jiayue Wan, Yujia Zhang, Peter I. Frazier
2021-11-15

Abstract: Population-wide screening to identify and isolate infectious individuals is a powerful tool for controlling COVID-19 and other infectious diseases. Testing an entire population, however, requires significant resources. Group testing can enable large-scale screening, but dilution degrades its sensitivity, reducing its effectiveness as an infection control measure. Analysis of this tradeoff typically assumes pooled samples are independent. Building on recent empirical results in the literature, we argue that this assumption significantly underestimates group testing's true benefits. Indeed, placing samples from a social group into the same pool correlates a pool's samples. Hence, a positive pool likely contains multiple positive samples, increasing a pooled test's sensitivity and also tending to reduce the number of pools requiring follow-up testing. We prove that under a general correlation structure, pooling correlated samples together (called correlated pooling) achieves higher sensitivity and requires fewer tests per positive identified compared to independently pooling the samples (called naive pooling) using the same pool size within the classic two-stage Dorfman procedure. To the best of our knowledge, our work is the first to theoretically characterize correlation's effect on sensitivity and test usage under models of general correlation structure and realistic test errors. Under a 1% starting prevalence, simulation results estimate that correlated pooling requires 12.9% fewer tests than naive pooling to achieve infection control.
Thus, we argue that correlation is an important consideration for policy-makers designing infection control interventions: it makes screening more attractive for infection control and it suggests that sample collection should maximize correlation. The SARS-CoV-2 virus has killed over 5 million people while causing enormous economic losses. Large-scale screening has proven effective in curbing the virus's spread (Mercer and Salit 2021, Xing et al. 2020, Barak et al. 2021) through promptly identifying and isolating infected individuals and their contacts (Cleary et al. 2021). Nevertheless, screening the entire population requires a massive amount of resources. Diagnosis of SARS-CoV-2 infection is commonly performed using polymerase chain reaction (PCR) tests, which require chemical reagents, machine time, and trained medical personnel. The scale and complexity of this demand make population-wide screening hard, if not infeasible, for many countries (GRID COVID-19 Study Group 2020). A promising solution to this conundrum is group testing. The Dorfman procedure, the first group testing protocol, proposed in 1943 to screen enlisted soldiers for syphilis (Dorfman 1943), pools multiple samples together and tests each pool using a single test. Samples from a pool testing negative are cleared, while samples from a pool testing positive receive individual follow-up tests. Especially in low-prevalence settings, group testing can save significant test resources compared to individual testing (Kim et al. 2007). Group testing has been successfully implemented for large-scale screening in multiple communities worldwide and yielded promising results in controlling the spread of SARS-CoV-2. In May 2020, the city of Wuhan employed pooled testing to screen nine million people over ten days (Fan 2020). Furthermore, repeated screening of the entire population helps to achieve persistent infection control.
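To illustrate the resource savings that motivate the Dorfman procedure, the expected test consumption under independent samples and an error-free pooled test can be sketched as follows (a simplified textbook model, not this paper's calibrated one):

```python
def dorfman_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person for the two-stage Dorfman procedure,
    assuming independent samples and a perfectly accurate pooled test."""
    # A pool tests positive iff it contains at least one positive sample.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # One pooled test shared by the whole pool, plus one follow-up test
    # per member whenever the pool tests positive.
    return 1.0 / pool_size + p_pool_positive
```

At 1% prevalence with pools of size 10, this gives roughly 0.2 tests per person, about a five-fold saving over individual testing; at high prevalence the saving disappears, which is why pooling is most attractive in low-prevalence screening.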
In fall 2020, Cornell University conducted repeated surveillance testing on 5-7K Cornell students and employees per day using pools of size five and reopened its campus safely (Lefkowitz 2020). Going forward, repeated community-based screening may be necessary for long-term containment of COVID-19 under low prevalence (Barak et al. 2021). Repeated large-scale screening using group testing provides the most value when the sensitivity, i.e., the probability of correctly identifying positive samples, and efficiency, i.e., the number of individuals screened per PCR test, are high. High sensitivity helps identify the positives accurately, while high efficiency permits more frequent screening under limited resources. Together, they enable early identification and isolation of positives as well as quarantine of close contacts, which prevents further disease spread and contributes to better infection control. However, pooled tests face a tradeoff between sensitivity and efficiency due to the dilution effect. The concentration of virus particles (called the viral load) in the sample of an infected individual is diluted when it is pooled with negative samples (Wein and Zenios 1996). The dilution effect lowers the overall viral load in the pooled sample and hence the sensitivity. For larger pools, the decrease in sensitivity due to dilution is stronger. On the other hand, reducing the pool size to avoid sensitivity degradation often results in a lower efficiency, which reduces the benefit of group testing in resource consumption. Past analyses of this tradeoff, such as Kim et al. (2007) and Westreich et al. (2008), assume that samples in the same pool are independent. In practice, however, human behavior and logistical constraints in sample collection naturally create correlation. Specifically, if a person is infected, this increases the likelihood that others in their immediate social circles are infected (Vang et al. 2021, Rader et al. 2020, Lan et al. 2020).
Most notably, the transmission probability among people in the same household has been estimated in a meta-study to be 16.6% (Madewell et al. 2020) and in some literature as high as 44.6% (Boscolo-Rizzo et al. 2020). Moreover, in large-scale screening, those frequenting the same testing center usually live, work, or socialize close to each other. As a result, members of the same social group are often placed into the same pool (Barak et al. 2021). Intuitively, correlation should cause group testing to be more sensitive than it would be if samples were independent. Under correlation, a pool containing a positive sample is more likely to contain multiple positive samples. This increases the viral load in the pooled sample and improves the overall sensitivity. Recently, Barak et al. (2021) finds that the sensitivity of group testing performed in Israel was higher than independent sampling would suggest and conjectures that correlation was the cause. Analysis and simulation results in Comess et al. (2021) point to the same intuition, though it assumes a higher prevalence in correlated pools due to additional network transmission. Intuition also suggests that correlation should improve the efficiency of the Dorfman procedure by concentrating positives in fewer pools. Thus, a smaller number of pools should test positive, requiring fewer follow-up tests. This has been observed in simulation studies (Lendle et al. 2012, Deckert et al. 2020), though current understanding is limited as recent theoretical analysis (Augenblick et al. 2020, Lin et al. 2020) assumes that tests are error-free, while testing errors in practice have an important effect on test utilization in group testing. Basso et al. (2021) allows for testing errors, but assumes a fixed sensitivity that does not depend on the pool size or the number of positives in the pool, thus ignoring the dilution effect and the effect of correlation on sensitivity.
Thus, while the benefits of correlation for both sensitivity and efficiency are important, current theoretical understanding is limited. In this paper, we address this gap in the literature. We prove that under a general correlation structure in the population and other mild assumptions, pooling correlated samples together (called correlated pooling) in the two-stage Dorfman procedure achieves higher sensitivity compared to independently pooling the samples (called naive pooling) using the same pool size. We also prove that correlated pooling uses fewer tests per positive identified. Notably, we show an example in which correlated pooling has strictly lower efficiency than naive pooling, where we recall that efficiency is the number of people screened per test. Because correlated pooling has higher sensitivity, some pools with positive samples test positive under correlated pooling but negative under naive pooling. This effect can dominate the reduction in the number of positive-containing pools caused by correlation. As a result, when the two effects are combined, correlated pooling can have more pools testing positive than naive pooling and thus require more tests. This stands in contrast to the theoretical results in the literature on efficiency (Augenblick et al. 2020, Lin et al. 2020, Basso et al. 2021), which find that correlation always improves efficiency. This discrepancy in qualitative conclusions follows in part from the literature's focus on a limited set of models for testing error in which correlation does not improve sensitivity. We argue that tests per positive identified better quantifies a procedure's utility for screening than efficiency, and so correlated pooling remains attractive for infection control despite this example. Our results on sensitivity and test utilization (efficiency and tests per positive found) assume that there are no false positives.
While false positives do occur, we argue that they have little impact on sensitivity and test utilization in practice, since the specificity, the probability of correctly declaring a negative sample as negative, is typically quite high in reality (e.g., Public Health Ontario 2020 finds a PCR specificity of 99.99%), making false positives quite rare. Specificity, however, is of independent interest when screening many people, since even a specificity as high as 99.99% will create false positives once enough people are tested. False positives waste public health resources, cause economic losses, disrupt personal lives, and increase the risk of infection during treatment (Gupta and Malina 1999, Healy et al. 2021). To address this, we briefly analyze the specificity of correlated pooling implied by a generalization of our main model allowing for false positives. We show that correlated pooling improves specificity compared to both naive pooling and individual testing. For a typical prevalence, pool size, and a specificity of 99.99% for a single PCR test, we estimate the specificity of the Dorfman procedure with correlated pooling to be at least 99.999%. This is a ten-fold improvement in the false-positive rate (1 − specificity) compared to individual testing, from 10^-4 to 10^-5. As a consequence of these insights, we argue that group testing is significantly more useful for bringing an epidemic under control than is argued in classical analyses. We consider a use case where group testing is applied to large-scale screening and positive individuals are isolated once identified. We think of achieving epidemic control as a situation where the number of active infections in a population stabilizes or declines. In our case study focused on intra-household correlation, pooling samples from the same household together results in higher sensitivity and efficiency, both contributing to better epidemic control.
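The ten-fold improvement in the false-positive rate described above can be reproduced with a back-of-the-envelope calculation. The sketch below assumes independent samples (naive pooling) and that a pool containing a true positive always tests positive, both simplifications for illustration; correlated pooling concentrates positives in fewer pools, so its specificity is at least as good:

```python
def dorfman_fpr(prevalence: float, pool_size: int, test_fpr: float) -> float:
    """Approximate false-positive rate for a negative sample under the
    two-stage Dorfman procedure (rough sketch, independent samples)."""
    # Probability that the rest of the pool contains a true positive.
    p_rest_positive = 1.0 - (1.0 - prevalence) ** (pool_size - 1)
    # The pool can test positive truly (it contains a positive) or falsely
    # (an all-negative pool triggers a pool-level false positive).
    p_pool_positive = p_rest_positive + test_fpr
    # A negative sample is misclassified only if its pool tests positive
    # AND its individual follow-up test is also falsely positive.
    return p_pool_positive * test_fpr

fpr = dorfman_fpr(0.01, 10, 1e-4)  # about 9e-6, versus 1e-4 individually
```

The key mechanism is that a negative sample must fail two tests, not one, to be misreported: the pool must test positive and the follow-up must be a false positive.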
For example, at a representative prevalence of 1%, naive pooling has a sensitivity of 77.6% and an efficiency of 5.87, while correlated pooling has a substantially higher sensitivity of 82.6% and efficiency of 6.33 (after tuning the pool sizes separately for each pooling strategy). Because of these improvements, correlated pooling requires 12.9% fewer tests than naive pooling to achieve epidemic control. This difference has the potential to have a substantial impact on real-world policy-making. As discussed earlier, within-pool correlation arises naturally in repeated large-scale screening. Consequently, policy-makers that base their decisions on the independence assumption may undervalue group testing and adopt overly conservative policies (e.g., imposing a full lockdown instead of using pooled testing to control viral spread). On the other hand, if they do account for within-pool correlation, they may conduct large-scale group testing that fully utilizes available resources while keeping the economy open. Furthermore, correlation that occurs naturally can be augmented in implementation by encouraging people from the same social group to get tested together. To summarize, our contributions in this paper are:
• We formulate a general model of correlation in pools, derived from an asymptotic analysis of a more general population-level model of infections spreading across a population and how pools are formed from members of the population.
• We prove that under the general model and other mild assumptions, using correlated pooling in the two-stage Dorfman procedure achieves (1) higher sensitivity and (2) fewer tests per positive identified, compared to naive pooling. Our work is the first to study sensitivity or efficiency theoretically under a general correlation model.
• We provide a counterexample to the claim that correlated pooling always improves efficiency, i.e., the number of individuals screened per PCR test, clarifying that claims in the literature that correlated pooling always improves efficiency do not necessarily apply outside of the limited class of previously considered models.
• We conduct a case study with realistic data showing that correlation significantly improves the effectiveness of group testing for epidemic control.
As a consequence of these insights, we argue that classical analysis assuming independence underrepresents the power of group testing for infection control. Moreover, group testing should be more widely used in large-scale screening and implemented in a way that maximizes correlation.
The rest of this paper is organized as follows: Section 2 reviews related work in more detail. Section 3 establishes a mathematical model for a single pool and proves our main theoretical results: that correlation improves sensitivity and tests per positive identified, but can reduce efficiency. Section 4 supplements Section 3 by deriving our model of correlation in a single pool from an asymptotic analysis of a model for a larger population, showing that our pool-level model is well-justified in a population-wide screening context. Section 5 presents realistic models for viral loads and PCR sensitivity and shows that under these specific models correlated pooling has better efficiency than naive pooling. Section 6 performs a case study where within-pool correlation is induced by household transmission. Section 7 concludes the paper and discusses future research.
Group testing was proposed by Dorfman (1943) to screen enlisted soldiers for syphilis during World War II. The Dorfman procedure combines multiple samples together and tests the pooled samples, so that samples in a pool testing negative are cleared and samples in a pool testing positive are tested individually for identification.
This significantly increases efficiency by screening multiple individuals with a single test. Since then, many generalizations of the Dorfman procedure have been developed and studied theoretically. Group testing is also widely applied in the surveillance and control of infectious diseases. For a review of recent theory and applications, see Kim et al. (2007) and Aprahamian et al. (2019). The COVID-19 pandemic has raised further interest in applying group testing to infection control and population-wide surveillance (Mercer and Salit 2021). Multiple studies have used modeling and simulation to explore the value that group testing offers in scaling up testing, such as Cleary et al. (2021), Pilcher et al. (2020), Eberhardt et al. (2020), and Mutesa et al. (2020). A thorough review of the literature in using group testing for COVID-19 mitigation is provided by Yu et al. (2021). Despite the extent to which group testing can increase testing capacity, it may have a negative impact on sensitivity. First, the inherent error rate of individual assays, along with the design of the pooling protocol, may lead to failure to detect some positive samples (Graff and Roeloffs 1972, Kim et al. 2007, Westreich et al. 2008). Moreover, sensitivity may decrease due to the dilution effect, that is, a pool dominated by negative samples may test negative, causing its positive members to be missed. The dilution effect was first modeled by Hwang (1976) and subsequently explored by Wein and Zenios (1996), Zenios and Wein (1998), Weusten et al. (2002), and Nguyen et al. (2019) for HIV detection and by Hung and Swallow (1999) for prevalence estimation. Many studies have assessed the dilution effect in SARS-CoV-2 tests from both mathematical and empirical perspectives. Pilcher et al.
(2020) assumes a temporal viral load progression in infected individuals, which, together with the detection limit of PCR tests, defines a "window of detection"; under this setting, pooling is equivalent to raising the detection limit of the test and shortening the effective window of detection. Other work proposes a similar quantification of the decrease in sensitivity due to dilution, based on a mathematical model for PCR. Some experimental studies (Yelin et al. 2020, Lohse et al. 2020) provide evidence that pooling up to around 30 samples does not result in a loss of sensitivity, while Bateman et al. (2020) observes an increasing deterioration of sensitivity in pooling 5, 10, and 50 samples. Most of the aforementioned literature assumes that the infection statuses of the samples within a pool, whether binary or not, are independent from each other. However, as we described in the introduction, correlation between samples is often present in reality and can potentially be leveraged to our advantage to combat the dilution effect. One important cause of correlation is transmission within households. The secondary attack rate (SAR), i.e., the probability that an infectious person in a household infects another given household member, is significant for many infectious diseases (Carcione et al. 2011, Whalen et al. 2011, Odaira et al. 2009, Meningococcal Disease Surveillance Group 1976, Glynn et al. 2018). For SARS-CoV-2, a meta-analysis (Madewell et al. 2020) of 40 studies finds an average SAR of 16.6% and a 95% confidence interval of 14.0%-19.3%. Beyond household transmission, correlation in infection statuses among members of the same social group has also been observed among college students belonging to the same fraternity or sorority (Vang et al. 2021), people living in the same neighborhood (Rader et al. 2020), and co-workers (Lan et al. 2020). Relatively little research explores group testing of correlated samples. Barak et al.
(2021), a large-scale observational study, observes that individuals in the same social groups are often pooled together in large-scale screening. It discovers that samples with low viral load, each of which would have likely been missed if it were the only positive sample in the pool, were detected when pooled with high-viral-load samples. This led to higher sensitivity than would have occurred with pools containing independently infected samples. Augenblick et al. (2020) uses a simple example with pairwise correlation and perfect test accuracy to illustrate reduced test consumption in the Dorfman procedure. Lendle et al. (2012) uses simulations to show that correlation improves the efficiency of hierarchical and matrix-based group testing. Deckert et al. (2020) uses simulation to show that pooling individuals with similar prevalence levels reduces costs. Building on these observations, Lin et al. (2020) models sample collection at a testing site as a regenerative process and calculates the cost efficiency of different testing protocols, while assuming perfect test accuracy. Basso et al. (2021) models a constant pairwise correlation in infections using a Beta-Binomial distribution for the number of positives in a pool, and demonstrates that this correlation improves efficiency. The paper does not study the effect of correlation on sensitivity: it assumes a fixed sensitivity for pooled tests and does not model the effect of pool size or correlation on sensitivity. The closest paper in the literature to ours is Comess et al. (2021), which is qualitatively motivated by similar considerations but makes theoretical contributions that are different in nature. There are two major distinctions between our work and Comess et al. (2021). First, Comess et al.
(2021) considers a specific model of correlation in which all participants in a pool are close contacts of each other and infections are acquired in a community infection stage followed by homogeneous secondary infections within the pool. As a result, the prevalence in the correlated pool is higher than that in a naive pool (which only assumes community infection). Hence, the model in Comess et al. (2021) is best suited to understanding the joint effect of increasing secondary transmission while pooling related samples together. We argue, however, that the choice of pooling strategy should be based on a comparison of their properties while holding the population's prevalence steady. This is the approach we take in our paper. Second, Comess et al. (2021) theoretically studies a different but related metric of test consumption. A theoretical result therein (Observation 5) defines an efficiency metric that assumes 100% sensitivity of the pooled test and shows that the metric is identical for both pooling methods. This metric, though theoretically tractable, does not fully capture the difference in test consumption in practice, as is reported in their simulation results. Nevertheless, much of the intuition described in Comess et al. (2021) is consistent with our results. In particular, Observation 5 claims that the sensitivity is no worse under correlated pooling than under naive pooling. We prove a similar result in our Theorem 1. In addition, the simulated efficiencies in Figures 6 and 8 of Comess et al. (2021), though not discussed by the authors, indicate that correlated pooling can have lower efficiency than naive pooling, which we demonstrate is possible in Section 3.4. Beyond viral testing, group testing with correlation has been studied in the signal processing community. For example, graph structures may induce correlation among nodes and edges (Ganesan et al. 2017) or impose constraints on pool formation (Cheraghchi et al. 2012).
In this section, we build a mathematical model for a single pool with and without correlation among samples. We show that within-pool correlation in the classical two-stage Dorfman procedure (Dorfman 1943) results in higher test sensitivity and lower or comparable test consumption for identifying positives, which offers great value to epidemic mitigation given limited testing capacity in a global pandemic. In Section 4, we will discuss how this single-pool model is well-justified in the context of large-scale screening. We study the performance of pools with and without correlation among their samples under the Dorfman procedure, the most widely studied and adopted group testing protocol. Each pooled test or individual test is performed using a polymerase chain reaction (PCR) test. We consider a pooled test with its pool size equal to n and the prevalence in the pool equal to α. That is, each sample is positive with marginal probability α. We operate in the asymptotic regime of α → 0 in our subsequent analyses. We focus on this regime for two reasons. First, pooling is most preferable under low prevalence (Kim et al. 2007, Pilcher et al. 2020). Like most of the literature studying group testing, we continue to focus on this regime. Second, when screening is applied promptly and regularly since the early stage of the epidemic, the population-level prevalence will likely be consistently low. Despite this focus, we will show in Section 6 that the benefit offered by within-pool correlation is robust, even when the prevalence is as high as 10%. We consider two pooling strategies, naive pooling and correlated pooling, where viral loads of the samples in the same pool are independent under naive pooling and may be correlated under correlated pooling. Let {V_i}_{i=1}^n denote the viral loads of the samples in a pool of size n. Hereafter, we use the compact notation V_{1:n} to denote V_1, ..., V_n.
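The distinction between the two pooling strategies can be made concrete with a toy Monte Carlo sketch. The pair-level correlation model below (infections arriving in "households" of two, keeping each sample's marginal prevalence at α) is an assumption for illustration only, not the paper's general correlation structure:

```python
import random

def prob_multiple_given_positive(pool_sampler, trials=200_000):
    """Monte Carlo estimate of P(S > 1 | S > 0), where S is the number of
    infected individuals in a pool drawn by pool_sampler."""
    positive_pools = multi_positive_pools = 0
    for _ in range(trials):
        s = sum(pool_sampler())
        if s > 0:
            positive_pools += 1
            multi_positive_pools += s > 1
    return multi_positive_pools / positive_pools

alpha, n = 0.01, 10  # illustrative prevalence and pool size

def naive_pool():
    # Each sample infected independently with probability alpha.
    return [random.random() < alpha for _ in range(n)]

def correlated_pool():
    # Infections arrive in pairs, so each sample still has marginal
    # prevalence alpha, but positives co-occur within a pool.
    statuses = []
    for _ in range(n // 2):
        infected = random.random() < alpha
        statuses += [infected, infected]
    return statuses
```

Conditional on containing a positive, a naive pool rarely contains a second one at low prevalence, whereas under the paired model a positive pool always does; this persistent conditional co-occurrence is exactly the feature that the correlated-pooling analysis below exploits.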
If individual i is infected, then V_i > 0; otherwise V_i = 0 (Footnote 1). Let P_{i,α} and E_{i,α} denote the probability and expectation operators, respectively, under pooling method i (where i = 0 is naive pooling, and i = 1 is correlated pooling) and prevalence level α. We drop the subscripts when the probability or expectation does not depend on the pooling method or the prevalence level. We state two assumptions and one condition necessary for proving the theoretical results at the pool level. Assumptions 1 and 2 are always assumed, whereas Condition 1 is assumed when explicitly stated. Section 4 will show that Assumptions 1, 2 and Condition 1 hold in a natural population-level model of pooling.
Condition 1. In the correlated pool, conditioning on the presence of at least one positive in the pool creates a strictly positive probability that at least two individuals in the same pool are infected, even in the limit as prevalence approaches 0. Mathematically, lim_{α→0} P_{1,α}(S > 1 | S > 0) > 0, where S denotes the number of infected individuals in the pool.
Condition 1 is motivated by the within-pool correlation arising from pooling members of the same social group together, as described in Section 1. It describes the main feature that differentiates the two pooling strategies: as prevalence approaches zero, the probability that a positive-containing pool contains more than one positive sample diminishes for naive pools (under Assumption 2) but persists for correlated pools (under Condition 1). Having defined the two pooling strategies, we now model the test outcomes. A PCR test performed on a sample with viral load v returns a positive result with probability p(v) and a negative result with probability 1 − p(v).
Footnote 1: In reality, it is possible that the viral load varies across different body parts of the same individual and sampling practice can induce further noise in the sample viral load. Here we conflate individual viral load and sample viral load, assuming homogeneity of viral load within the same individual and no loss/noise in sampling. Hence, V_i > 0 is a surrogate for whether an individual is infected or not.
We refer to p(v) as the sensitivity function. Here we assume p(0) = 0, i.e., no false positives; later in Section 6.2.2, we argue that a small individual test FPR, e.g., 0.01% (Public Health Ontario 2020), implies an FPR of correlated pooling low enough for its deployment in repeated large-scale screening. We further assume that p(v) > 0 for v > 0, that p is monotone increasing in v, and that the result of a PCR test, whether individual or pooled, is conditionally independent from any other PCR test given its sample viral load. We define the following variables for the outcomes of a two-stage Dorfman procedure with pool size n. For a pooled test with viral loads V_1, ..., V_n in the input samples, we assume pooling leads to a dilution factor equal to the pool size (Footnote 2). Hence, the pooled test returns positive with probability p(V̄_n), where V̄_n = (1/n) Σ_{j=1}^n V_j. Let Y = Ber(p(V̄_n)) denote the outcome of the pooled test in the first stage. Let W_j = Ber(p(V_j)) denote what the outcome of the individual test for sample j with viral load V_j will be, if it is performed. Let D = Σ_{j=1}^n Y W_j denote the number of positives identified in the pool, i.e., the number of positive samples that test positive in the second stage given the pool tests positive in the first stage. We note that the conditional independence assumption above implies that the pooled test and individual tests are conditionally independent given the viral loads in the participating samples, i.e., Y ⊥⊥ W_j | V_{1:n}, j = 1, ..., n. The sensitivity of a group testing protocol is critical for epidemic control. As discussed earlier, pooled tests face a loss of sensitivity due to the positive sample(s) getting diluted in the pool. Here, we examine the sensitivity of the two-stage Dorfman procedure with (i.e., correlated pooling) and without (i.e., naive pooling) correlation among samples. To achieve this, we first define a metric for the sensitivity of a group testing protocol. Definition 1.
Let β_{0,α} and β_{1,α} denote the overall false negative rate, or the fraction of positive samples that are falsely declared negative in the two-stage Dorfman procedure under prevalence α, in the naive and correlated pools, respectively (Footnote 3). That is,
β_{i,α} = 1 − E_{i,α}[D] / E_{i,α}[S], i = 0, 1.
Footnote 2: Though assuming a dilution factor of n here, our theoretical results are easily generalizable to other dilution factors.
Footnote 3: Our goal here is to evaluate a pooling strategy when screening an entire population using many pools. Therefore, rather than 1 − E_{i,α}[D/S], which is the expected false negative rate of a single pool, we focus on 1 − E_{i,α}[D]/E_{i,α}[S], which we argue is the right metric for the population-wide false negative rate. Indeed, in the asymptotic regime in the population-wide screening context described in Section 4, under mild assumptions, the number of positives found per pool converges in probability to E_{i,α}[D] and the number of infected individuals per pool converges in probability to E_{i,α}[S]. Thus, by the continuous mapping theorem, as long as E_{i,α}[S] > 0, the fraction of positives found converges in probability to E_{i,α}[D]/E_{i,α}[S]. Other metrics in Section 3 are defined with the same logic.
Under this metric, we present our main result in Theorem 1. We show that under a general class of sensitivity functions, the two-stage Dorfman procedure using correlated pooling achieves a lower overall false negative rate than naive pooling in the low-prevalence setting.
Theorem 1. lim_{α→0+} β_{1,α} ≤ lim_{α→0+} β_{0,α}. If p(v) is strictly monotone increasing and Condition 1 holds, then the inequality is strict.
Here we provide a proof sketch of Theorem 1. A complete proof is given in Appendix A.1.
Proof sketch of Theorem 1. For both i = 0, 1, we can show that the overall false negative rate is given by
β_{i,α} = 1 − E_{i,α}[p(V̄_n) Σ_{j=1}^n p(V_j)] / E_{i,α}[S].
For naive pooling, the V_i's are i.i.d. As α → 0+, the probability that a positive pool contains multiple positive samples vanishes, and we can show that
lim_{α→0+} β_{0,α} = 1 − E[p(V_1/n) p(V_1) | V_1 > 0].
For correlated pooling, a positive pool contains multiple positives with non-negligible probability. When S ≥ 2, we have V̄_n > V_1/n because there exists at least one other sample with positive viral load. Since p(v) is a monotone increasing function in v, we obtain p(V̄_n) ≥ p(V_1/n), which, combined with p(V_1) > 0 given V_1 > 0, implies that each positive sample is detected with at least the probability it would have under naive pooling. Therefore, taking α → 0+ gives lim_{α→0+} β_{1,α} ≤ lim_{α→0+} β_{0,α}. The inequality is strict if p(v) is strictly increasing in v and Condition 1 holds.
In addition to accuracy, test consumption is another key consideration. In large-scale screening with limited resources, the ability to screen many individuals with few tests significantly expands the testing capacity, which translates to better epidemic mitigation (Mercer and Salit 2021). In this section, we investigate the test consumption of the two-stage Dorfman procedure under naive and correlated pooling. One metric commonly used in the literature is efficiency, i.e., the number of individuals screened per PCR test (Kim et al. 2007, Westreich et al. 2008). However, a higher efficiency does not necessarily indicate better epidemic control. In reality, it is through the identification and isolation of positive cases that screening most directly mitigates the epidemic spread. Thus, in lieu of the standard metric, we examine the test consumption of a testing protocol by the number of positive cases identified per PCR test consumed, as it better captures the value of a testing protocol in epidemic control. Recall that in a size-n pool, n follow-up individual tests are performed if the pool tests positive, i.e., Y = 1; no follow-up individual tests are performed otherwise. We now formally define the metric we use for test consumption.
Definition 2. Let γ_{0,α} and γ_{1,α} denote the expected number of positive cases identified per PCR test consumed under prevalence α in the naive and correlated pools, respectively. That is,
γ_{i,α} = E_{i,α}[D] / (1 + n E_{i,α}[Y]), i = 0, 1.
Note that γ_{i,α} differs from efficiency in the numerator: the numerator of efficiency is the pool size n, while the numerator of γ_{i,α} is E_{i,α}[D], the number of positive cases identified in the pool. Since E_{i,α}[D] = nα(1 − β_{i,α}), it follows that γ_{i,α} is directly proportional to the efficiency and the sensitivity (i.e., 1 − β_{i,α}) of the pooling strategy. We will see in Section 6.3 that γ_{i,α} is the key metric for measuring the effectiveness of a group testing protocol in epidemic control. To understand the behavior of γ_{i,α}, we can rewrite Equation 1 in Definition 2 as

γ_{i,α}^{-1} = 1/(nα(1 − β_{i,α})) + n · P_{i,α}(Y = 1)/E_{i,α}[D], i = 0, 1.    (2)

The first term on the right hand side (RHS) of Equation 2 depends on the overall false negative rate, and Theorem 1 showed that lim_{α→0+} β_{1,α} ≤ lim_{α→0+} β_{0,α}. Therefore, to examine the number of positive cases identified per PCR test, it suffices to compare the second term on the RHS of Equation 2. We formally define this quantity.

Definition 3. Let η_{0,α} and η_{1,α} denote the expected number of followup individual tests consumed per positive case identified under prevalence α in the naive and correlated pools, respectively. That is,

η_{i,α} = n · P_{i,α}(Y = 1) / E_{i,α}[D], i = 0, 1.

In Theorem 2, we show that in the low-prevalence setting, the two-stage Dorfman procedure using correlated pooling consumes at most a factor 1 + δ more followup tests per positive case identified than naive pooling, where the constant δ is determined by the viral load distribution among infected individuals, the PCR test mechanism, and the pooling strategy. By Equation 2, this bound also applies to γ_{i,α}^{-1}. In the relatively simple case where a PCR test deterministically reports whether the sample viral load exceeds a threshold value, correlated pooling consumes no more followup tests per positive case identified than naive pooling. This is formalized in Corollary 1.

Corollary 1. Suppose the sensitivity function is p(v) = 1{v ≥ u_0} for some non-negative constant u_0. Then lim_{α→0+} η_{1,α}/η_{0,α} ≤ 1.
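The algebra relating γ_{i,α}, β_{i,α}, η_{i,α}, and efficiency can be verified numerically. The pool-level values below are illustrative placeholders, not estimates from the paper:

```python
# Illustrative pool-level quantities (hypothetical values).
n = 10          # pool size
alpha = 0.01    # prevalence
P_pos = 0.08    # P(Y = 1): probability the pooled test is positive
E_D = 0.07      # E[D]: expected positives identified per pool

E_S = n * alpha                        # E[S], expected positives per pool
beta = 1 - E_D / E_S                   # overall false negative rate
eta = n * P_pos / E_D                  # follow-up tests per positive found
gamma = E_D / (1 + n * P_pos)          # positives identified per PCR test
eff = n / (1 + n * P_pos)              # individuals screened per PCR test

# Equation 2: gamma^{-1} = 1 / (n * alpha * (1 - beta)) + eta.
assert abs(1 / gamma - (1 / (n * alpha * (1 - beta)) + eta)) < 1e-12
# gamma is proportional to sensitivity times efficiency.
assert abs(gamma - alpha * (1 - beta) * eff) < 1e-12
```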
In reality, the sensitivity of a PCR test, albeit not exactly a step function of the sample viral load v, closely resembles one in that it increases rapidly from zero to one within a narrow range of v (see, e.g., Figure 1 in Section 5.3). Section 5.3 further shows that, under a realistic sensitivity function, viral load distribution, and pool size, the bound in Theorem 2 is almost equal to one.

Existing literature claims that within-pool correlation leads to better efficiency (Comess et al. 2021, Augenblick et al. 2020, Lendle et al. 2012, Deckert et al. 2020, Lin et al. 2020, Basso et al. 2021). While the claim is true under simplified assumptions such as noise-free tests, we show in this section that it does not hold in general. We first relate efficiency to the metrics investigated in previous sections. For any prevalence α, efficiency can be expressed in terms of β_{i,α} and η_{i,α} as follows:

efficiency_{i,α} = n / (1 + nα(1 − β_{i,α}) η_{i,α}), i = 0, 1.    (3)

We identify scenarios where correlated pooling could have lower efficiency. First, Theorem 2 showed that η_{i,α} may be higher under correlated pooling, so by Equation 3, it is certainly possible that efficiency is also lower under correlated pooling. Second, even in settings where correlated pooling has lower η_{i,α} and β_{i,α}, the product η_{i,α}(1 − β_{i,α}) may still be higher under correlated pooling, which leads to lower efficiency. Indeed, in Appendix B, we construct a stylized example where both of the above scenarios occur, resulting in correlated pooling having lower efficiency than naive pooling. The example also shows that lim_{α→0+} η_{1,α} can be strictly larger than lim_{α→0+} η_{0,α}, which necessitates the (1 + δ) bound in Theorem 2.

The goal of this section is to show that our pool-level model in Section 3 is well-justified in a population-wide screening context. To achieve this, we first describe a model at the population level and use it to justify Assumptions 1 and 2 and Condition 1 made in Section 3.
We consider a population of N individuals, each associated with a unique index in {1, . . . , N}, which we call the "population index". Slightly abusing notation, we let V_1, . . . , V_N denote the viral loads of these individuals; they are correlated random variables following some joint distribution. This can model, for example, the spread of disease in a population based on geographic locations and demographics. We let α denote the overall prevalence, the probability that a person chosen uniformly at random from the population has a positive viral load. In Section 3 we used α to denote the prevalence in an individual pool; later, we will see that the prevalence in an individual pool constructed as described in this section equals the overall prevalence.

Section 3 defined correlated pooling and naive pooling at the pool level. Here we describe how these two pooling strategies are implemented at the population scale. For simplicity, we assume N is a multiple of n, so that we can divide the population into N/n groups of size n.⁴ Individuals assigned to the same group participate in the same pool in the two-stage Dorfman procedure. Let any pooling assignment be represented by A := {A_j : j = 1, . . . , N/n}, a random partition (defined below) of {1, . . . , N} into N/n groups of size n. Pool j contains the samples with viral loads {V_i : i ∈ A_j}. Naive pooling and correlated pooling form the pools in the following manner.

Naive pooling: each pool is formed by picking n individuals uniformly at random from the population without replacement.

Correlated pooling: pools are formed in ways that preserve correlation among samples in a pool. The within-pool correlation either occurs naturally or can be enhanced by explicit measures. For example, at testing centers established on college campuses and in neighborhoods, samples are most likely from groups of people that live, study, and work in proximity to each other, preserving correlation.
Moreover, test kits can be mailed to households for self-collection (Stanford Medicine 2020). Samples from the same household can then be transported to laboratories together and tested in the same PCR test, preserving correlation. For both correlated and naive pooling, once the pools are formed, we reorder the samples in each pool by applying independent random permutations of 1 through n.

We think of the social structure of the population (which has not yet been specified in our model) as influencing both the pooling assignments and the identities of infected individuals in the population. Below we model one aspect of this social structure, namely the set of close contacts associated with each individual, as deterministic. We assume the pooling assignments are chosen independently of the viral loads in the population; indeed, dependence on the social structure does not necessarily break this independence in our formal model. We define the probability measures P^{(N)}_{0,α} and P^{(N)}_{1,α} for naive and correlated pooling, respectively.⁵ Here, we use a superscript (N) to index quantities computed for a size-N population; as in Section 3, we use subscripts i, α to index quantities computed under pooling method i (where i = 0 is naive pooling and i = 1 is correlated pooling) and prevalence level α. The subscript i (or α) is dropped when the quantity does not depend on the pooling method (or prevalence level).

⁴ If N is not a multiple of n, for modeling purposes we fill the empty spaces in the last pool with artificial negative samples. All of our subsequent analyses then still apply, with the prevalence among the pools changed to α̃ = αN/(⌈N/n⌉ · n) ∈ (α(1 − O(1/N)), α), which converges to α as N → ∞.

⁵ From a measure-theoretic perspective, the random quantities A, V_i are mappings from the event space to the (measurable) state space.
The mappings themselves do not depend on N, but the distributions of these random quantities under P^{(N)}_{i,α} do.

To study properties of the pools constructed, it suffices to choose a pool uniformly at random and analyze its properties. Formally, let J be chosen uniformly at random from {1, . . . , N/n}; that is, A_J is a pool chosen uniformly at random. Mirroring the notation used in Section 3, we let S = Σ_{i∈A_J} 1{V_i > 0} be the number of positives in the randomly chosen pool. In Section 4.3, we assume the joint distributions of {V_i : i ∈ A_J} and S induced by P^{(N)}_{0,α} and P^{(N)}_{1,α} (for naive and correlated pooling, respectively) have a limit as N goes to infinity. We show that these limiting joint distributions satisfy the assumptions and conditions imposed on P_{i,α} in Section 3 for i = 0, 1. Thus, we consider P_{i,α} to be equal to the limit of P^{(N)}_{i,α} as N → ∞. Hence, we can view the analysis of a single pool in Section 3 as an analysis of the randomly chosen pool A_J for large N, producing quantities equal to population-level averages.

Having established the population-level model and justified the use of a randomly chosen single pool for analyzing a pooling strategy, we examine the distribution of the sample viral loads in the chosen naive and correlated pool. First, we show that sample viral loads in the two pools are identically distributed for each N, which justifies Assumption 1 in Section 3. It follows that the pool prevalence in Section 3 equals the population-level prevalence α here in Section 4. Then, we take N to the asymptotic regime to justify Assumption 2 in the cases where screening is implemented at a large scale, e.g., in a city or town. We argue that samples in a naive pool are asymptotically independent as N → ∞. To achieve this, we first define a measure of association between the viral loads of one individual and a group of individuals; see Appendix C.2 for details.
We then assume that for any subset of a pool, the number of individuals in the remaining population with association stronger than some threshold scales sublinearly in the population size (Assumption EC.1). This enables us to derive the following asymptotic independence property of naive pooling, which justifies Assumption 2 in Section 3.

Proposition 2. Under Assumption EC.1, the viral loads of individuals in a randomly chosen naive pool are asymptotically independent as N → ∞.

We now characterize the correlation between sample viral loads in a correlated pool based on the notion of "close contacts". Infected individuals and their close contacts are assumed to be correlated in infection status and are likely to be placed into the same pool under correlated pooling. We formalize these assumptions mathematically.

Assumption 3. For each individual i in the population, let C_i denote the set of his/her close contacts. We model C_i as deterministic. The following hold for any α and any N: (i) each individual's infection probability is either zero or bounded above and below by constants times the population-level prevalence; (ii) every individual with non-zero infection probability has at least one close contact; (iii) the infection statuses of close contacts are positively correlated; and (iv) P^{(N)}_{1,α}(j is in the same pool as i) ≥ c₂ for all j ∈ C_i, for some constant c₂ > 0.

Assumption 3 captures important features of the spread of infectious diseases and of the correlated pooling strategy. The first sub-assumption prescribes that each individual in the population either could never be infected due to social isolation, or could be infected with both the lower and upper bounds on the infection risk on the same order as the population-level prevalence. The second sub-assumption is well-justified, since an individual with non-zero infection risk must have at least some human-to-human contact. The third sub-assumption is supported by ample evidence in the literature for transmission between infected individuals and their close contacts (World Health Organization 2020, Madewell et al. 2020).
The fourth sub-assumption describes the key feature assumed for correlated pools, namely that individuals who are close contacts of each other are placed into the same pool with a non-vanishing probability even as N goes to infinity. This is justified because in large-scale screening using group testing, correlation either arises naturally or can be enhanced through explicit measures, as discussed in Section 4.1. Assumption 3 allows us to derive a property of correlated pools that justifies Condition 1 in Section 3. Finally, we argue in Appendix C.5 that the metrics studied in Section 3 (β_{i,α}, γ_{i,α}, and η_{i,α}) are appropriate for evaluating a group testing protocol in the population-wide screening context. Specifically, we show that quantities computed at the population level (e.g., the fraction of positive samples in the population missed by testing) converge in probability to the corresponding pool-level metrics in Section 3. Our arguments rely on two mild assumptions not used elsewhere in this paper; see Appendix C.5 for details.

In this section, we present a realistic model for viral load among infected individuals. We then describe a realistic sensitivity function for the PCR test that captures the randomness in the subsampling and pooling processes, an aspect overlooked by most existing literature studying group testing protocols. We find that, under these two models, correlated pooling consumes no more followup tests per positive identified than naive pooling. In Section 6, we will use these models to perform a realistic case study of correlated pooling implemented based on household membership.

We first specify a probability distribution governing viral loads across infected individuals. One way to quantify the viral load in a sample is with the so-called Ct value. A PCR test amplifies the viral RNA copies in a sample by approximately doubling them in each cycle of the reaction.
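This doubling arithmetic has a simple quantitative consequence that is easy to script: the number of doubling cycles needed to reach a fixed detection threshold, and hence the effect of pooled dilution on that number. The detection threshold value below is a hypothetical placeholder.

```python
import math

def ct_value(initial_copies, detect_threshold=1e7):
    """Cycles of exact doubling needed for the RNA copies to reach the
    detection threshold (an idealization of real PCR amplification)."""
    if initial_copies <= 0:
        return float("inf")         # a negative sample is never detected
    return max(0.0, math.log2(detect_threshold / initial_copies))

# Lower initial viral load -> more cycles needed, i.e., larger Ct value.
assert ct_value(1e3) > ct_value(1e5)
# Diluting a sample n-fold in a size-n pool adds log2(n) cycles.
n = 10
assert abs(ct_value(1e4 / n) - (ct_value(1e4) + math.log2(n))) < 1e-9
```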
The minimum number of cycles required for the RNA copies to reach a detectable threshold is called the cycle threshold, denoted Ct (Heid et al. 1996). The lower the initial viral load in the sample, the more amplification cycles it requires to become detectable, and the larger its Ct value. In our simulation, we assume the viral load of any individual is independent of the viral loads of all other individuals given his/her infection status. This assumption is mild given the heterogeneity in individuals' biological responses to the virus, which we consider independent. Hence, for each infected individual, we sample his/her viral load from the distribution specified in Table EC.3.

Whether an individual is infected can be detected by a PCR test. To investigate the performance of a pooling strategy, we now specify how the sensitivity of PCR tests depends on viral load. Most existing mathematical models of group testing treat false negatives of PCR tests in an oversimplified way, either assuming a fixed false negative rate or one that is a simple deterministic function of the sample viral load (see Section 2.1). In reality, before entering the PCR machine, a sample undergoes multiple steps of processing (e.g., subsampling and extraction), each of which introduces stochasticity into the amount of viral RNA that remains. Based on the sample handling methodology described in Wyllie et al. (2020) and the mathematical modeling of liquid partitioning in Basu (2017), we lay out the steps in a size-n pooled test. We discuss the randomness associated with each step and how it impacts the final test outcome in Appendix D.2. Our modeling of the PCR test is one instantiation of the general sensitivity function p(v) discussed in Section 3.1, with the exception that in this realistic model p(v) = 0 for very small v. Recall that Theorem 2 derived a bound 1 + δ for the ratio of the test consumption of correlated pooling to that of naive pooling.
We now examine this bound in a realistic setting, given the PCR model and the viral load distribution in Sections 5.1 and 5.2. We show that in this setting, correlated pooling consumes no more followup tests per positive identified than naive pooling for a wide range of pool sizes and PCR test sensitivities (80%–97.5%). Since the distribution of S in the pool is not specified in our model, we give an upper bound δ̄ for δ that can be estimated directly using Monte Carlo simulation; the derivation of δ̄ is given in Appendix E. Using Monte Carlo simulation with 10⁶ replications, we find that across a wide range of β̄ and pool sizes, δ̄ is consistently close to zero. The maximum value of δ̄ is 8.96 × 10⁻⁵ (95% CI: (8.90 × 10⁻⁵, 9.02 × 10⁻⁵)), obtained when n = 2 and β̄ = 2.5%. As n increases, the relaxed bound 1 + δ̄ converges to 1, suggesting that in this realistic setting correlated pooling consumes no more followup tests per positive identified than naive pooling. For detailed methodology and results of the simulation, see Appendix E.3.

Now we provide intuition for why δ̄ is small. We first examine a representative curve of PCR test sensitivity versus sample viral load under β̄ = 5%. Based on the viral load distribution among infected individuals given in Table EC.3, when β̄ = 5%, the PCR test sensitivity grows rapidly from 0 to 1 over a narrow range of log viral load in the sample (as shown in Figure 1). Specifically, a log₁₀ viral load of 3.45 gives a PCR test sensitivity of 0.3%, while a log₁₀ viral load of 3.65 gives a PCR test sensitivity of 99.8%. The fraction of infected individuals with log₁₀ viral load between 3.45 and 3.65 is only 2.8%, indicating that the majority of positive samples either test positive with high probability (if the log₁₀ viral load is above 3.65) or test positive with low probability (if the log₁₀ viral load is below 3.45). Though not depicted here, the p(v) curves corresponding to different β̄ follow the same pattern.
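This intuition can be probed with a toy Monte Carlo experiment. The steep logistic sensitivity curve and log-normal viral-load distribution below are hypothetical stand-ins for the calibrated curve in Figure 1 and the distribution in Table EC.3; the point is only that a pool of individually undetectable positives almost never tests positive, while a single individually detectable positive usually survives n-fold dilution.

```python
import math
import random

random.seed(2)

def p(v, mid=3.55, k=58.0):
    # Hypothetical steep logistic sensitivity in log10 viral load.
    if v <= 0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-k * (math.log10(v) - mid)))

def draw_load(detectable):
    """Hypothetical log-normal infected viral load, conditioned by
    rejection sampling on the sample's individual-test outcome."""
    while True:
        v = 10 ** random.gauss(5.0, 1.5)
        if (random.random() < p(v)) == detectable:
            return v

n, reps = 10, 10_000

# Numerator-type event: all n samples positive but none individually
# detectable (S_D = 0, S = n); how often does the pool test positive?
pool_pos_hidden = sum(
    random.random() < p(sum(draw_load(False) for _ in range(n)) / n)
    for _ in range(reps)) / reps

# Denominator-type event: one individually detectable positive diluted
# n-fold (S_D = S = 1); how often does the pool test positive?
pool_pos_single = sum(
    random.random() < p(draw_load(True) / n)
    for _ in range(reps)) / reps
```

Under these toy inputs, pool_pos_hidden comes out close to zero while pool_pos_single is substantial, matching the argument that δ̄'s numerator is small and its denominator is large.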
Based on the above observations, we argue that correlated pooling's test consumption per positive identified essentially matches or improves upon that of naive pooling in practice. We first observe that P_{1,α}(Y = 1 | S_D = 0, S = n), which appears in the numerator of δ̄, is small. If all n samples in a pool are positive yet would each test negative individually (i.e., S_D = 0 and S = n), then they likely all have viral loads below the narrow region where an individual test's sensitivity rises. Thus, the viral load in the pool, which is the average of the viral loads of these positive samples, is likely also below the narrow region, making the pool likely to test negative, i.e., Y = 0. On the other hand, we argue that P(Y = 1 | S_D = S = 1), which appears in the denominator of δ̄, is reasonably large. In other words, if a pool contains only one positive sample and that sample would test positive individually, then the pool is likely to test positive. With its viral load drawn from the distribution described in Table EC.3, a positive sample that would test positive individually has its viral load well above the narrow region with reasonably large probability. Hence, even when such a sample is diluted by a factor equal to the pool size, the pooled sample likely still has its viral load above the narrow region and is likely to test positive, i.e., Y = 1. We support this argument with the numerical results presented in Appendix E.3, Table EC.6.

To illustrate our analysis of correlated pooling above, we simulate a setting where within-pool correlation is induced by household transmission. We demonstrate that correlated pooling consistently outperforms naive pooling in terms of both sensitivity and efficiency. More importantly, we show that correlated pooling implemented at a large scale enables more effective epidemic control. The improvement in sensitivity and efficiency achieved by correlated pooling relative to naive pooling depends on the joint distribution of viral loads across samples in the pool.
To model this effect realistically, we propose a model where individuals belong to households, infections are transmitted between household members, and correlated pooling is implemented based on household membership. This is a representative and pragmatic instance of correlated pooling in large-scale screening, as it is logistically practicable to collect samples by household and subsequently pool households together. We first describe the simulation setup in detail, separating our discussion into assumptions about within-household transmission and about pooling assignment. Then, we demonstrate that correlated pooling consistently outperforms naive pooling in both sensitivity and efficiency. We further analyze how the demonstrated advantages of correlated pooling translate to real-world policy-making. In particular, we focus on population-wide screening as a mitigation measure.

We gather the household size distributions of four countries from census data and assume that all probability mass on H > 6 is allocated to H = 6 (Table EC.7). We also explore variants of the U.S. census data, in which we either add to or subtract from the weight on household size one and adjust the weights on the other household sizes accordingly (Table EC.).

A household is said to be infected if one person is infected as the index case in the household. We assume different households are infected independently with probability p_h, i.e., correlation through other social groups is considered negligible. Within each infected household, we assume transmissions occur independently with secondary attack rate (SAR) q. That is, given a positive index case in a size-h household, the remaining h − 1 members become infected independently with probability q. We consider the following possible values for q: 0.166, 0.140, 0.193, 0.005, and 0.446.
These are, respectively, the estimated mean, the 95% CI lower and upper bounds, and the minimum and maximum values of the household SAR across 40 studies reported in a meta-analysis (Madewell et al. 2020). The distribution of household size H, the probability p_h that a household is infected, and the secondary attack rate q together yield an expected prevalence in the population, which we match to the overall population-level prevalence α:

α = p_h · E[1 + (H − 1)q] / E[H].    (4)

We now describe the steps for simulating correlated infections within households, given a fixed population-level prevalence, SAR, and household size distribution:
1. Compute the household infection probability p_h using Equation 4.
2. Generate households with sizes drawn from the household size distribution.
3. Let each household be infected independently with probability p_h, with one member selected uniformly at random as the index case.
4. In each infected household, generate secondary infections.
5. Assign to each infection a viral load sampled from the distribution described in Table EC.3.

The two pooling strategies then assign samples to pools as follows:
• Naive pooling: we perform an independent random permutation of all the individual samples from the population and place them sequentially into pools, regardless of household membership.
• Correlated pooling: we aim to place samples of individuals from the same household in the same pool. A collection of partially full pools is maintained and households are added sequentially. To add a household, we look for the first unfinished, capacity-permitting pool and place all samples of the household into this pool. If this is infeasible, we split the household across two or more pools.

As in the Dorfman procedure, samples in the same pool undergo one pooled test, and all individuals in pools testing positive take followup tests. We assume the amount of sample collected from each individual is enough that no re-sampling is required if a followup test is necessary. This implies that the viral loads in the subsamples used for the pooled test and the followup test are equal.
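Steps 1–4 and the correlated pooling assignment can be sketched as follows. The household size distribution below is a hypothetical placeholder (not the census data of Table EC.7), and the assignment uses a simplified sequential fill that splits a household only when the current pool is full, consistent with the first-fit rule described above.

```python
import random

random.seed(1)

def household_infection_prob(alpha, q, size_dist):
    """Solve Equation 4 for p_h: alpha = p_h * E[1 + (H - 1) q] / E[H]."""
    EH = sum(h * w for h, w in size_dist.items())
    E_inf = sum((1 + (h - 1) * q) * w for h, w in size_dist.items())
    return alpha * EH / E_inf

def simulate_households(n_households, alpha, q, size_dist):
    """Steps 1-4: households, index cases, and secondary infections."""
    p_h = household_infection_prob(alpha, q, size_dist)
    sizes, weights = zip(*size_dist.items())
    households = []
    for _ in range(n_households):
        h = random.choices(sizes, weights)[0]
        status = [0] * h
        if random.random() < p_h:          # household infected
            idx = random.randrange(h)      # index case
            status[idx] = 1
            for j in range(h):             # secondary infections at SAR q
                if j != idx and random.random() < q:
                    status[j] = 1
        households.append(status)
    return households

def correlated_pools(households, n):
    """Fill size-n pools with whole households, splitting a household
    across pools only when the current pool fills up."""
    pools, cur = [], []
    for hh in households:
        for s in hh:
            cur.append(s)
            if len(cur) == n:
                pools.append(cur)
                cur = []
    if cur:
        pools.append(cur + [0] * (n - len(cur)))   # pad with negatives
    return pools

# Hypothetical household size distribution, prevalence 1%, mean SAR 0.166.
dist = {1: 0.28, 2: 0.35, 3: 0.15, 4: 0.14, 5: 0.05, 6: 0.03}
households = simulate_households(50_000, alpha=0.01, q=0.166, size_dist=dist)
pools = correlated_pools(households, n=12)
prevalence = sum(map(sum, households)) / sum(len(h) for h in households)
```

The realized prevalence concentrates around the target α because p_h is calibrated via Equation 4.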
As assumed in Section 5.2, the subsample for the pooled test is smaller than that for an individual test by a factor of the pool size, which results in dilution of the pooled sample. We compare the performance of naive pooling and correlated pooling through multiple numerical experiments under different sets of parameters, investigating the robustness of correlated pooling's advantage over naive pooling. First, we pick a set of parameters as the baseline setting, shown in Table 1. We consider this a representative setting for a medium-sized town in the early stage of an epidemic. The choice of pool size is informed by empirical implementations of group testing for COVID-19 (Fan 2020, Lefkowitz 2020, Barak et al. 2021). We focus on two metrics to evaluate the performance of a group testing protocol, namely sensitivity (i.e., 1 − FNR) and efficiency. Both are important for epidemic mitigation, as high sensitivity helps identify positives accurately, while high efficiency permits more frequent screening under limited resources. Here we present efficiency as the metric for test consumption because it is the most widely used. The performance in the metric γ_{i,α} proposed in Section 3.3, the number of positive cases identified per PCR test, can be inferred by taking the product of sensitivity and efficiency. The performance of naive pooling and correlated pooling in the Dorfman procedure under the baseline setting over 2000 iterations is shown in Table 2. As a reference, individual testing alone has a sensitivity of 95% and an efficiency of 1. Correlated pooling performs better than naive pooling in terms of both sensitivity and efficiency. This is because correlated pooling in general places more positive cases in a positive-containing pool (due to correlation among samples from the same household).
As a result, a sample with low viral load, which might otherwise be missed in naive pooling, is more likely to be "rescued" by other positive samples in the same pool under correlated pooling, leading to higher sensitivity. (This is referred to as the "hitchhiker effect" in Barak et al. (2021).) Meanwhile, the clustering of more positive cases in the same pool also implies that fewer pools contain positive samples and require followup tests, resulting in higher efficiency for correlated pooling. We demonstrate that the advantage of correlated pooling is robust to deviations in parameter values from the baseline setting in Appendix F.2. While such improvement may seem small, it can have a significant impact on real-world policy making. We will show in Section 6.3 that, when the pool size is optimized for each pooling strategy separately, correlated pooling enables more effective epidemic control than naive pooling.

6.2.1. Sensitivity Versus Efficiency Across Pool Sizes. Under the same population-level prevalence, we anticipate that test accuracy and efficiency will vary with the pool size; the other parameters take the values given in Table 1. In most scenarios (except under high prevalence and large pool size), correlated pooling outperforms naive pooling in both sensitivity and efficiency, and in particular when prevalence is low (e.g., 0.1%, Figure 2a).

6.2.2. Test Specificity. As discussed in Section 1, false positives pose challenges to large-scale screening, including waste of public health and economic resources, disruption of personal lives, and increased exposure risk during unnecessary treatment. Though false positives are not explicitly included in our modeling, here we argue that they are not a significant concern if pooling is used. In particular, we demonstrate that group testing has a substantially lower false positive rate (FPR) than individual testing, and, moreover, that correlated pooling achieves a lower FPR than naive pooling.
For our discussion, we start by assuming that false positives originate mainly from lab contamination that occurs independently across tests. We assume any PCR test on a negative sample has a small constant FPR (e.g., 0.01%, as reported in Public Health Ontario (2020)), which is much smaller than the probability that a typical positive-containing pool tests positive. Under these assumptions, the probability that a negative sample in an all-negative pool is declared positive is negligible (e.g., 10⁻⁸) compared to when it is in a positive-containing pool. Hence, we estimate the FPR of a testing protocol by the fraction of negative samples that receive individual tests, assuming they are all in positive-containing pools. This can be inferred directly from our simulation results. First, we compute the fraction of samples in the population receiving individual tests as frac_indiv = efficiency⁻¹ − 1/n. Second, we estimate frac_pos,indiv, the fraction of samples that are positive and receive individual tests, as α · sensitivity. We take the difference of these two quantities to estimate frac_neg,indiv, the fraction of samples that are negative and receive individual tests. Multiplying this difference by 0.01% then gives frac_neg,indiv+, the fraction of samples that are negative and test positive in individual tests. Finally, we divide frac_neg,indiv+ by 1 − α, the fraction of samples that are negative, to obtain the estimate for FPR. We summarize these calculations for correlated pooling and naive pooling in Table 3, based on the simulation results for the baseline setting in Table 2. We see that both pooling strategies achieve an FPR on the order of 10⁻⁶, with correlated pooling slightly outperforming naive pooling. In our regime of discussion, the FPR scales roughly linearly with pool size and prevalence.
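The calculation above can be scripted directly. The efficiency, sensitivity, and pool-size values below are illustrative placeholders rather than the Table 2 estimates:

```python
def estimate_fpr(efficiency, sensitivity, alpha, n, fpr_pcr=1e-4):
    """Screening FPR as estimated above: negatives receive an individual
    test only in positive-containing pools, and each such test is a
    false positive with probability fpr_pcr (0.01%)."""
    frac_indiv = 1 / efficiency - 1 / n           # samples tested individually
    frac_pos_indiv = alpha * sensitivity          # ... that are true positives
    frac_neg_indiv = frac_indiv - frac_pos_indiv  # ... that are negatives
    return fpr_pcr * frac_neg_indiv / (1 - alpha)

# Illustrative inputs (hypothetical, not Table 2's values).
fpr_pooled = estimate_fpr(efficiency=6.0, sensitivity=0.80, alpha=0.01, n=10)
fpr_individual = 1e-4          # baseline: every sample tested once
```

With these inputs the pooled FPR lands on the order of 10⁻⁶, far below the 10⁻⁴ of individual testing.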
Hence, for a prevalence of up to 1% and a pool size of up to 20, we expect the FPR of either pooling strategy to be at least as good as 10⁻⁵, a ten-fold reduction from the FPR of individual testing. Such specificity is sufficiently high for many uses of repeated screening for infection control. We also argue that false positives from PCR tests have little impact on efficiency, i.e., they incur only a small number of extra tests. In the pooled stage, 0.01% of the all-negative pools are expected to test positive and require followup tests for their samples. As the number of samples in all-negative pools is upper bounded by N, the extra tests due to PCR false positives translate to an increase of less than 10⁻⁴ in the number of tests per person. Moreover, sensitivity is not affected by false positives of PCR tests.

In this section, we study how the improvement in sensitivity and efficiency due to correlated pooling translates to more effective epidemic control. Specifically, we show that, when used for repeated large-scale screening, correlated pooling requires 12.9% fewer tests per day than naive pooling to stabilize or reduce the number of active infections in a population with 1% prevalence. We consider a setting where policy makers of a city wish to choose a pool size and screening frequency when using group testing for population-wide screening. We represent the epidemiological dynamics with a deterministic SIR model (Kermack and McKendrick 1927) that incorporates screening. We let S and I denote the fractions of susceptible individuals and active infections in the population, and let R denote the fraction of the population "removed" due to either natural recovery or detection and isolation in screening followed by recovery. We assume, for simplicity, that an infected individual is infectious and that a recovered individual does not become susceptible again. We also assume a constant fraction of the non-isolated population is screened every day.
We use a set of three discrete-time equations to represent the disease dynamics, where a time step corresponds to a day:

S(t + 1) = S(t) − b_I · S(t)I(t),
I(t + 1) = I(t) + b_I · S(t)I(t) − (b_R + f · sensitivity) · I(t),    (5)
R(t + 1) = R(t) + (b_R + f · sensitivity) · I(t),

where b_I is the rate of transmission given an interaction between a susceptible and an infected person; b_R is the rate at which an infected individual recovers on any day (we assume b_I > b_R, since the epidemic dies out naturally even without intervention if b_I ≤ b_R); and f is the frequency of screening for non-isolated individuals, i.e., those in the S and I groups. We first derive the critical screening frequency required to control the epidemic, i.e., to stabilize or reduce the number of active infections. To quantify the epidemic growth, we define the "growth factor" λ at time t as the ratio of the number of new cases at time t to the number of cases removed at time t:

λ(t) = b_I · S(t)I(t) / ((b_R + f · sensitivity) · I(t)).

According to Equation 5, the number of infected individuals grows when λ(t) > 1 and declines when λ(t) < 1. We further construct a time-invariant upper bound on λ(t) by setting S(t) = 1:

λ̄ = b_I / (b_R + f · sensitivity).    (6)

Alternatively, λ̄ can be interpreted as the growth factor in the early stage of the epidemic, when the majority of the population is susceptible, i.e., S(t) ≈ 1. Since λ(t) ≤ λ̄ for all t, any screening frequency f that results in λ̄ < 1 also implies λ(t) < 1 for all t. Therefore, we use λ̄ = 1 as a threshold that characterizes whether the epidemic is brought under control. At this threshold, the screening frequency has a critical value f* satisfying

f* = (b_I − b_R) / sensitivity.    (7)

A larger value of f would reduce λ̄ even further, but it would also increase test consumption, a key quantity of practical concern. Hence, we next use f* to derive the minimum test consumption required for epidemic control. For a screening frequency f, test consumption per day satisfies:

test consumption per day ∝ screening frequency × #tests consumed per person = f × efficiency⁻¹.
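The dynamics and the critical frequency can be sketched directly. The rates b_I, b_R and the sensitivity value below are hypothetical:

```python
def simulate(b_I, b_R, f, sensitivity, S0=0.99, I0=0.01, days=200):
    """Discrete-time SIR with daily screening at frequency f (Equation 5)."""
    S, I, R = S0, I0, 0.0
    for _ in range(days):
        new_infections = b_I * S * I
        removed = (b_R + f * sensitivity) * I
        S, I, R = S - new_infections, I + new_infections - removed, R + removed
    return S, I, R

def critical_frequency(b_I, b_R, sensitivity):
    """Smallest f making the worst-case growth factor
    b_I / (b_R + f * sensitivity) equal to 1 (Equation 7)."""
    return (b_I - b_R) / sensitivity

b_I, b_R, sens = 0.15, 0.10, 0.85          # hypothetical daily rates
f_star = critical_frequency(b_I, b_R, sens)

S_sc, I_sc, R_sc = simulate(b_I, b_R, 1.1 * f_star, sens)  # screen above f*
S_ns, I_ns, R_ns = simulate(b_I, b_R, 0.0, sens)           # no screening
```

Screening slightly above f* drives active infections down from the start, while without screening the epidemic runs its course and removes a far larger fraction of the population.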
By Equations 6 and 7, minimum test consumption per day ∝ f* × efficiency^{-1}, where we recall that both sensitivity and efficiency depend on the pool size, the prevalence level, and the pooling choice (whether correlated or naive pooling is adopted). As discussed in Section 3.3, γ_{i,α}, the expected number of positives identified per PCR test, is directly proportional to sensitivity × efficiency. Hence, for a given pool size, γ_{i,α}^{-1} provides a proxy for the minimum test consumption that enables epidemic control. Therefore, one should maximize γ_{i,α} (or, equivalently, sensitivity × efficiency) when optimizing the pool size for a group testing protocol in real-world decision making. Table 4 compares the optimal naive pooling and correlated pooling policies (obtained by choosing a pool size that maximizes sensitivity × efficiency) under different prevalence levels. The last column of Table 4 shows the reduction in the minimum test consumption required for epidemic control under the optimal correlated pooling policy relative to the optimal naive pooling policy. For example, when prevalence is 1% and we only consider correlation among samples from the same household, a pool size of 12 is optimal for both naive pooling and correlated pooling in terms of maximizing sensitivity × efficiency. Using Equation 8, we derive that, compared to the optimal naive pooling policy, the optimal correlated pooling policy uses (1/4.56 − 1/5.23) / (1/4.56) = 12.9% fewer tests. We argue that such a difference has a substantial impact on policy-making in real-world scenarios. As illustrated earlier, correlation in infection statuses arises from interaction within social groups such as households, schools, and offices, and is preserved to some extent in pools. Hence, policies informed by analyses that ignore correlation tend to be too pessimistic: the predicted test consumption overestimates what is actually needed.
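The savings arithmetic above can be reproduced directly; the values 4.56 and 5.23 are those quoted in the example, and the slight discrepancy (12.8% vs. 12.9%) comes from using these rounded inputs.

```python
# Reproduces the test-savings arithmetic above. gamma is the expected number
# of positives identified per PCR test, so 1/gamma proxies minimum test
# consumption; 4.56 and 5.23 are the values quoted in the example.
gamma_naive, gamma_corr = 4.56, 5.23
savings = (1 / gamma_naive - 1 / gamma_corr) / (1 / gamma_naive)
# Algebraically identical to 1 - gamma_naive / gamma_corr.
print(f"{savings:.1%}")  # ~12.8% with these rounded inputs (12.9% with unrounded ones)
```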
We describe two possible resulting scenarios below:
• The available testing capacity of the city meets the minimum test consumption required by the optimal correlated pooling strategy but not the optimal naive pooling strategy. Assuming naive pooling, the policy maker decides that no screening policy can permit safe reopening and thus issues a lockdown. However, had the policy maker taken into account the existence of within-pool correlation, he/she could have safely reopened the economy with a feasible screening policy.
• The available testing capacity of the city meets the minimum test consumption required by the optimal naive pooling strategy, so the policy maker decides to reopen. However, since naive pooling underestimates the actual efficiency, the policy maker chooses a lower screening frequency than the available testing capacity allows. Had the policy maker accounted for correlation, he/she could have picked a higher screening frequency and achieved better epidemic mitigation.
Furthermore, if the naturally induced within-pool correlation is weak, explicit measures can be taken to facilitate correlated pooling. For example, one can mandate that individuals from the same household get tested together so that their samples can be placed in the same pool without much logistical difficulty. For a city with limited resources, such measures could enable safe reopening with population-wide screening that might otherwise be infeasible.

In this paper, we proved that under a general correlation structure in the population and other mild assumptions, for the same pool size, correlated pooling achieves higher sensitivity than naive pooling and consumes comparable or fewer tests per positive identified. We used numerical experiments to quantify the advantage of correlated pooling over naive pooling in both sensitivity and efficiency and substantiated its real-world implications for epidemic control.
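The mechanism behind the sensitivity result can be illustrated with a toy Monte Carlo sketch; this is not our simulation model: the household-pair structure, the Uniform(0, 2) viral loads, and the sensitivity function p(v) = min(1, v) are illustrative assumptions chosen so that the per-sample marginals match between the two strategies.

```python
import random

def pool_sensitivity(correlated, n=10, alpha=0.05, trials=20000, seed=0):
    """Expected fraction of positive samples whose pool tests positive, in a
    toy model: each positive contributes a Uniform(0, 2) viral load, the pool
    is diluted n-fold, and the pooled test fires with probability
    p(v) = min(1, v), an increasing sensitivity function."""
    rng = random.Random(seed)
    detected = positives = 0.0
    for _ in range(trials):
        if correlated:
            # positives arrive in household pairs placed in the same pool,
            # so the per-person marginal infection probability is still alpha
            status = []
            for _ in range(n // 2):
                s = rng.random() < alpha
                status += [s, s]
        else:
            # naive pooling: i.i.d. infection statuses with the same marginal
            status = [rng.random() < alpha for _ in range(n)]
        s_count = sum(status)
        if s_count == 0:
            continue  # pools with no positives contribute to neither count
        vbar = sum(rng.uniform(0, 2) for _ in range(s_count)) / n
        p_detect = min(1.0, vbar)
        detected += s_count * p_detect  # expected positives caught at pool stage
        positives += s_count
    return detected / positives
```

Because a correlated pool that contains any positive tends to contain at least two, its conditional pooled viral load is higher, and the estimated sensitivity exceeds that of naive pooling under the same marginals.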
Our work can be extended in several directions in future research. First, we focused our study on correlated pooling in the standard two-stage Dorfman procedure. However, in other group testing protocols, such as hierarchical and combinatorial group testing, each sample is assigned to multiple pools, and within-pool correlation structures are more complex in these settings. Second, studies have shown that a simple SIR model and a similarly configured stochastic compartmental model can behave differently (Koopman et al. 2002). Hence, it would be interesting to explore the effect of correlation in a stochastic compartmental model that incorporates social dynamics and transmission within the network. One might also consider studying the heterogeneity in infection time, infection duration, and viral load across individuals, which imposes a more complicated correlation structure. An agent-based simulation model may be a good approach for this. Third, it would be meaningful to incorporate sampling noise, under which the sample viral load could be zero for an infected individual. The additional transmission due to undetected individuals may counteract the benefits offered by correlated pooling, and such a consideration is of practical interest for large-scale epidemic control. This could be addressed using latent variable models.

Proof of Theorem 1. For i = 0, 1, the overall false negative rate is given by Equation EC.1. In both correlated pooling and naive pooling, all V_j's are identically distributed by Proposition 1. Hence, we obtain β_{0,α} for naive pooling, where the V_i's are i.i.d. Similarly, we derive β_{1,α} for correlated pooling following Equation EC.1. When the number of positive samples in the pool is at least 2, we have V̄_n > (1/n)V_1 because there exists at least one j ≠ 1 such that V_j > 0. Assuming p(v) is a monotone increasing function in v, we obtain p(V̄_n) ≥ p((1/n)V_1), which, combined with p(V_1) > 0 given V_1 > 0, implies that A ≥ A_1.
Therefore, taking α → 0⁺ gives the desired inequality. The inequality is strict if p(v) is strictly increasing in v and Condition 1 holds: Condition 1 implies that there exists some number of positives ≥ 2 for which the corresponding limit as α → 0⁺ is nonzero.

Proof of Theorem 2. We first derive η_{0,α} for naive pooling. By arguments similar to those in the proof of Theorem 1, the denominator of η_{0,α} is given by Equation EC.2 and the numerator by Equation EC.3. By the definition of η_{0,α} and Equations EC.2 and EC.3, taking α → 0⁺ gives the limit of η_{0,α}. Then, we derive η_{1,α} for correlated pooling (both terms in the denominator are nonzero because p(v) > 0 for all v > 0). Lower-bounding Equation EC.5 by n and using Equation EC.8 gives the desired result.

Proof of Corollary 1. We apply the threshold sensitivity function to the calculation of lim_{α→0⁺} η_{0,α} and η_{1,α}. In Equation EC.5, the first term of the numerator inside the parentheses implies lim_{α→0⁺} η_{0,α} = n. In Equation EC.8, the numerator of the last term implies η_{1,α} ≤ n.

We give an example of a sensitivity function and viral load distribution under which correlated pooling has lower test efficiency, contrary to claims in the literature. Through this example, we also show the necessity of the bound in Theorem 2. Consider a piecewise constant sensitivity function p with thresholds θ_1, θ_2 ∈ (0, 1), θ_1 < θ_2. Consider a correlated pool consisting of two samples with the joint viral load distribution given in Table EC.1. By Assumptions 1 and 2, the corresponding naive pool contains two samples whose viral loads are independent with the same marginal distribution as that in Table EC.1. We set α = 1%. Recall that, for i = 0, 1, referring to naive and correlated pooling respectively, and prevalence α, we have the following notation:
• β_{i,α} is the overall FNR of the pooling strategy;
• η_{i,α} is the number of followup tests consumed per positive identified;
• efficiency_{i,α} is the number of individuals screened per test consumed.
By Equation 3, efficiency is also given in terms of these quantities. To derive efficiency, we first calculate β_{i,α} and η_{i,α} for i = 0, 1. Using Equation EC.1, we derive the overall FNR for naive and correlated pooling. Second, we derive η_{i,α} for i = 0, 1 from its definition. We find that (case 2) β_{1,α} < β_{0,α} and η_{1,α} < η_{0,α}; this occurs when (θ_1, θ_2) falls outside region 'B' in Figure EC.1b.¹⁰ Furthermore, taking α to the limit of zero, we observe that the region where lim_{α→0⁺} η_{1,α}/η_{0,α} > 1 resembles region 'B' in Figure EC.1b, where α takes the value 0.01. This necessitates the (1 + δ) bound in Theorem 2.

⁹ Though not depicted, a small area at the top of region 'B' (where θ_1 and θ_2 are both close to zero) corresponds to β_{1,α} > β_{0,α}, both of which are close to 1. This does not violate Theorem 1, which considers the asymptotic scenario.
¹⁰ If correlated pooling has lower efficiency, i.e., η_{1,α}(1 − β_{1,α}) > η_{0,α}(1 − β_{0,α}) in Equation EC.9, it is not possible to have both η_{1,α} < η_{0,α} and β_{1,α} ≥ β_{0,α}.

The cdf of the viral load of this sample is derived as follows. Suppose the correlated pool being studied is the Jth of the |A| correlated pools. Because the correlated pool we are studying is chosen uniformly at random from the |A| pools, P^{(N)}(J = j′) = 1/|A| for all j′ = 1, …, |A|. Now consider an arbitrary individual from this pool, and suppose this individual is the i′th of the pool. Recall that we reordered the samples in each pool by performing an independent random permutation of 1 through n, denoted by π. Then, the index of this individual before the permutation is uniformly distributed, taking each value 1, …, n with probability 1/n. Let V_{(j′,i′)} denote the viral load of the i′th sample in the j′th pool before reordering, for each i′, j′. Then, the cdf of the sample viral load of this arbitrary individual from the correlated pool is obtained by averaging over j′ and i′, where the last equality follows from the observation that this double sum is equivalent to summing over all individuals in {1, …, N}.
This is identical to the cdf of the viral load of an individual chosen uniformly at random from the naive pool.

We define a measure of association between the viral loads of one individual and a group of individuals. Consider a collection of individuals j, whose population indices are denoted {j_1, …, j_{|j|}}. For a population of size N, we define the cumulative distribution function for the viral load v of individual i ∈ [N]\j conditioning on the viral loads z ∈ R^{|j|} of the individuals in j. Based on this, we define a measure of association Δ_α^{(N)}(i, j) between the viral loads of i and j.

e-companion to Wan, Zhang, and Frazier: Correlation Improves Group Testing

This quantity is the maximum change in the cdf of i's viral load that can be created by varying the viral loads of j. It reflects the degree to which conditioning on the viral loads of j affects the viral load of i. A larger Δ_α^{(N)}(i, j) indicates a stronger association between i and j. The collection of individuals having association with j stronger than ε is defined accordingly, and we denote by m_α^{(N)}(ε) the maximum size of such sets across any collection j of at most n − 1 individuals. When m_α^{(N)}(ε) is small relative to N, an individual i added to the pool, chosen uniformly from the larger population, is unlikely to be in a set with high association Δ_α^{(N)}(i, j) > ε with the individuals already in the pool. This makes the viral loads in the pool unlikely to be strongly correlated. Recalling that the pool size is n, we now take N to the asymptotic regime and make the following assumption.

Assumption EC.1. There exists a sequence ε_N ↓ 0 such that lim_{N→∞} m_α^{(N)}(ε_N)/N = 0.

Assumption EC.1 prescribes that, as the population size N goes to infinity, for any collection j of fewer than n individuals, the set of individuals that have association stronger than ε_N with j grows sublinearly in the population size.
In an epidemic like COVID-19, transmission typically takes place between close contacts (World Health Organization 2020). It is reasonable to assume that, for two individuals to have associated infection statuses, they must be within a few degrees of contact of each other. Since the duration of the infectious period is finite, and a person's contact rate is typically bounded above by a constant (Hu et al. 2013) even as the population size grows large, the number of people connected to an individual in j via a few degrees of contact grows sublinearly in the population size. Hence, this assumption is well-justified.

Proof of Proposition 2. Let random variables [1], [2], …, [n] be the population indices of the individuals placed into this randomly chosen naive pool J. We use superscript (N) and subscript α to index quantities computed for a size-N population at prevalence α. To prove the proposition, we want to show that the joint cdf of viral loads in a naive pool factors into a product of cdfs of individual viral loads as N → ∞. Let v ∈ R^n_{≥0}. We first use the law of conditional probability to expand the joint cdf (EC.14). To analyze the conditional probability in the second term, we first make the following claim: for all j ⊂ {1, …, N} with |j| = n − 1 and i ∉ j, Inequality EC.15 holds. To prove the claim, we first apply the law of iterated expectations to the second term on the left-hand side of Equation EC.15, then expand and bound the first conditional probability in Inequality EC.16. The resulting absolute difference is therefore bounded by a constant, which does not change when the expectation over z is taken. This proves the claim. Claim EC.15 enables a closer analysis of the conditional probability in Equation EC.14. Using the law of iterated expectations, where we condition on [1], [2], …, [n] (hereafter abbreviated as [1 : n]), we consider two cases for the expectation in the second term.
For any ε > 0, Δ_α^{(N)}([n], [1 : n − 1]) is either less than ε, or at least ε but upper bounded by 1. Plugging this result back into Equations EC.18 and EC.17, we obtain Bound EC.19 for each ε > 0. We can apply Bound EC.19 to iteratively decompose and bound the full joint cdf in Equation EC.14. Let ε_N be a sequence satisfying Assumption EC.1, i.e., ε_N ↓ 0 and lim_{N→∞} m_α^{(N)}(ε_N)/N = 0, and take the limit N → ∞ of the expression above. Similarly, we can use the other direction of Inequality EC.15 to derive a lower-bound counterpart to Inequality EC.17. Applying Inequality EC.20 to Equation EC.14, we derive a lower bound for the joint cumulative distribution function. For the same sequence ε_N satisfying Assumption EC.1, the lower and upper bounds coincide in the limit. Therefore, as N → ∞, the viral loads of samples in a naive pool are asymptotically independent.

Proof of Proposition 3. For succinctness, we abbreviate the probability operator P^{(N)}_{1,α}(·) and the expectation operator E^{(N)}_{1,α}[·] as P(·) and E[·] in Appendix C.4. For a generic pool j ∈ {1, …, |A|}, let I(j) be the sample in pool A_j with nonzero infection probability and the smallest population index, I(j) = min{i : P(V_i > 0) > 0, i ∈ A_j}. If such a sample does not exist in A_j, then I(j) = ∞. Let C_{I(j)} denote the set of I(j)'s close contacts and K(j) denote an individual selected uniformly at random from C_{I(j)}. Let S_j = Σ_{i∈A_j} 1{V_i > 0}. Since the pooling assignment A is a random variable, A_j, I(j), and C_{I(j)} are all random. We make the following observation: if sample I(j) is positive, sample K(j) is positive, and K(j) is also in pool j, then pool j must contain more than one positive.
Therefore, P(S_j > 1) = P(S_j > 1 | I(j) < ∞) · P(I(j) < ∞), since the pooling assignment is assumed to be independent of viral loads. We generalize this result to a pool J selected uniformly at random from all pools. On the other hand, for a fixed pooling assignment A, the probability that a generic pool j contains one or more positives can be bounded above, since viral load does not depend on the pooling assignment. We then generalize the result in Equation EC.22 to a pool J selected uniformly at random from all pools and all pooling assignments. Combining Equations EC.21 and EC.23 yields a lower bound that is a positive constant not depending on α. This proves the proposition.

We argue that the metrics investigated in Section 3 (β_{i,α}, γ_{i,α}, and η_{i,α}) are appropriate for evaluating a group testing protocol in the population-wide screening context. These metrics were defined in terms of the joint distribution of quantities associated with a single pool selected uniformly at random. We show that, as the population grows large, population-level quantities of interest converge in probability to these metrics. Specifically, we show that the fraction of positives missed across the whole population converges to its pool-level counterpart, as long as E_{i,α}[S] > 0 (which holds because α > 0). Hereafter in Appendix C.5, we drop the subscripts i = 0, 1 and α because our arguments apply to both naive and correlated pooling and to all prevalences α > 0. Recall that P^{(N)} indicates the probability distribution over quantities when the population size is N, under which the number of pools is N/n, S_j = Σ_{i∈A_j} 1{V_i > 0} is the number of positive samples in pool j, and D_j is the number of positives identified in pool j. We make the following assumption.

Assumption EC.2. Under any P^{(N)}, for any j = 1, …, N/n, conditioned on S_j, D_j is independent of all other S_{j′} with j′ ≠ j.
Under any P^{(N)}, for any s = 0, …, n and j = 1, …, N/n, the D_j | S_j = s are i.i.d. with mean d_s for some constant d_s ∈ [0, n] that does not depend on N. Assumption EC.2 is based on an implicit assumption that the viral load of any individual is independent of the viral loads of all other individuals given his/her infection status. It is a simplification of the general correlation model of viral loads across the population. However, this assumption is mild, because disease progression and peak viral load across different individuals are determined by individual biological responses to the virus, which we consider independent. We also believe that the result in this section would hold when Assumption EC.2 is violated but the conditional dependence of D_j given S_j across pools j vanishes asymptotically for pools collected from disparate parts of the overall population. To establish this, we conjecture that it would suffice to replace the law of large numbers used in this section with a version that allows for weak dependence, e.g., Bernstein's theorem in Cacoullos (2012).

Now let L_s be the number of pools in the entire population with s positives in the pool. Let D̄_s be the average number of positives detected per pool among pools with s positives in the population (EC.24), and let D̄ be the average number of positives detected per pool in the population. We assume that for each s = 0, …, n, the fraction of pools with s positives stabilizes as the population size grows large.

Assumption EC.3. For each s = 0, …, n, the distribution of L_s/|A| under P^{(N)} converges in probability to ℓ_s as N → ∞ for some constant ℓ_s ∈ [0, 1].¹¹

Although we assumed earlier in Section 4.2 that the viral loads in a randomly chosen pool have a limiting joint distribution, this alone does not imply Assumption EC.3.
For example, if strong correlation exists across all the pools in the population, L_s/|A| may not converge to a single constant. However, we consider this unlikely in reality. Correlation in infections typically does not extend across the entire population, because social interactions and infectious periods are bounded above, limiting the number of secondary infections a source case can produce. Therefore, the effect of inter-pool correlation on L_s diminishes as the population size grows to infinity. Hereafter, we use →_p to denote convergence in probability under P^{(N)} as N → ∞. (EC.29)

¹¹ We note that the constants {ℓ_s}_{s=0}^n, representing the allocation of positive samples across the pools, are in general different for naive pooling and correlated pooling, though we do not make this distinction here.

Plugging Inequality EC.29 into Equation EC.28, we obtain the desired bound. Since ℓ_s/2 > 0, the conclusion follows by Assumption EC.3 and the definition of convergence in probability (Billingsley 1995). (EC.31)

In Equation EC.31, f_{μ_k,σ_k} and F_{μ_k,σ_k} denote the probability density function and cumulative distribution function of the kth component with mean μ_k and standard deviation σ_k, respectively. (Here, π_k, μ_k, and σ_k are the weight, mean, and standard deviation of the kth component.) The censoring threshold d_cens represents the limit of detection of the PCR assay, such that a sample with C_t value exceeding it is not observed. Note that the authors fit a censored GMM to account for the detection limit of PCR tests, such that samples with too high a C_t value are not reported in the data. The associated uncensored GMM represents the true C_t distribution of the entire population, including those that may not be detected through individual PCR tests. Moreover, since the C_t value is a measurement of the viral load, and viral load is the quantity directly of interest to our simulation, we use a formula given in Jones et al.
(2020) to convert this distribution to that of log₁₀ viral load (copies/mL)¹²: log₁₀ VL = log₁₀(1.105 × 10¹⁴ · e^{−0.681 C_t}) = (14 + log₁₀ 1.105) − (0.681/ln 10) · C_t. This results in a GMM on the log₁₀ viral load with the parameters shown in Table EC.

The first step in a pooled PCR test is the collection of samples from each subject. For SARS-CoV-2 testing, the most common sample types include nasopharyngeal swabs, anterior nares swabs, and saliva. We assume the raw volume of the samples is the same across all subjects, denoted by V_sample. (Nasopharyngeal and anterior nares swabs can be transported in a fixed amount of viral transport media; saliva samples, whether self-collected or not, can require a prescribed volume.) Once the n samples are collected, they are transported to the lab to be prepared for pooling. Let V_i denote the viral load (i.e., the number of viral RNA copies per unit volume) of the ith sample in the pool. If the ith sample is negative, then V_i = 0. A pipetting robot fetches a volume of V_subsample from each sample for pooling, so that the number of RNA copies selected for pooling is N_i for the ith sample. We assume that, compared to an individual test, pooling reduces the subsampling volume by a multiplicative factor of n. (That is, the n subsamples, when pooled together, have the same volume as an individual test at the same step.) Then, all n subsamples are pooled together and go through an RNA extraction step using glass fiber plates. Assuming that each RNA copy attaches to the glass fiber plates independently with probability ξ, the number of eluted RNA copies used as templates entering the PCR machine follows a binomial distribution M ∼ Binom(Σ_{i=1}^n N_i, ξ). Aggregating the binomial subsampling in these steps, we find that M again follows a binomial distribution. Finally, we assume the PCR test has a detection threshold τ, a positive integer, such that if M ≥ τ, the test returns a positive result; otherwise, it returns a negative result.¹⁴
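The subsampling-and-elution steps above can be sketched as binomial thinning; here `copies[i]` stands for the RNA copies in the volume an individual test would draw from sample i, and the factor 1/n reflects the n-fold reduction in subsampling volume under pooling (the function name and the default values of ξ and τ are illustrative assumptions).

```python
import random

def pooled_pcr_positive(copies, n, xi=0.7, tau=5, seed=0):
    """Toy sketch of the pooled-test model above. copies[i] is the number of
    viral RNA copies in the volume an individual test would draw from sample i;
    pooling divides that subsampling volume by n, each selected copy is eluted
    independently with probability xi, and the test is positive iff at least
    tau template copies enter the PCR reaction."""
    rng = random.Random(seed)
    # subsampling (1/n) and elution (xi) compose into one binomial thinning,
    # per identities (i) and (ii) in the footnote
    p = xi / n
    m = sum(sum(rng.random() < p for _ in range(c)) for c in copies)
    return m >= tau
```

A pool of all-negative samples can never test positive, while a single strongly positive sample (e.g., 10,000 copies) is detected with near certainty despite 10-fold dilution.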
(As a result, a negative sample is always classified as negative.)

¹³ The proof of this relation is straightforward, based on two identities: (i) if X_i ∼ Binom(n_i, p) are independent, then Σ_i X_i ∼ Binom(Σ_i n_i, p); (ii) if X ∼ Binom(n, p) and Y | X ∼ Binom(X, q), then Y ∼ Binom(n, pq).
¹⁴ The detection threshold τ is not to be confused with the limit of detection (LoD), i.e., the lowest concentration of the target (in copies per volume) that a PCR assay can detect at least 95% of the time (Burns and Valdivia 2008). In our model, a higher τ corresponds to a higher LoD. The way we model the subsampling steps using binomial random variables captures the randomness associated with the definition of the LoD.

For succinctness, we abbreviate the probability operator P_{1,α}(·) and the expectation operator E_{1,α}[·] as P(·) and E[·] in Appendix E. We rely on two conditional independence assumptions, discussed previously in Sections 3.1 and 5.1, to derive the upper bound δ̄ for δ, which we formulate again below.

Assumption EC.4. For all i = 1, …, n, W_i is independent of {V_j}_{j≠i} and {W_j}_{j≠i} given V_i.
Assumption EC.5. For all i = 1, …, n, V_i is independent of {V_j}_{j≠i} given E_i, where E_i = 1{V_i > 0}.

Assumptions EC.4 and EC.5 also imply a sequence of conditional independence results, which we use in the derivation of an upper bound for δ in Appendix E.2. First, we show that Assumption EC.4 implies a weaker conditional independence relation (Lemma EC.1). Proof of Lemma EC.1. Starting from the joint conditional density and repeating the same calculation n − 1 times, the claimed factorization follows. Then, we derive a similar conditional independence relation: {V_i}_{i=1}^n are independent given {E_i}_{i=1}^n. To see this, we first note that, by the definition of independence, it immediately follows from Assumption EC.5 that given E_i, V_i is also independent of the indicators E_j for j ≠ i. Lemma EC.2.
For all i = 1, …, n, V_i is conditionally independent of {E_j}_{j≠i} given E_i. Lemma EC.2, together with Assumption EC.5, implies that, given all indicator variables, the viral loads are independent (Lemma EC.3). Proof of Lemma EC.3. The proof technique is the same as that of Lemma EC.1. Starting from the joint conditional density, we have that

f(v_{1:n} | e_{1:n}) = f(v_{1:n}, e_{2:n} | e_1) / f(e_{2:n} | e_1)
= f(v_1 | e_1) f(v_{2:n}, e_{2:n} | e_1) / f(e_{2:n} | e_1)   (by Assumption EC.5 and Lemma EC.2)
= f(v_1 | e_1) f(v_{2:n} | e_{1:n})
= …   (repeating the above calculation n − 1 times).

Hence, given E_{1:n}, the viral loads V_1, …, V_n are independent. It follows from Lemmas EC.1 and EC.3 that the pairs (V_i, W_i), i = 1, …, n, are also conditionally independent given E_{1:n} (Lemma EC.4). Proof of Lemma EC.4. We consider the joint conditional density of (V_{1:n}, W_{1:n}) given E_{1:n}: it factors into terms f(w_i, v_i | e_{1:n}) by Lemmas EC.1 and EC.3, with f(v_i | e_{1:n}) = f(v_i | e_i) by Assumptions EC.4 and EC.5. We are done.

To bound δ from above, we provide upper and lower bounds for the terms in the numerator and denominator of Equation EC.32, respectively. We start by proving an upper bound for the second term in the numerator; it also implies that P(S_D > 0 | S > 0) ≥ P(S_D > 0 | S = 1) for the second term in the denominator.

Proposition EC.1. P(S_D = 0 | S > 0) ≤ P(S_D = 0 | S = 1).

Proof of Proposition EC.1. We consider P(S_D = 0 | S = k) for any k ∈ {1, 2, …, n}. Since S_D = Σ_{i=1}^n W_i, the probability factors over i = 1, 2, …, n, and each factor can be bounded using β̄. Since β̄ ∈ [0, 1], we find P(S_D = 0 | S = k) ≤ P(S_D = 0 | S = 1) for all k ∈ {1, 2, …, n}. By the law of iterated expectations, it follows that P(S_D = 0 | S > 0) ≤ P(S_D = 0 | S = 1).

Second, we provide a lower bound for the first term in the denominator of Equation EC.32. To achieve this, we characterize a first-order stochastic dominance relation, given in Lemma EC.5. Proof of Lemma EC.5. Recall that W_i ∼ Ber(p(V_i)), where p(·) : R_{≥0} → [0, 1] is monotone increasing.
By Bayes' rule and the monotonicity of p(v), the result follows: if P(V_i ≥ v) = 1, the inequality holds trivially; otherwise, monotonicity of p(v) gives the comparison. We are done.

Proof of Proposition EC.2. We consider P(Y = 1 | S_D = k, S = s) for any 0 ≤ k ≤ s ≤ n and show that it is increasing in both k and s. To derive the inner expectation, we study the joint conditional density of V_1, …, V_n given W_{1:n} and E_{1:n}:

f(v_{1:n} | w_{1:n}, e_{1:n}) = f(v_{1:n}, w_{1:n} | e_{1:n}) / f(w_{1:n} | e_{1:n}).

Hence, given W_{1:n} and E_{1:n}, the {V_i}_{i=1}^n are independent, with the distribution of V_i given by V_i | W_i, E_i. Since V_1, …, V_n are identically distributed, the {V_i | W_i = 1, E_i = 1}_{i=1}^n and {V_i | W_i = 0, E_i = 1}_{i=1}^n are also identically distributed, respectively. Denote the distributions of V_i | W_i = 1, E_i = 1 and V_i | W_i = 0, E_i = 1 by F_{V|W=1} and F_{V|W=0}, respectively. Then Σ_{i=1}^n V_i is the sum of S_D i.i.d. random variables with distribution F_{V|W=1} and S − S_D i.i.d. random variables with distribution F_{V|W=0}. That is, the distribution of Σ_{i=1}^n V_i depends on {E_i}_{i=1}^n and {W_i}_{i=1}^n only through their respective sums, S and S_D. Hence, since p(v) is monotone increasing, P(Y = 1 | S_D = k, S = s) is increasing in s. Moreover, since F_{V|W=1} first-order stochastically dominates F_{V|W=0} by Lemma EC.5, P(Y = 1 | S_D = k, S = s) is also increasing in k. This completes the proof.

In this section, we provide a point estimate and 95% confidence interval for δ̄ under different pool sizes and detection thresholds, and show that δ̄ is consistently small under various conditions. Below we describe the methodology in detail. We use Monte Carlo simulation to estimate P(Y = 1 | S_D = 0, S = n) and P(Y = 1 | S_D = S = 1) separately. Let V_1, …, V_n be i.i.d. with distribution F_{V|W=0}, the distribution of V_i | W_i = 0, E_i = 1.
Then, as shown in the proof of Proposition EC.2, X = P(Y = 1 | V_{1:n}) = p((1/n) Σ_{i=1}^n V_i) is an unbiased estimator of P(Y = 1 | S_D = 0, S = n), i.e., P(Y = 1 | S_D = 0, S = n) = E[X]. To sample from F_{V|W=0}, we first sample V from V | V > 0, the viral load distribution described in Table EC.3, then sample W ∼ Ber(p(V)). We keep the sampled V if the sampled W equals zero and discard V otherwise. We generate B = 10^6 samples X_1, …, X_B for estimating P(Y = 1 | S_D = 0, S = n). Similarly, let V ∼ F_{V|W=1}, where F_{V|W=1} is the distribution of V_i | W_i = 1, E_i = 1. Then Z = P(Y = 1 | V, 0, …, 0) = p(V/n) is an unbiased estimator of P(Y = 1 | S_D = S = 1), i.e., P(Y = 1 | S_D = S = 1) = E[Z]. Sampling from F_{V|W=1} follows a procedure similar to sampling from F_{V|W=0}. We generate B = 10^6 samples Z_1, …, Z_B for estimating P(Y = 1 | S_D = S = 1). Hence, the point estimate for δ̄ is the ratio of the two sample means. To provide a confidence interval for δ̄, we first find confidence intervals for E[X] and E[Z] separately. We derive the confidence interval for E[Z] based on the central limit theorem: using the normal approximation, the q = 99.99% confidence interval for E[Z] is [L_Z, U_Z] = [Z̄ − 3.891 · σ_Z̄, Z̄ + 3.891 · σ_Z̄]. On the other hand, E[X] is close to zero in the regime we consider, and the samples X_i can differ by several orders of magnitude. Thus, instead of using the normal approximation, we employ bootstrapping (Efron and Tibshirani 1993) with 10^4 replications to construct the (95/q)% confidence interval for E[X], denoted [L_X, U_X]. Because the samples X_i and Z_i are independent, the Cartesian product [L_X, U_X] × [L_Z, U_Z] is a (95/q) · q% = 95% confidence region for (E_{1,α}[X], E_{1,α}[Z]). It follows that [L_X/U_Z, U_X/L_Z] (assuming 0 < L_Z ≤ U_Z and 0 ≤ L_X ≤ U_X) is a 95% confidence interval for δ̄.
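The interval construction above can be sketched generically as follows; the data here are synthetic stand-ins, and for simplicity both component intervals are taken at the 95% level rather than the 99.99% and (95/q)% split used above.

```python
import random
import statistics

def ratio_ci(xs, zs, boot=2000, seed=0):
    """Sketch of the interval construction above (simplified: both component
    intervals at 95%): a bootstrap percentile interval for E[X], a normal-
    approximation interval for E[Z], combined into [L_X/U_Z, U_X/L_Z] as a
    conservative interval for the ratio E[X]/E[Z]."""
    rng = random.Random(seed)
    # bootstrap percentile interval for E[X] (heavy-tailed, near zero)
    boot_means = sorted(
        statistics.fmean(rng.choices(xs, k=len(xs))) for _ in range(boot)
    )
    lx, ux = boot_means[int(0.025 * boot)], boot_means[int(0.975 * boot) - 1]
    # normal-approximation interval for E[Z]
    mz = statistics.fmean(zs)
    se = statistics.stdev(zs) / len(zs) ** 0.5
    lz, uz = mz - 1.96 * se, mz + 1.96 * se
    return lx / uz, ux / lz

# synthetic stand-ins: X is rare and small (mean 0.001), Z has mean 0.5,
# so the true ratio is 0.001 / 0.5 = 0.002
xs = [0.0] * 90 + [0.01] * 10
zs = [0.4 + 0.2 * (i % 2) for i in range(100)]
lo, hi = ratio_ci(xs, zs)
```

The bootstrap is used for E[X] because its samples concentrate near zero and differ by orders of magnitude, which breaks the normal approximation that remains adequate for E[Z].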
Note: US±1 and US±2 are household-size distributions in which weight 0.075 and 0.15, respectively, is shifted between household size 1 and the household sizes > 1 (allocated uniformly across the latter). For example, US+1 has weight 0.284 − 0.075 on households of size 1, weight 0.345 + 0.075/5 on households of size 2, weight 0.151 + 0.075/5 on households of size 3, etc.

Here we demonstrate that the advantage of correlated pooling over naive pooling is robust to deviations in parameter values from the baseline setting. In each plot, we show the performance of naive and correlated pooling when varying the value of a single parameter and fixing the other parameters to their baseline values. In all plots, correlated pooling consistently outperforms naive pooling in terms of both sensitivity and efficiency. Figure EC.2a shows that smaller prevalence leads to lower sensitivity but higher efficiency. This is due to the existence of fewer positive samples in a positive pool, which results in a larger FNR because of the dilution effect. Smaller prevalence also implies fewer positive pools, leading to fewer followup tests and therefore higher overall efficiency. Figure EC.2b shows that a larger pool size typically implies a stronger dilution effect, which causes sensitivity to decline. Efficiency initially increases with pool size because, for smaller pools, the number of pooled tests is the dominant factor determining efficiency. On the other hand, a larger pool (e.g., size 24) is more likely to contain a positive, which requires more individual tests once the pool tests positive. This causes efficiency to decline for larger pools. In Figure EC.2c, sensitivity decreases and efficiency increases as the population-average individual-test FNR, β̄, rises. A higher β̄ also implies a higher FNR of the pooled test, which explains the drop in sensitivity.
Efficiency increases because a higher detection threshold causes more cases to be missed by the pooled tests, so fewer followup tests are required. In Figure EC.2e, changing the household size distribution does not affect the performance of naive pooling, but it does affect that of correlated pooling. Under household size distributions that place more weight on larger household sizes (e.g., CN, US+1, US+2), positive pools under correlated pooling tend to contain more positive samples, which improves both sensitivity and efficiency. The above sensitivity analyses are based on the baseline setting; we expect analyses based on other parameter settings to show similar patterns.

References for the Appendices

Billingsley, P. Probability and Measure.
Group testing as a strategy for COVID-19 epidemiological monitoring and community surveillance.
Modelling the limit of detection in real-time quantitative PCR.
Exercises in Probability.
Distribution of U.S. households by size 1970-2020.
Efron, B., Tibshirani, R. J. (1993) An Introduction to the Bootstrap.
Hoeffding, W. Probability inequalities for sums of bounded random variables.
The scaling of contact rates with population density for the infectious disease models.
Australia community profile.
Households in France.
An analysis of SARS-CoV-2 viral load by patient age.
US Food and Drug Administration (2020) SARS-CoV-2 reference panel comparative data.
van der Vaart, A. W. Asymptotic Statistics.
World Health Organization (2020) Modes of transmission of virus causing COVID-19: implications for IPC precaution recommendations: scientific brief.

The authors are grateful to Diego Diel and Jeff Pleiss for conversations on the implementation details of PCR tests. The authors also thank Stephen Chick, Saskia Comess, Claire Donnat, and Susan Holmes for providing valuable feedback. This work was conducted with support from Cornell University while the authors served on the Cornell COVID-19 mathematical modeling team. Additional support was provided by the Air Force Office of Scientific Research under award FA9550-19-1-0283. J.W. and Y.Z. contributed equally to this paper.