key: cord-0125557-5nnn3rl4
authors: Malinovsky, Yaakov; Albert, Paul S.
title: Nested Group Testing Procedures for Screening
date: 2021-02-06
journal: nan
DOI: nan
sha: 33c074a591d7fec19ecfd8850d98226d13fc0b31
doc_id: 125557
cord_uid: 5nnn3rl4

This article reviews a class of adaptive group testing procedures that operate under a probabilistic model assumption as follows. Consider a set of $N$ items, where item $i$ has the probability $p$ ($p_i$ in the generalized group testing) to be defective, and the probability $1-p$ to be non-defective independent from the other items. A group test applied to any subset of size $n$ is a binary test with two possible outcomes, positive or negative. The outcome is negative if all $n$ items are non-defective, whereas the outcome is positive if at least one item among the $n$ items is defective. The goal is complete identification of all $N$ items with the minimum expected number of tests.

In the last few months of the COVID-19 pandemic, the mostly-forgotten practice of group testing has been raised again in many countries as an efficient method for addressing an epidemic while facing restrictions of time and resources. During this very short period, numerous publications and reports have appeared in both scientific and non-scientific journals. The reader can easily see them, for example, in the Washington Post, NY Times, Scientific American, Science Advances, medRxiv, bioRxiv, ArXiv, and elsewhere.

The story goes back to 1943, when Robert Dorfman published a manuscript where he introduced the concept of group testing in response to the need to administer syphilis tests to millions of individuals drafted into the U.S. Army during World War II (Dorfman, 1943) .

In a group of size k ≥ 2, the total number of tests is 1 with probability q k (q = 1 − p) and k + 1 with probability 1 − q k . Therefore, the expected number of tests per person in Procedure D is E D (k, p) = 1 − q k + 1 k , for k ≥ 2; and equals 1 otherwise. Dorfman numerically found the optimum group size k when assuming that a population is large (infinite) and p is fixed. For example, if p = 0.01, then the optimal group size is 11. Specifically, this means that for testing N = 999, 999 individuals, we need only 195, 571 tests in expectation. Samuels (1978) showed that an optimal value of k, k * D (p) is a non-increasing function of p, which is 1 for p > 1 − 1/3 1/3 ≈ 0.31 and otherwise is either 1

There is a logical inconsistency in Procedure D. It is clear that any "reasonable" group testing plan should satisfy the following property: "A test is not performed if its outcome can be inferred from previous test results" (Ungar, 1960) . Procedure D does not satisfy this property since if the group is positive and all but the last person are negative, the last person is still tested. The modified Dorfman procedure, which we define as D ′ , would not test the last individual in that case (Sobel and Groll, 1959) . Also, Procedure D ′ will be preferred over individual testing if p is below Ungat's cut-off point (Malinovsky and Albert, 2019) . For k ≥ 2, the total number of tests is 1 with probability q k , k with probability q k−1 (1 − q), and k +1 with probability 1−q k−1 . Therefore, the expected number of tests per person in a group of size k under Procedure D ′ : Pfeifer and Enis (1978) showed that an optimal value k * D ′ is the smallest k value that satisfies E D ′ (k, p) ≤ E D ′ (k − 1, p) and E D ′ (k, p) < E D ′ (k + 1, p). It was conjectured and empirically verified in Malinovsky and Albert (2019) that the optimal group size k * D ′ (p) is equal to either ⌊p −1/2 ⌋ or ⌈p −1/2 ⌉.

2.2 Sterrett sequential procedure Sterrett (1957) realized that one can improve the efficiency of the GT procedure by a sequential modification of Procedure D ′ . If in the first stage of Procedure D ′ a group is positive, then individuals are tested one-by-one until the first positive individual is identified, or until all but the last person are negative; in the latter case, the positivity of the last individual follows from the positivity of the group. Otherwise, if the first individual identified as positive is not the last in the group, then the first stage of Procedure D ′ is applied to the remaining (nonidentified) individuals. This process is repeated until all individuals are identified. A simple closed-form expression for the expected number of tests per person was provided by Sobel and Groll (1959) :

It was conjectured and empirically verified in Malinovsky and Albert (2019) that for 0 < p < p U the optimal group size k * D ′ (p) is equal to ⌊ 2/p⌋ or ⌊ 2/p⌋ + 1 or ⌊ 2/p⌋ + 2. Using above conjectures one can verify that lim

Some extensions of the Sterrett (S) procedure were presented in Johnson et al. (1991) .

A finite population of size N is not necessarily divisible by k. Therefore, for a finite population of size N and a given Procedure A ∈ {D, D ′ , S}, we have to solve the following optimization problem: find the optimal partition {n 1 , . . . , n I } with n 1 + . . . + n I = N for some I ∈ {1, . . . , N} such that E A (k, p) is minimal. A common method to solve such an optimization problem is dynamic programming (DP) (Sobel and Groll, 1959) . It was conjectured by Lee and Sobel (1972) for that Procedure D, the optimal partition subgroup sizes differ at most by one unit. Gilstein (1985) proved a similar result for Procedure D ′ , and Malinovsky and Albert (2019) for Procedure S.

Procedures D, D ′ , and S fit into a larger class of procedures discussed below.

The hierarchical class procedure was introduced by Sobel and Groll (1959) and defined as follows (see also Hwang et al. (1981) ): A procedure is in the hierarchical class (HC) if two units are only tested together in a group if they have an identical test history, i.e. if each previous group test contains either both of them or none of them.

It follows from this definition that a procedure in the HC is similar to the multistage Dorfman procedure. An optimal hierarchical procedure was obtained by Sobel and Groll (1959) as a dynamic programming algorithm with computational cost O(N 2 ), which they called Procedure R 3 . This was recently computationally improved by Zimmerman (2017) (see also Malinovsky (2019a) for a discussion).

This class of GT procedures was defined by Sobel and Groll (1959) and Sobel (1960 Sobel ( , 1967 A nested procedure requires that between any two successive tests n units not yet classified have to be separated into only (at most) two sets. One set of size m ≥ 0, called the "defective set," is known to contain at least one defective unit if m ≥ 1 (it is not known which ones are defective or exactly how many there are). The other set of size n − m ≥ 0 is called the "binomial set" because we have no knowledge about it other than the original binomial assumption. Either of these two sets can be empty in the course of experimentation; both are empty at termination.

The number of potential nested group testing algorithms is astronomical. For example, if N = 5, then there are 235, 200 possible algorithms (Moon and Sobel, 1977) . Sobel and Groll (1959) overcame this problem by proposing a DP algorithm that finds the optimal nested algorithm, which Sobel and Groll termed "Procedure R 1 ." There was a large research effort to reduce the O(N 3 ) computational complexity of the original proposed algorithm (Sobel, 1960; Kumar and Sobel, 1971; Hwang, 1976a; Yao and Hwang, 1990) . Zaman and Pippenger (2016) provided an asymptotic analysis of the optimal nested procedure; see also Malinovsky and Albert (2019) for discussion. In addition, the connection of group testing with noiseless-coding theory was presented in the group testing literature by Sobel and Groll (1959) and further investigated in Sobel (1960 Sobel ( , 1967 . In particular, for N = 2 the procedures D ′ , S, R 3 , and R 1 coincide and are the optimal GT procedures. The optimality follows from the fact that for N = 2 they are equivalent to the optimal prefix Huffman code (Huffman, 1952) with the expected length L(N). In general, for any N, L(N) can serve as a theoretical lower bound for the expected number of tests of an optimal GT procedure; however, the complexity of calculation of L(N) is O 2 N log 2 (2 N ) . Therefore, even for small N, obtaining the exact value of L(N) is impossible. A well-known noiseless coding theorem provides the information theory bounds for L(N) as H(p) ≤ L(N) ≤ H(p) + 1, where H(p) = N p log 2 1 p + q log 2 1 q is the Shannon entropy. For a comprehensive discussion, see Katona (1973) . Below, we compare Procedures D ′ , S, Optimal Hierarchical (R 3 ), and Optimal Nested (R 1 ) for different p with respect to the expected number of tests for N = 100. For Procedures D ′ and S, the optimal configuration for finite population was found in Gilstein (1985) and in Malinovsky and Albert (2019), respectively. Table 1 shows that for each value of p there is a consistent ranking among optimal Procedures D ′ , S, Optimal Hierarchical (R 3 ), and Optimal Nested (R 1 ) with respect to the expected total number of tests: Procedure R 1 is the best, R 3 is the second-best, S is the thirdbest, and D ′ is the worst. The consistent ranking among Procedures D ′ , R 3 , and R 1 follows from their definitions; R 3 is similar to D ′ but without being limited in maximal number of stages as D ′ , and R 1 does not have a restriction (as R 3 has) that any two units only be tested together in a group if they have an identical test history. Meanwhile, Procedure S also belongs to the hierarchical class, but has the restriction that in a positive group, individuals are tested one-by-one until the first positive individual is identified, or until all individuals but the last one are determined negative. Consequently, Procedure R 3 and therefore R 1 rank higher than Procedure S. To the best of our knowledge, no theoretical results compare optimal Procedures D ′ and S. It is important to note that for some values of p and N, ties among the procedures are possible (case N = 2 was mentioned early in this section).

The generalized group testing problem (GGTP), first introduced by Sobel (1960) , consists of N stochastically independent units u 1 , u 2 , . . . , u N , where unit u i has the probability p i (0 < p i < 1) to be defective and the probability q i = 1 − p i to be non-defective. We assume that the probabilities p 1 , p 2 , . . . , p N are known and that we can decide the order in which the units will be tested. All units have to be classified as either non-defective or defective by group testing. Since its introduction, GGTP has seen considerable theoretical investigation (Lee and Sobel (1972) ; Nebenzahl and Sobel (1973) ; Katona (1973) ; Hwang (1976a) ; Yao and Hwang (1988a,b) ; Kurtz and Sidi (1988) ; Kealy et al. (2014); Malinovsky (2019b Malinovsky ( , 2020 ; Malinovsky et al. (2020) ).

Ideally, under procedure A (A ∈ {D, D ′ , S}) we are interested in finding an optimal partition {m 1 , . . . m I } with m 1 + . . . + m I = N for some I ∈ {1, . . . , N} such that the total expected number of tests is minimal, i.e. {m 1 , . . . m I } = arg min n 1 ,...,n J E A (n 1 , n 2 , . . . , n J ) subject to J i=1 n i = N, J ∈ {1, . . . , N}, where E A (n 1 , n 2 , . . . , n J ) = E A (1 : n 1 ) + · · · + E A (1 : n J ), and E A (1 : n j ) is the total expected number of tests (under procedure A) in a group of size n j . This task is a hard computational problem, and moreover impossible to perform because the total number of possible partitions of a set of size N is the Bell number B(N) = 1 e 2N j=1 j N j ! , which grows exponentially with N. For example, B(13) = 27, 644, 437. In fact, the optimal partition is known only for Procedure D, due to Hwang (1975 Hwang ( , 1981 . Hwang proved that under Procedure D, an optimal partition is an ordered partition (i.e. each pair of subsets has the property that the numbers in one subset are all greater than or equal to every number in the other subset); he also provided a dynamic programming algorithm for finding an optimal partition with computational effort O(N 2 ). However, the ordered partition is not optimal for Procedures D ′ and S (Malinovsky, 2019b) , and finding an optimal partition for these procedures is a hard computational problem. That said, one can evaluate the optimal D ′ and S algorithm under a predetermined order of p's. In Table 2 , we compare Procedures D, D ′ and S for ordered p's, where the method for Procedure D was developed by Hwang (1975) and those for Procedures D ′ and S by (Malinovsky, 2019b) .

For the fixed predetermined order of p 1 , p 2 , . . . , p N an optimal nested and hierarchical procedures with respect to the expected total number of tests were developed as DP algorithms in Kurtz and Sidi (1988) and in Malinovsky et al. (2020) , respectively.

We generated the vector p 1 , p 2 , . . . , p 100 from a Beta distribution with parameters α = 1, β = (1 − p)/p such that the expectation equals p. We repeat this process M = 1000 times for each value of p. Each time an optimal ordered partition with the corresponding expected number of tests was found for Procedures D, D ′ , and S, the hierarchical (HL) and nested (ON) procedures were obtained using DP algorithm (based on the previously mentioned references). Also, in the GGTP, Shannon entropy N i=1 p i log 2 1 p i + (1 − p i ) log 2 1 1−p i can serve as the information lower bound for the expected number of tests of an optimal group testing procedure. The averages of 1000 repetitions are presented in Table 2 below. Table 2 shows the same ranking pattern among procedures as was observed in Table 1 for the homogeneous p case. This pattern can be explained along the lines of the previous discussion concerning homogeneous p.

Unknown p In many practical situations, the exact value p of the probability of disease prevalence is unknown or else only some limited information is available, for example a range. Since all of the above-presented procedures require knowledge of p, there is a need to evaluate p during the process of testing. For a nested algorithm R1, Sobel and Groll (1966) proposed a Bayesian approach that uses upcoming information during testing to evaluate and revaluate p. A minimax approach for Procedure D was introduced by Malinovsky and Albert (2015) and for Procedures D ′ and S in Malinovsky and Albert (2019).

In many settings, particularly in biology and medicine, tests may be subject to measurement error or misclassification. This issue occurs in individual testing but may be enhanced in group testing. In particular, for many applications, the sensitivity of a grouped test may decrease with group size (this is often referred to as dilution). Graff and Roeloffs (1972) and Hwang (1976b) recognized early that when tests are misclassified, the objective function should not be the expected number of tests. Graff and Roeloffs (1972) and Burns and Mauro (1987) proposed a modification of the Dorfman procedure and searched for a design that minimized total cost as a linear function of the expected number of tests, weighted the expected number of good items misclassified as defective, and weighted the expected number of defective items misclassified as good. Hwang (1976b) studied a group testing model with the presence of a dilution effect, where a group containing a few defective items may be misidentified as one containing no such items, especially when the size of the group is large. He calculated the expected cost under the Dorfman procedure in the presence of the dilution effect and derived the optimal group sizes to minimize this cost. Malinovsky et al. (2016) characterized the optimal design in the Dorfman procedure in the presence of misclassification by maximizing the ratio between the expected number of correct classifications and the expected number of tests. Haber et al. (2021) proposed to minimize the expected number of tests while controlling overall misclassification rates. In general, since it is expected that misclassification may be related to group size, one has to be very cautious about proposing Dorfman designs with large group sizes. Alternative designs where groups are re-tested in different ways have been explored (Litvak et al., 1994 (Litvak et al., , 2020 .

Consider a very large (infinite) population of items, where each item, independent from the others, is either defective with probability p or non-defective with probability 1 − p. The goal is to identify a certain number of non-defective items as quickly as possible. To the best of our knowledge, the incomplete identification problem was introduced by Bar-Lev et al. (1990) . For recent developments and references, see Malinovsky (2018) .

Incomplete identification models for grouptestable items

Group testing with test error as a function of concentration

The detection of defective members of large populations

An introduction to probability theory and its application

Optimal partitions of finite populations for Dorfman-type group testing

Group testing in the presence of test error; an extension of the Dorfman Procedure

Is group testing ready for prime-time in disease identification

A Method for the Construction of Minimum-Redundancy Codes

A generalized binomial group testing problem

An optimal nested procedure in binomial group testing

Group testing with a dilution effect

An Optimal Hierarchical Procedure for a Modified Binomial Group-Testing Problem

Inspection errors for attributes in quality control

Combinatorial search problems

The capacity of non-identical adaptive group testing

Multiple access algorithms via group testing for heterogeneous population of users

Finding a single defective in binomial group-testing

Dorfman and R 1 -type procedures for a generalized group testing problem

The Right Kind of Pooled Testing for the Novel Coronavirus: First, Do No Harm

Screening for the presence of a disease by pooling sera samples

On optimal policy in the group testing with incomplete identification

End Notes. M ath. Mag. 92

Sterrett procedure for the generalized group testing problem. M ethodology and Computing in Applied Probability

Conjectures on Optimal Nested Generalized Group Testing Algorithm. Applied Stochastic Models in Business and Industry

A note on the minimax solution for the two-stage group testing problem

Revisiting nested group testing procedures: new results, comparisons, and robustness

Reader reaction: A note on the evaluation of group testing algorithms in the presence of misclassification

An optimal design for hierarchical generalized group testing

Enumerating a class of nested group testing procedures

Finite and infinite models for generalized group-testing with unequal probabilities of success for each item

Dorfman-type group testing for a modified binomial model

The exact solution to the two-stage group-testing problem

Group testing to classify efficiently all defectives in a binomial sample. Information and Decision Processes

Optimal group testing

Group testing to eliminate efficiently all defectives in a binomial sample

Binomial group-testing with an unknown proportion of defectives

On the detection of defective members of large populations

Cutoff points in group testing

A fundamental monotonicity in group testing

Individual testing of independent items in optimum group testing

On optimal nested group testing algorithms

Asymptotic analysis of optimal nested group-testing procedures

Detecting deficiencies: an optimal group testing algorithm