key: cord-144448-mqs502xm
authors: Theagarajan, Lakshmi N.
title: Group Testing for COVID-19: How to Stop Worrying and Test More
date: 2020-04-14
journal: nan
DOI: nan
sha: 
doc_id: 144448
cord_uid: mqs502xm

The corona virus disease 2019 (COVID-19) caused by the novel corona virus has an exponential rate of infection. COVID-19 is particularly notorious as the onset of symptoms in infected patients are usually delayed and there exists a large number of asymptomatic carriers. In order to prevent overwhelming of medical facilities and large fatality rate, early stage testing and diagnosis are key requirements. In this article, we discuss the methodologies from the group testing literature and its relevance to COVID-19 diagnosis. Specifically, we investigate the efficiency of group testing using polymerase chain reaction (PCR) for COVID-19. Group testing is a method in which multiple samples are pooled together in groups and fewer tests are performed on these groups to discern all the infected samples. We study the effect of dilution due to pooling in group testing and show that group tests can perform well even in the presence of dilution effects. We present multiple group testing algorithms that could reduce the number of tests performed for COVID-19 diagnosis. We analyze the efficiency of these tests and provide insights on their practical relevance. With the use of algorithms described here, test plans can be developed that can enable testing centers to increase the number of diagnosis performed without increasing the number of PCR tests. The codes for generating test plans are available online at [1].

The novel corona virus has caused a pandemic in early 2020. The reproduction rate of the corona virus is estimated to be between 1.4 and 3.9 [2] . From an infected patient, COVID spreads to 1.4 to 3.9 others on an average. As the virus spreads through respiratory droplets, it spreads at an exponential rate, especially, in densely populated locations. The infected patients become carriers at an early stage even before the onset of symptoms [3] . Thus, it becomes imperative to test a large number of people and identify early stage infections to contain the spread of COVID-19. The PCR based testing is considered to be one of the best techniques for early stage diagnosis [4] . Group testing is well known methodology in the combinatorics and compressed sensing literature [5] . In group testing(GT), multiple samples are pooled together to form groups (fewer in number than the total number of samples). Tests are performed on these groups and from the outcome of these tests, the infected samples are inferred. Group testing relies on the fact that not all of the tested samples may be infected. Thus, by employing group testing for COVID-19 diagnosis, the number of samples tested can be increased, the tests performed can be made economical and the speed of detection can be improved. This can lead to efficient containment of the spread of COVID-19.

In this article, we investigate group testing for COVID diagnosis using PCRs from a practical view. We analytically study the effect of dilution, caused by pooling in group tests, on the accuracy of the tests. We present multiple group testing algorithms with which a test plan for diagnosis of a pool of samples can be generated. We analyze and present the scenarios in which the presented algorithms are appropriate. We show through simulations that group testing can provide considerable gains in the number of tests performed. Finally, we provide some guidelines for performing COVID diagnosis in practice. A practitioner can directly generate the test plans using the MATLAB codes that are available online at [1] .

Certain key findings and discussions of this article can be listed as follows:

• When the infection rate is less than 33%, group test can help to reduce the number of tests significantly. Some examples on the gains obtained through group testing (with a sensitivity of more than 99%) are given below.

Number of samples to test 16 16 16 32 32 32 Infection rate 5% 10% 20% 5% 10% 20% Number of tests required by group testing 6 8 12 11 16 26

• Replication or repeating a test is required to improve the accuracy of any qRT-PCR test outcome. We find that replicating the test twice could be sufficient to get high accuracy. Further, group testing could provide inherent replication and reduce the number of replicates required. For details refer to Sections 3.1 and 5.3.

• We also analyzed the effect of dilution caused by dividing a swab sample's content to smaller portions and mixing multiple samples together. This helped us to determine an upper limit on the number of samples that can be pooled together. It was found that up to 57 samples can be pooled together without reducing the test accuracy. For details refer to Section 5.2.

• It was found that different group testing strategies are optimal under different conditions. The diagnostician should adapt the test plan depending on the infection rate of a given cluster or local community. Test plans that are optimal for a given infection rate are discussed in Section 6.

Further guidelines to follow while testing for COVID-19 through group testing are discussed in Section 7. Notations: . denotes the floor operation, i.e., largest integer lesser than or equal to the given number. . denotes the ceil operation, i.e., smallest integer greater than or equal to the given number. . denotes rounding off operation. |A| denotes the cardinality of the set A, i.e., the total number of elements in the set A.

The real time or quantitative reverse transcription polymerase chain reaction (qRT-PCR) is the testing method used for early detection of COVID. In RT-PCR, the viral RNA molecules present in the test medium are first reverse transcribed into DNA, which is referred to as the complementary DNA (cDNA). The cDNA corresponding to the COVID genes are amplified using PCR, which is performed through thermal cycling. In the high temperature cycle, the target cDNA's double-helix strands are separated (referred to as the denaturation process). In the low temperature cycle, the primers bind onto the cDNA strands (referred to as annealing) to initiate the polymerization through which the free nucleotides assemble on the DNA strands to form a new double-helix DNA (referred to as the elongation process). Thus, in every cycle the number of DNA particles present are doubled; refer Fig. 1 . After several cycles, the total number of target DNA present is increased to an exponentially large volume. Finally, fluorescent dyes that fluoresce in the presence of the target DNA are used to visually identify the presence or absence of the virus. The number of thermal cycles determine the final volume of target DNA present; hence, by increasing the thermal cycles (also referred to as amplification cycle), the sensitivity or the detection accuracy of qRT-PCR can be improved. A detailed procedure and reagents used in qRT-PCR for detection of COVID can be found in [6] . 

Using qRT-PCR, it was found that 10 virus particles are sufficient to successfully detect the presence of the virus in a test sample with very high accuracy (>99.9%) [7] . However, this accuracy or goodness of test may vary with the qRT-PCR kit used. We define the following metrics to compare and analyze the goodness of tests.

Let X be a binary random variable that denotes the presence (X = 1) or absence (X = 0) of the virus in a test sample. Let X t be a binary random variable that denotes the outcome of a test.

Definition -False negative rate : γ = Pr(X t = 0|X = 1) The rate or probability of the outcome of a test being negative when the virus is actually present in the sample. Alternatively, the sensitivity is defined as the probability of the test outcome being positive when the virus is actually present in the sample; this is given by 1−γ = Pr(X t = 1|X = 1). It is desirable for the test to have a low value for γ.

Definition -False positive rate : β = Pr(X t = 1|X = 0) The rate or probability of the outcome of a test being positive when the virus is actually absent in the sample. It is desirable for the test to have a low value for β.

Definition -Prior : α = Pr(X = 1) The rate of occurrence of the virus in a given population.

The prior can be approximately computed, using previous history, as the ratio of the number of positive cases to the total number of tests performed. It is important to minimize sample bias in this heuristic computation. According to [8] , the prior (at the time of writing this article) for United States of America is about 0.1973 and for India is about 0.0379. Among the countries where more than 10 5 tests were performed, the lowest prior is for UAE (about 0.005) and the highest is for Spain (about 0.4423). In real life, we are more concerned about the false negative rate (as opposed to false positive rate). It is very important to have the least γ to minimize the spread of COVID and fatality caused by it. Further, the value of these rates depend on the viral load or the number of viral RNA copies present in the test (denoted by l). We denote the false negative and positive rates as a function of l using γ(l) and β(l), respectively. It was found in [7] that γ(5) ≈ 0.07. Therefore, the goodness or efficiency of a testing technique is given by γ(l) and β(l). It is necessary to know these parameters for a given testing technique before we can analyze the efficiency of group testing using the same.

Often tests are performed multiple times on a single sample to confirm the outcome and account for any variability in the testing procedure, thereby improving the accuracy of the test. Here, we shall analyze the improvement in accuracy when tests are replicated for a given sample. When the test is replicated r times, the outcome is declared based on the majority rule. To declare the final decision as negative, the APR should have a value greater than 1. The expressions to compute the sensitivity and APR due to replication are derived in Appendix A. Figure 2 illustrates the improvement in accuracy for double and triple replication. From Fig. 2a , one can choose the value of r, i.e., how many tests to perform depending on the γ of the testing technique and the γ r that is desired. For example, when γ is small (< 0.05), double and triple replicates give very similar performance; hence, double replication will suffice. Figure 2b shows us that replication is necessary when the prior is more than 8% at a false negative and positive rates of 5% and 10%, respectively, for the testing technique.

Group testing(GT) refers to the idea of pooling multiple samples together and performing tests on certain subsets of these samples to discern the infected samples. First, we illustrate GT through a simple example given in Fig. 3 Here, we have utilized only 6 tests to diagnose 8 samples -a 25% reduction in the number of tests. Note that, if we had known that there was exactly one infected sample, then only 4 tests would be required (tests 4 and 6 would not be required). This GT algorithm is referred to as the binary search and is the optimal GT when there is exactly one infected sample in a given pool. In general, to test a pool of N samples with up to one infected sample, we require at most 2 log 2 N tests and at least log 2 N tests. This gain increases exponentially with N ; for example, N = 64 requires at most 14 tests with GT as opposed to 64 tests without GT.

The theory of group testing deals with the design and analysis of algorithms that tell us how to choose subsets (groups) of samples to pool together and test, and identify the infected samples. In other words, GT aims to minimize the number of tests (T ) required to identify at most D number of infected samples among N number of given samples (pool size); 0 ≤ D < N and T ≤ N . In practice, the value of D is not known. Nevertheless, depending on the value of the prior, an upper bound on the number of infected samples can be chosen 1 as D for a given N .

There are two paradigms of GT, namely,

• Combinatorial group testing (CGT): The combinatorial algorithms require the exact number (or an upper bound) of the infected samples D. As long as the chosen value of D is greater than or equal to the true number of infected samples, CGT algorithms always identifies the infected samples correctly.

• Probabilistic group testing (PGT): The probabilistic algorithms require an upper bound on α and identify all infected samples with certain probability P D . Usually, the detection probability P D is very close to 1; however, there still may be a non-zero probability of the infected cases going unidentified. For small values of α, the average number of tests performed in PGT is less than N ; however, there can exist cases for which the number of tests performed by PGT might be greater than N , albeit, the probability of occurrence of such cases will be relatively small. In PGT, there exists a trade-off between P D and the average value of T .

From the above discussion, it can be seen that, due to their deterministic nature, CGT algorithms are preferred in practice to test for diseases. When an upper bound on D cannot be reliably established, PGT algorithms may prove to be more efficient as opposed to CGT. There exists PGT algorithms in which P D = 1. In the worst case (i.e., when D = N − 1), the value of T can be greater than N , but the average value of T can still be much less than N . This is further discussed in Section 6. CGT and PGT algorithms are further classified into

• Adaptive tests: Here, the tests are performed sequentially. First a group is chosen randomly based on D N or α and tested, the outcome of this test determines the next group to test and so on. Thus, the size and samples of a group are chosen adaptively based on previous group and its test outcome. The GT described in Figure 3 is an example of adaptive CGT.

• Non-adaptive tests: When the test plan is fixed for a given D and N , then it is known as the non-adaptive GT. Here, a fixed number of tests are always performed irrespective of the number of infected samples present in the pool. An advantage of non-adaptive GT is that, if T N is the number of groups to be tested, then all the T N tests can be simultaneously run in parallel.

The non-adaptive GT can be represented by a matrix of dimension T N ×N (referred to as the measurement matrix), and the samples can be represented by a N × 1 vector with 0's for uninfected samples and 1's for infected samples.

The measurement matrix consists of 0's and 1's, the 1's in a row correspond to the samples included in a test and the 1's in a column correspond to the number of times a particular sample is tested. The Boolean OR operation between the measurement matrix and the samples vector gives the GT outcomes. Now, designing a non-adaptive GT algorithm becomes a problem of designing efficient measurement matrices 2 and decoding algorithms to discern the positions of 1's in the sparse samples vector with the given T N test measurements. The non-adaptive test algorithms can be studied with the aid of compressed sensing literature [9] , [5] .

A key drawback of the non-adaptive GT algorithms is that it requires a large value of N (in the order of 10 3 −10 6 ) to be efficient and provide reliable performance. Also, when the number of testing kits are limited and the prior is small, sequential tests are more efficient and quicker in identifying and discarding uninfected samples than parallel tests. Therefore, in this article, we shall limit our study only to adaptive tests.

In this section, we shall discuss some limitations and surprising advantages of performing group tests for COVID diagnosis in practice using qRT-PCR.

A key question in the application of GT for COVID testing is how small should D be, relative to N ? That is, what is the range of values of D N for which GT reduces the total number of tests required? This was first answered in [10] ; GT is said to reduce the number of tests required when D N < 3− √ 5

2 . As a general rule of thumb, considerable reduction in the total number of tests can be achieved when D N or the prior α is less than 33% ; in all other cases, it is best to perform individual tests. As discussed in Section 3, typical value of α for COVID is less than 0.33. Thus, GT can be utilized to efficiently increase the number of samples tested for COVID with minimal number of tests performed.

In group testing, dilution occurs in two stages, namely, preparation and pooling. We shall discuss the effects of these dilutions on the allowable sample size and goodness of test. The following discussion will also help us to determine a practically appropriate value of the pool size N that provides reliable test outcomes.

When preparing the samples for GT, each sample needs to be divided into multiple portions for usage in multiple groups. In the example in Fig. 3 , each sample is required to be divided into 3 portions since any sample could be involved in a maximum of 3 groups. In general, a sample may be involved in at most log 2 N groups 3 while testing N samples [11] . If each test is replicated r times, then a sample may have to be divided into r log 2 N portions. However, we shall see in the next subsection that each test need not be replicated in GT as it could provide inherent replication. In practice, to test for COVID, swabs from nasopharynx or throat [3] are taken. The swab samples can be dissolved in liquid buffer media [12] and this can be divided into multiple portions. When the samples are divided into multiple portions, the viral load gets distributed across the divided portions. Therefore, each sample can be divided only into certain number of portions such that the viral load in each portion can still be detected reliably by qRT-PCR. The swab from nasopharynx contains 10 6 − 10 9 corona viral particles [3] , if L is the amount of viral load required for reliable detection, then each sample can be divided into at most V l L portions, where V l is the amount of viral particles in the swab. As discussed in Section 3, the false negative rate γ(l) depends on the viral load in the test sample. If γ * is the required false negative rate for each test, then the corresponding viral load is given by the inverse function γ −1 (γ * ), and each sample can be divided into at most V l γ −1 (γ * ) . In Figure 4 , using the γ(l) from [7] for qRT-PCR, we plot the number of portions into which a swab containing V l viral load can be divided to achieve a given false negative rate with three replicates. It can be seen that, when the swab has a viral load of 10 4 , we can still obtain about 220 portions for γ = 0.02. When portions of N swabs, of which D are infected, are mixed or pooled together for a qRT-PCR test, the viral load in D samples are diluted by the N − D virusfree samples. This dilution is referred to as the pooling dilution. The effect of this dilution on the performance of virus detection has been studied in [13] . Based on the probit model described in [14] , the effect of pooling dilution was derived in [13] . Though the model derived in [13] is for testing HIV, the authors mention that the model is applicable to other viral tests. Simplifying the model in [13] for COVID, the sensitivity after pooling dilution is given by

where Φ(.) is the CDF of the normal distribution, χ is the number of RNA copies per viral particle (χ = 1 in the case of corona virus), D is the number of infected samples, V p is the viral load in each infected sample, V 50 and V 95 are the viral loads corresponding to a sensitivity of 0.5 and 0.95, respectively, for the considered testing 3 The maximum number of groups in which a sample will be involved may vary depending on the GT algorithms. It can be proved that log 2 N are sufficient to identify an infected sample among an arbitrary number of infected samples [11] . Further, once identified, as infected or uninfected, that sample is not used in any subsequent groups for testing.

technique, i.e., V x = γ −1 (1 − x). In Fig. 5 , we show the reduction in sensitivity of the pooled test when different number of samples (N ) are pooled together with different number of infected samples (D). It can be seen that the pool size of 32 still provides a very high sensitivity after pooling dilution even when a single sample is infected. A similar conclusion on the pool size was reported through experimentation in [15] . In general, if T is the number of tests for COVID that are to be performed on a swab with viral load V l with r replicates to achieve a sensitivity of 1 − γ * , then the maximum pool size can be derived as

To achieve a sensitivity of 95%, the above equation can be simplified 4 to N = V l rT V95 . For T = log 2 N , we get

where W 0 (.) is the Lambert W function. This formulation includes the effect of dilution in both the sample preparation and pooling stages. Example: For a swab containing a viral load of 10 6 viral particles, and a testing technique that requires 10 3 viral particles to achieve 95% sensitivity, the maximum pool size for 95% sensitivity of the group test should be N = 57.

As seen from the above example, an efficient group test which performs the least number of total tests (T ) per sample can enable us to increase the pool size, which, in turn, increases the testing speed and cost. Therefore, employing an efficient GT algorithm for COVID testing is the key component in pooling tests. We shall discuss some practically relevant algorithms in Section 6.

In group testing, samples that belong to a group, which has a test outcome of positive, are often tested again in smaller groups. This ensures that multiple tests are performed for some samples. This can be observed in the example described in Fig. 3 ; sample 5 is involved in 3 tests that are positive. In GT, almost every sample that tests positive is replicated at least twice. This inherent replication reduces drastically the false positive rates of GT without an explicit replication of the individual tests. The exact amount of reduction of the false positive rate depends on the GT algorithm. Note that the group tests whose outcome are negative may not be replicated in GT and may require explicit replication to reduce false negative rates. Therefore, the false negative rates or the sensitivity would suffice to be an appropriate metric to evaluate group tests and one can focus on reducing γ to improve GT. In this section, we discuss two combinatorial group tests and a probabilistic group test which could be practically useful in testing for COVID-19. The generalized binary splitting (GBS) and multi-stage testing (MST) are the CGT algorithms, and nested testing (NT) is the PGT algorithm that we describe in the following subsections.

The generalized binary splitting (GBS) is the most commonly used adaptive algorithm in the CGT literature [11] . GBS is the generalization of the binary splitting procedure (BSP). First, we describe BSP, and generalize it to obtain GBS. The binary splitting procedure assumes that there exists at least one infected sample in a given pool and the goal of BSP is to identify exactly one infected sample in the least number of tests. That is, BSP works on a pool of size N with D ≥ 1 and identifies an infected sample in exactly log 2 N tests.

The steps in BSP are given below:

• Step 1 Split the N samples into two halves, say groups G 1 and G 2 .

• Step 2 Test G 1 .

•

Step 3 When the test is positive: (i) continue performing all future tests with the samples from only G 1 , (ii) set N = |G 1 |, and (iii) if the number of samples in G 1 is 1, then one infected sample has been identified and the algorithm is terminated.

•

Step 4 When the test is negative: (i) continue performing all future tests with the samples from only G 2 , (ii) set N = |G 2 |, and (iii) if the number of samples in G 2 is 1, then one infected sample has been identified and the algorithm is terminated. Note that, since D ≥ 1, if G 1 tests negative, then G 2 must test positive.

•

Step 5 With the updated N and samples pool, goto Step 1.

The GBS algorithm simply attempts to perform the BSP D times to identify at most D infected samples in a given pool of size N . The pseudocode and the steps in GBS algorithm are given in Algorithm 2. Analysis: Since the outcome of a test performed at each step determines the next group for testing, the total number of tests varies for different inputs for a given N and D. The number of tests performed in the worst case, i.e., the maximum value that T can attain in GBS is given by T ≤ log 2 N D + D. Figure 6 shows the maximum number of tests required by GBS with each test replicated twice for different values of N and D. It can be seen that GBS testing requires lesser number of tests, even in the worst case, than the conventional 5 testing, which is 2N . In GBS, each sample may be involved in up to log 2 N D + 1 tests; the sample preparation stage should provide at least log 2 N D + 1 portions for each sample.

Practical considerations: When employing GBS in practice, the following points need to be kept in mind.

• The tests in GBS are sequential. That is, all the group tests have to be performed adaptively and in-order to obtain the final result.

• The gains of GBS are higher for large values of N. From simulations, we observed that GBS is suited for D N ≤ 20% and the average number of tests required can be reduced by up to 50% of that of conventional tests. • A key thing to note in GBS is that the algorithm can fail if the value of D is underestimated. If the pool contains lesser number of infected samples than D, then GBS identifies all infected samples perfectly.

• Inherent replication: In GBS, when a group is tested positive, often it gets tested again in the subsequent groups, thereby providing inherent replication. However, due to BSP, there exists scenarios where no samples are involved in multiple tests. Therefore, in practice, one has to explicitly watch out for such samples and replicate the test, if required.

• At the end of GBS tests, the N − D samples, that are marked as uninfected, can be pooled into one group and tested. This ensures that D is not underestimated.

Some of the disadvantages of GBS are overcome by the multi-stage testing algorithm, which is described in the next subsection.

A disadvantage in GBS tests is that the number of group tests that a particular sample undergoes is difficult to compute. To overcome this issue, we can employ C. H. Li's multi-stage testing (MST) [11] . In MST, each sample undergoes exactly s number of tests. The value of s can be chosen as dictated by any practical restrictions; however, there is an optimal number of stages that would minimize the total number of tests performed, this is given by [11] s = arg min

The pseudocode of the MST algorithm is given in Algorithm 1. It can be explained as follows. Maintain three bins: infected samples' bin (ISB), uninfected samples' bin (USB) and queued samples' bin (QSB). Initially, all N are in QSB. In the ith stage (i = 1, 2, · · · , s), form g i groups with all samples in QSB and test them. The samples that belong to groups which tested negative are moved to USB. If the group size is more than one, then the samples that belong to groups that tested positive are retained in QSB, else they are moved to ISB. Proceed to i + 1th stage and repeat the above steps till the group sizes become 1. Analysis: The group size g i is computed as g i = N i−1 /k i , where k i = δ 1− i s is the average number of samples in a group. In practice, some groups of g i may contain k i samples and the rest may contain k i + 1 samples. Note that, in the last stage, all groups contain exactly one sample, i.e., k s = 1. The total number of tests performed in MST is T = s i=1 g i . The value of T is not fixed for a given N and D as it depends on the group test outcomes. However, we can determine the value of T in the worst case, i.e., the maximum value that T can take for a given N and D. This is given by T ≤ eD ln δ [11] .

Practical considerations: When used in practice, MST has the following advantages.

• All groups in each stage can be tested in parallel. Thus, despite being an adaptive GT, testing can be sped up using MST.

• When more than D groups test positive at any stage, then it can be inferred that the estimate of D is incorrect. In this case, the remaining samples in QSB can be tested with an MST test plan for the updated value of D and N (the number of samples in QSB will be the new value of N ). This ensures that the tests performed in all previous stages are still useful even when the value of D is incorrect.

• When the true number of infected samples is less than or equal to D, MST identifies all infected samples.

• Since the number of stages s is known, the number of times any sample will be tested is at most s. Thus, in the sample preparation, it is sufficient to divide each swab content to only sr portions, where r is the number of replicates.

• Inherent replication: Since all the samples in QSB at the end of stage-i are tested again in stage-i + 1, the tests for infected samples are automatically replicated. Therefore, only those groups that test negative at each stage needs replication. The groups that test positive are replicated inherently at least s times, thereby increasing the test accuracy and reducing false positives. This inherent replication further reduces the total number of tests required. We derived that the total number of tests reduced due to inherent replication in MST is at least k 1

• The maximum number of tests required by MST is always bounded above by N , i.e., T ≤ N .

In Fig. 7a , we plot the optimal number of stages required in MST for different values of D and N . Note that, when the number of stages increase, the inherent replication for infected samples increase and improves the accuracy without any additional tests for replication. In Fig. 7b , we plot the number of tests required by MST in the worst case scenario for different values of D and N with double replication. For N = 16, the maximum number of tests required for MST saturates at 30 as opposed to 32 in the conventional testing with double replicates. It can be seen that MST requires uniformly lesser number of tests than the conventional testing. Though these numbers represent the worst case scenario, simulations show that the average number of tests required to detect infected samples could be 40% less than the conventional testing. GBS vs MST: Figure 8 shows the total number of tests required in the worst case for GBS and MST at different values of N and D = 1, · · · , 6. The number of tests required by conventional testing is also indicated for baseline comparison. It can be seen that for small values of δ, GBS outperforms MST. As D increases for a fixed N , GBS may sometimes require more number of tests than the conventional testing; however, MST always requires less than or equal number of tests as conventional testing. In general, for large values of N (>24), GBS may be appropriate, and MST may be appropriate in the other regime. Figure 8 can be used as a guideline to choose the best test for a given value of N and D.

The value of D played an important role in the adaptive CGT algorithms discussed in the previous subsections. When the value of D is an underestimate, then the test plan provided by CGT may fail or may turn out to be suboptimal. This shortcoming can be addressed by the usage of PGT algorithms. Here, we shall describe a probabilistic group testing algorithm known as the nested testing (NT) [16] . The NT algorithm takes N and α as the input and provides an adaptive test plan that minimizes the average number of total tests required to diagnose N samples. The actual number of infected samples in a pool or its estimate is not required for NT, only the prior probability of the presence of an infected sample is needed. Nested testing is an optimal PGT which identifies all infected samples without failure [16] . The algorithm proceeds by testing a group of samples from UB and moving them to PIB, if they test positive, then transferring the diagnosed samples to DB and returning the rest to UB, and repeating the whole process till all samples are diagnosed.

Analysis: The NT algorithm models the presence of the virus in each sample as an independent Bernoulli random variable with probability α. Thus, NT can identify any number of infected samples in a pool, i.e., 0 ≤ D ≤ N . The probability Pr(D = i), i = 0, · · · , N , is given by the Binomial distribution; nested testing exploits this fact to identify the group that has the highest probability to contain infected samples and tests it. The NT algorithm gives the least number of tests for which Pr(D = i) is the maximum. When the number of infected samples is i, such that Pr(D = i) is the least, the total number of tests required by NT may be more than N . However, when the NT test plan is applied to multiple pools of size N , the average number of tests required per pool will be lesser than N . The average number of tests required by NT is given by G(0, N ) , where G() is as given in (9) 6 . Figure 9 plots the average number of tests required by NT for different values of N and α. It is clear that, for smaller values of the prior probability, NT can provide large gains in terms of the average number of tests performed without any knowledge of D.

Practical considerations: When used in practice, NT has the following advantages.

• Nested testing can identify all infected samples irrespective of the actual value of D. The probability of detection of infected samples by NT is 1.

• A key condition for NT to perform well in practice is the availability of independent samples. That is, the probability of each sample being infected should be independent of the other samples in the pool and equal α. This can be achieved in practice through randomization of samples. That is, constituting a pool with samples from reasonably separated geographical locations.

• The value of the prior α can be estimated based on the past history as discussed in Section 3. Further, this value should be continuously updated based on the outcomes of the tests performed each time.

• Inherent replication: In NT, an infected sample has very high probability of getting tested in at least two groups. Hence, NT can ensure at least double replication for samples that test negative.

• To create an NT test plan, the algorithm needs to evaluate the expression in (9) 

Group Testing is a promising method that can be effectively utilized to ramp up the number of tests performed in COVID-19 diagnosis. GT can help us quickly identify several early stage infections. By reducing the number of tests performed, GT can make the testing less expensive and economical. The following are some guidelines that can be followed in employing GT for COVID-19 diagnosis.

• Before employing pooling, the sensitivity of the testing technique needs to be characterized. Some qRT-PCR kits are known to provide accurate results with smaller viral load, whereas some are known to require a large amount of viral load to detect the virus. This sensitivity characteristics, i.e., γ(l), needs to be understood before the formulation of GT test plans. The knowledge of γ(l) would help us to decide the optimal pool size and number of replications required.

• When the testing speed is a primary factor, it may be better to employ MST as it enables to perform parallel group tests.

• In CGT algorithms such as GBS or MST, randomization is not required. The statistical relation between the samples are irrelevant in CGT group tests. However, the estimate of the upper bound on the number of infected samples in a pool is an important parameter that needs to be accurate. The value of D can be estimated by observing the history of a location or cluster. When more samples from a location test positive, then the estimate of D should be correspondingly increased and vice versa.

• PGT algorithms such as NT has the advantage of not requiring the exact value or an upper bound on the number of infected samples in a pool. PGT algorithms require the prior probability values (α). Once again, this can be obtained from the history of the tests performed for a location or cluster. The ratio of the number of samples tested positive to the total number of tests performed can give us this estimate (refer Section 3). This estimate needs to be update continuously over time.

• Randomization could significantly reduce the number of tests performed in PGT. When samples from a small community or cluster are pooled together, most of them are likely to have a similar test outcome. However, PGT works best when the samples are independent. Hence, the testing center should randomize by picking samples from different communities or clusters to form a pool. This randomization can bring down the average fraction of samples that are positive in a pool and the dependence among the samples.

• In practice, PGT tests are suited well for locations where the infection rate is higher and the CGT tests are suited well for locations where the infection rate is small. This is because, PGT does not need the value of D and an underestimated value could cause failure of CGT tests.

• When choosing a GT algorithm to perform diagnosis, the practitioner can intelligently switch between GBS, MST and NT, depending on the estimate of D and α, and their accuracy. The performance plots provided in Section 6 can be utilized in choosing the best test for a given scenario.

• From the study so far, it is clear that considerable reduction in the total number of tests can be achieved when D N or the prior α is less than 33%.

• Note that when D > N 2 or α > 1 2 , group testing can still be helpful. Under this condition, the goal of GT would be reversed, i.e., to identify the uninfected samples rather than the infected samples. All our discussions so far remain applicable with this switch.

• The MATLAB codes for simulating the results and generating group test plans are available online at [1] .

where X t is the random variable denoting the outcome of the tth replicate. When the test is replicated twice, Pr(X 1 = 0, X 2 = 0|X = 1) = γ 2 . For triple replication, Pr(X 1 + X 2 + X 3 ≤ 1|X = 1) = 3γ 2 (1 − γ) + γ 3 and Pr(X 1 = 0, X 2 = 0, X 3 = 0|X = 1) = γ 3 . For a given set of test replicate outcomes X 1 , · · · , X r , the probability of the sample actually containing the virus is given by Pr(X = 1|X 1 , · · · , X r ) -this is referred to as the a posteriori probability. The a posteriori probability ratio (APR) Pr(X=0|X1,··· ,Xr)

Pr(X=1|X1,··· ,Xr) is easier to compute than the individual a posteriori probabilities. The APR for a given set of outcomes is AP R = Pr(X = 0|X 1 , · · · , X r ) Pr(X = 1|X 1 , · · · , X r ) = Pr(X 1 , · · · , X r |X = 0) Pr(X 1 , · · · , X r |X = 1)

where m is the number of negative outcomes (i.e., X t = 0) in r replicates. Divide P i−1 into g i disjoint groups -G 1 , G 2 , · · · , G gi such that G 1 ∪ G 2 ∪ · · · ∪ G gi = P i−1

Test groups G 1 , G 2 , · · · , G gi 8:

Discard groups that tested negative Test group G ⊆ P

if test outcome is positive then 6: Identify an infected sample in G with BSP (Since the group tested positive, it must contain at least one infected sample)

Update N = N − 1 − g (where g is the number of uninfected items diagnosed from BSP, remove these from the pool) Test the N samples individually 15: end if

Available online: Matlab codes for test plan generation

Pattern of early human-to-human transmission of wuhan 2019 novel coronavirus (2019-ncov)

Sars-cov-2 (covid-19) by the numbers

Coronavirus and the race to distribute reliable diagnostics

Group testing and sparse signal recovery

Diagnostic detection of 2019-ncov by real-time rt-pcr

Detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr

Coronavirus disease (covid-19)-statistics and research

Boolean compressed sensing and noisy group testing

On the cut-off point for combinatorial group testing

Combinatorial group testing and its applications

Transport of viral specimens

A methodology for deriving the sensitivity of pooled testing, based on viral load progression and pooling dilution

Refinement of a viral transmission risk model for blood donations in seroconversion window phase screened by nucleic acid testing in different pool sizes and repeat test algorithms

Evaluation of covid-19 rt-qpcr test in multi-sample pools

Born again group testing: Multiaccess communications

Group testing to eliminate efficiently all defectives in a binomial sample

The author thanks Namrata M. Nilavar, Indian Institute of Science, Bangalore, for useful discussions.

Hence, the false negative and positive rates for r tests are given by (each replicate is assumed to be independent of the other) Compute h =G(0, |U|) using (8)

Test a group G ⊆ U of size h 6:if Test is negative then 8:else 10:if h == 1 then

Update D = D + G and make P empty while P is not empty do

Compute g =G(|P|, |P ∪ U|) using (8)

Test a group G ⊆ P of size g 17: if Test is positive then G(1, m) = G(0, m − 1), G(0, 1) = 1, G(0, 0) = 0.