Two-Stage Adaptive Pooling with RT-qPCR for COVID-19 Screening
Anoosheh Heidarzadeh and Krishna R. Narayanan
July 6, 2020

Abstract: We propose two-stage adaptive pooling schemes, 2-STAP and 2-STAMP, for detecting COVID-19 using real-time reverse transcription quantitative polymerase chain reaction (RT-qPCR) test kits. Similar to the Tapestry scheme of Ghosh et al., the proposed schemes leverage soft information from the RT-qPCR process about the total viral load in the pool. This is in contrast to conventional group testing schemes, where the measurements are Boolean. The proposed schemes provide higher testing throughput than the popularly used Dorfman's scheme. They also provide higher testing throughput, sensitivity, and specificity than the state-of-the-art non-adaptive Tapestry scheme. The number of pipetting operations is lower than that of state-of-the-art non-adaptive pooling schemes, and higher than that of Dorfman's scheme. The proposed schemes can work with substantially smaller group sizes than non-adaptive schemes and are simple to describe. Monte Carlo simulations using the statistical model in the work of Ghosh et al. (Tapestry) show that 10 infected people in a population of size 961 can be identified with 70.86 tests on average, with a sensitivity of 99.50% and a specificity of 99.62%. This is 13.5x, 4.24x, and 1.3x the testing throughput of individual testing, Dorfman's testing, and the Tapestry scheme, respectively.

There is broad consensus among epidemiologists, economists, and policy makers that wide-scale testing of asymptomatic patients is the key to reopening the economy. While the benefits of testing are obvious, the shortage of testing kits and reagents and the ensuing low throughput of individual testing protocols have prevented the deployment of wide-scale testing. Group testing, or pooling, is an alternative way to substantially increase the testing throughput. (This material is based upon work supported by the National Science Foundation (NSF) under Grant CCF-2027997.)

The idea of group testing was introduced by Dorfman [1] during World War II for testing soldiers for syphilis without having to test each soldier individually. Dorfman's scheme consists of two stages (or rounds). In the first stage, the set of people to be tested is split into disjoint pools and a test is performed on each pool. If a pool tests negative, everyone in that pool is identified as non-infected. Otherwise, if a pool tests positive, we proceed to the second stage, where all people in the positive pool are tested individually and then identified as infected or non-infected accordingly. When the prevalence is small, Dorfman's scheme requires substantially fewer tests than individual testing. Dorfman-style testing has been implemented in the past in screening for many diseases, including HIV [2] and Chlamydia and Gonorrhea [3]. It has also been considered for screening for influenza [4]. For COVID-19, several experimental results have confirmed the feasibility of Dorfman-style pooling, and it has been implemented in Nebraska, Germany, India, and China [5], [6], [7], [8]. While Dorfman-style pooling is easy to implement, it is not optimal. Over the past 75 years, more sophisticated group testing schemes that provide higher testing throughput have been designed.
The literature on group testing is too vast to review in detail; an overview of the techniques can be found in [9] and [10]. Group testing is also related to compressed sensing, and insights from compressed sensing have been used to design group testing schemes. An important difference between group testing and compressed sensing is that in group testing the measurements are Boolean (each test result is either positive or negative), and they naturally correspond to non-linear functions of the unknown vector. The vast majority of the work using group testing with real-time reverse transcription quantitative polymerase chain reaction (RT-qPCR) has considered only Boolean measurements, even though the RT-qPCR process can produce more fine-grained information (soft information) about the total viral load in the pool. It is well known in information theory that such soft information can potentially be used to increase testing throughput substantially. However, group testing schemes that leverage soft information from the RT-qPCR process remain largely unexplored. Very recently, Ghosh et al. [11] developed a statistical model relating the soft information from the RT-qPCR to the total viral load in the pool. They designed a scheme called Tapestry, which uses non-adaptive group testing based on Kirkman triples, and they considered several decoding algorithms that use the soft information. They showed substantial gains in testing throughput over Dorfman's scheme, and to the best of our knowledge, Tapestry is the state-of-the-art non-adaptive group testing scheme that works with RT-qPCR, especially since it is the only work we are aware of that uses the soft information from the RT-qPCR measurement process.

Here, we propose two simple and effective two-stage adaptive pooling schemes that use the soft information from the RT-qPCR process and provide several advantages over Dorfman's scheme and the Tapestry scheme. We refer to these algorithms as the Two-Stage Adaptive Pooling (2-STAP) and the Two-Stage Adaptive Mixed Pooling (2-STAMP) schemes/algorithms. The proposed schemes provide substantially higher throughput than Dorfman-style testing. Compared to the Tapestry scheme in [11], 2-STAP and 2-STAMP have higher testing throughput, and under the statistical model developed in [11], for all tested cases, our algorithms have higher sensitivity and higher specificity. The proposed algorithms require fewer pipetting operations than Tapestry, but more pipetting operations than Dorfman's scheme. Finally, 2-STAP and 2-STAMP work with much smaller pool sizes and population sizes than the Tapestry algorithm and hence are easy to describe and implement in the lab. Monte Carlo simulations using the statistical model in the work of Ghosh et al. (Tapestry) show that 10 infected people in a population of size 961 can be identified with 70.86 tests on average, with a sensitivity of 99.50% and a specificity of 99.62%, using a pool size of 31. This is 13.5x, 4.24x, and 1.3x the testing throughput of individual testing, Dorfman's testing, and the Tapestry scheme, respectively. Unlike Tapestry, which is a non-adaptive scheme, 2-STAP and 2-STAMP require storage of the swab samples and their accessibility for the second round of testing, similar to Dorfman's scheme.

In this section, we explain the problem setup.
Throughout, we will consider the example of pooling-based testing for COVID-19 using the real-time reverse transcription quantitative polymerase chain reaction (RT-qPCR) technique, considered also in [11], as an application of sensing with binary matrices for support recovery of sparse signals.

Let R_{≥0} = {x ∈ R : x ≥ 0} and R_{>0} = R_{≥0} \ {0}. For any integer i > 0, we denote {1, . . . , i} by [i], and define [0] = ∅. Consider a population of n people, labeled 1, . . . , n, that are to be tested for COVID-19. The vector of viral loads of these people can be modeled by a signal x = [x_1, . . . , x_n]^T, x_j ∈ R_{≥0}, where the jth coordinate of x represents the viral load of the jth person. If the jth person is infected (i.e., COVID-19 positive), then x_j is a nonzero value; otherwise, if the jth person is not infected (i.e., COVID-19 negative), then x_j is zero. We assume that every coordinate in x is nonzero with probability p (or zero with probability 1 − p), independently from the other coordinates, and every nonzero coordinate takes a value from R_{>0} according to a fixed and known probability distribution p_x. Note that the sparsity parameter p may or may not be known. In this context, the sparsity parameter p is known as the prevalence. We denote the number of nonzero coordinates in x (i.e., the number of infected people) by n_+, and denote the number of zero coordinates in x (i.e., the number of non-infected people) by n_−. Note that n_+ + n_− = n. We denote by S(x) the support set of x, i.e., the index set of all nonzero coordinates in x. Note that |S(x)| = n_+.

The ith binary linear measurement y_i of x is defined as a linear combination of the coordinates x_j according to coefficients a_{i,j} that are elements of {0, 1}. That is, y_i = a_i · x = ∑_{j=1}^{n} a_{i,j} x_j, where a_i = [a_{i,1}, . . . , a_{i,n}], a_{i,j} ∈ {0, 1}. (Note that a binary linear measurement is different from a Boolean measurement. In the former, the coefficients are binary but the measurement value can be a real number, whereas in the latter both the coefficients and the measurement value are binary.) For example, a measurement y_i represents the sum of viral loads of a subset of people to be tested for COVID-19. Given a measurement y_i, any coordinate x_j such that a_{i,j} = 1 (or a_{i,j} = 0) is referred to as an active (or inactive) coordinate in the measurement y_i.

Suppose we sense the signal x by making the measurements y_1, y_2, . . . , and observe noisy versions of y_1, y_2, . . . , denoted by z_1, z_2, . . . . The ith measurement y_i and the noisy measurement z_i are given by

y_i = a_i · x,   z_i = y_i ε_i,   (1)

where the ε_i are independent realizations of a random variable ε, taking values from R_{>0} according to a fixed and known probability distribution p_ε. (The reason we consider a multiplicative noise model, instead of the commonly used additive noise model, will be discussed shortly.) Note that z_i = 0 if and only if y_i = 0 (i.e., all active coordinates in the ith measurement are zero coordinates), and z_i ≠ 0 if and only if y_i ≠ 0 (i.e., there exists at least one nonzero coordinate among the active coordinates in the ith measurement). A detailed explanation of the multiplicative noise model in (1) can be found in Appendix I. Our goal is to collect as few noisy measurements z_1, z_2, . . . as possible for any signal x such that the support set S(x) can be recovered from z_1, z_2, . . . , with a target level of accuracy as defined shortly.
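To make the signal and measurement model concrete, the following is a minimal simulation sketch (assuming numpy; the function names are ours, and the default noise parameter matches the log-normal noise used in the simulations reported later in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_signal(n, p, p_x=lambda size: rng.uniform(1.0, 1000.0, size)):
    """Sparse viral-load vector: each coordinate is nonzero with probability p,
    and nonzero values are drawn i.i.d. from the distribution p_x."""
    infected = rng.random(n) < p
    x = np.zeros(n)
    x[infected] = p_x(infected.sum())
    return x

def noisy_measurements(A, x, sigma_eps=0.1 * np.log(1.95)):
    """Binary-linear measurements y = A x corrupted by multiplicative noise:
    z_i = y_i * eps_i, with eps_i log-normal (mu = 0, sigma = sigma_eps).
    Since eps_i > 0, z_i = 0 exactly when y_i = 0."""
    y = A @ x
    eps = rng.lognormal(mean=0.0, sigma=sigma_eps, size=A.shape[0])
    return y * eps
```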
We refer to the process of generating the measurements as sensing, and refer to the process of estimating the support set from the noisy measurements as (signal-support) recovery. Given a sensing algorithm and a recovery algorithm, we denote the estimate of S(x) by Ŝ(x), which depends on the noisy measurements z_1, z_2, . . . and on the sensing and recovery algorithms. Any coordinate x_j such that j ∈ S(x) \ Ŝ(x) is referred to as a false negative, and any coordinate x_j such that j ∈ Ŝ(x) \ S(x) is referred to as a false positive. Similarly, any coordinate x_j such that j ∉ S(x) ∪ Ŝ(x) is referred to as a true negative, and any coordinate x_j such that j ∈ S(x) ∩ Ŝ(x) is referred to as a true positive. We denote by f_−(x) the number of false negatives, i.e., f_−(x) = |S(x) \ Ŝ(x)|. Similarly, we denote by f_+(x) the number of false positives, i.e., f_+(x) = |Ŝ(x) \ S(x)|. Note that f_−(x) and f_+(x) depend on the noisy measurements z_1, z_2, . . . and on the sensing and recovery algorithms.

Given a sensing algorithm and a recovery algorithm, the false negative rate r_− is defined as the expected value of the ratio of the number of false negatives to the number of nonzero coordinates, i.e., r_− = E[f_−(x)/n_+(x)], where the expectation is taken over all signals x. Similarly, the false positive rate is r_+ = E[f_+(x)/n_−(x)]. It should be noted that the ratios f_−(x)/n_+(x) and f_+(x)/n_−(x) are random variables, because they depend on x, which is itself random. Also, conditioned on x having k nonzero coordinates and n − k zero coordinates, we denote the conditional false negative and false positive rates by r_−(k) and r_+(k), respectively. The quantities 1 − r_− and 1 − r_+ are known as the (unconditional) sensitivity and specificity, respectively. Analogously, we refer to 1 − r_−(k) and 1 − r_+(k) as the conditional sensitivity and conditional specificity, respectively. For given thresholds 0 ≤ δ_−, δ_+ < 1, our goal is to design a sensing algorithm and a recovery algorithm such that, with the minimum number of measurements, the constraints r_− ≤ δ_− and r_+ ≤ δ_+ (or r_−(k) ≤ δ_− and r_+(k) ≤ δ_+) are satisfied. The thresholds δ_− and δ_+ specify the target level of accuracy for support recovery.

In a single-stage sensing scheme, also known as non-adaptive sensing, m measurements y_1, . . . , y_m are made in parallel, and m noisy measurements z_1, . . . , z_m are observed. The coefficient vectors of the measurements y_1, . . . , y_m can be represented by an m × n sensing matrix A with entries from {0, 1}. That is, A = [a_1^T, . . . , a_m^T]^T, where a_i = [a_{i,1}, . . . , a_{i,n}] represents the ith row of the sensing matrix A, i.e., the coefficients of the x_j in the ith measurement y_i. It should be noted that in a single-stage sensing scheme the sensing matrix A is designed in advance, prior to any initial sensing of the signal, and hence the name "non-adaptive sensing". We denote the vector of measurements by y = [y_1, . . . , y_m]^T and the vector of noisy measurements by z = [z_1, . . . , z_m]^T.

In a multi-stage (adaptive) sensing scheme with T stages, the measurements of the tth stage are specified by an m^(t) × n binary sensing matrix A^(t), and the corresponding measurement and noisy measurement vectors are denoted by y^(t) and z^(t). Then, y^(t) = A^(t) x and z^(t) = y^(t) • ε^(t), where • denotes the coordinate-wise product. For each t > 1, the measurements in the tth stage are all made in parallel, similar to the single-stage case, but the design of the sensing matrix A^(t) depends on the design of all sensing matrices A^(1), . . . , A^(t−1) and all noisy measurement vectors z^(1), . . . , z^(t−1). Given the sensing matrices A^(1), . . . , A^(T), a recovery algorithm seeks to compute an estimate Ŝ(x) of S(x) from z^(1), . . . , z^(T).
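As a small companion to the error metrics defined above, the following sketch (assuming numpy; the function name is ours) counts false negatives/positives of a support estimate against the true support and returns the per-realization sensitivity and specificity:

```python
import numpy as np

def support_errors(x, s_hat):
    """Given a signal x and an estimated support s_hat (an iterable of indices),
    count false negatives/positives and return the per-realization sensitivity
    1 - f_minus/n_plus and specificity 1 - f_plus/n_minus."""
    s_true = set(np.flatnonzero(x).tolist())
    s_hat = set(s_hat)
    n = len(x)
    f_minus = len(s_true - s_hat)   # infected coordinates that were missed
    f_plus = len(s_hat - s_true)    # non-infected coordinates flagged as infected
    n_plus, n_minus = len(s_true), n - len(s_true)
    sensitivity = 1.0 - f_minus / n_plus if n_plus > 0 else 1.0
    specificity = 1.0 - f_plus / n_minus if n_minus > 0 else 1.0
    return f_minus, f_plus, sensitivity, specificity
```

Averaging these per-realization quantities over many random signals approximates the (conditional) sensitivity and specificity defined above.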
The similarities and differences between this work and the literature on Compressed Sensing (CS) and Quantitative Group Testing (QGT) are summarized below:

• Binary versus non-binary signals: In QGT, the nonzero coordinates of the signal all have the same value, whereas in CS, as in the present work, the nonzero coordinates can take different values.
• Real versus binary sensing matrices: Many of the existing CS algorithms rely on sensing matrices with real entries, whereas the sensing matrices in QGT algorithms, as in this work, are binary.
• Unconstrained versus constrained sensing matrices: In CS and QGT, there is often no explicit constraint on the number of nonzero entries per row (referred to as the row weight) or the number of nonzero entries per column (referred to as the column weight) of the sensing matrices. In this work, however, the focus is on sensing matrices in which the row weights and the column weights are constrained.
• Additive versus multiplicative noise: Almost all existing work on CS and QGT considers either the noiseless setting or settings in which the noise is additive. The focus of this work is, however, on multiplicative noise.
• Single-stage versus multi-stage sensing: Most of the existing work on CS focuses on single-stage (non-adaptive) sensing algorithms. In QGT, however, as in this work, both single-stage and multi-stage sensing algorithms have been studied.
• Signal recovery versus signal-support recovery: In QGT, the signal recovery and the support recovery problems are equivalent, since the signal is binary-valued. In CS, however, the goal is often to estimate the signal itself, rather than only the support of the signal. In this work, the goal is to recover the support of the signal (similar to QGT), even though the signal is non-binary (similar to CS).

For the same setting as the one in this work, a scheme called Tapestry was recently proposed in [11]. In what follows, we briefly explain the sensing and recovery algorithms in this scheme. The Tapestry scheme employs a single-stage sensing algorithm. The sensing matrices used in this scheme are based on Kirkman triples, a special class of Steiner triples borrowed from combinatorial design theory. Let m be a multiple of 3, and let c be an integer such that c ≤ (m − 1)/2. A (balanced) Kirkman triple system (with parameters m and c) can be represented by a binary matrix with m rows and (m/3) · c columns such that:

• the support set of each column has size exactly 3;
• the intersection of the support sets of any two columns has size at most 1;
• for any integer i that is a multiple of m/3 (with 0 ≤ i < (m/3) · c), the sum of the columns indexed by i + 1, i + 2, . . . , i + m/3 is equal to the all-one vector.

In [11], several different recovery algorithms from the CS literature were considered. In the following, we explain one of these recovery algorithms, which will also be used as part of the proposed schemes in this paper, and refer the reader to Appendix II for a detailed description of the other recovery algorithms considered in [11]. This algorithm, known as COMP (Combinatorial Orthogonal Matching Pursuit), is simple: since the noise is multiplicative and strictly positive, any coordinate that is active in at least one zero measurement must itself be zero, and the set of all remaining coordinates is returned as a (superset) estimate of the support.
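A minimal sketch of this COMP step (assuming numpy; the function name is ours):

```python
import numpy as np

def comp_superset(A, z):
    """COMP: coordinates active in any zero (negative) measurement must be zero,
    because the multiplicative noise is strictly positive.  The remaining
    coordinates form a superset estimate of the true support."""
    zero_meas = (z == 0)                       # index set I of zero measurements
    ruled_out = A[zero_meas].sum(axis=0) > 0   # active in at least one zero measurement
    return np.flatnonzero(~ruled_out)          # superset estimate of the support
```

In the proposed schemes, the same step is applied per positive pool in the second stage, as described later.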
In this section, we propose a two-stage sensing algorithm and an associated recovery algorithm, which we collectively refer to as the 2-STAP scheme. The proposed scheme is inspired by the well-known two-stage Dorfman scheme, which was originally proposed in the context of group testing, but it requires substantially fewer measurements than Dorfman's scheme.

Translating Dorfman's scheme into the language of our work, in the first stage the signal coordinates are pooled into a number of disjoint groups of equal size, and one measurement is made for each pool in which all coordinates of the pool are active. In the second stage of Dorfman's scheme, one measurement is made for every coordinate in a positive pool (i.e., a pool whose measurement in the first stage is nonzero), and no additional measurements are made for any negative pool (i.e., a pool whose measurement in the first stage is zero). The first stage of the 2-STAP scheme is the same as that of Dorfman's scheme. The second stage, however, differs from that of Dorfman's scheme. In particular, unlike the second stage of Dorfman's scheme, in the second stage of the 2-STAP scheme a number of measurements are made for different (not necessarily singleton) subsets of coordinates in each positive pool. Similar to Dorfman's scheme, the second stage of the 2-STAP scheme makes no additional measurements for any negative pool.

In the following, we denote by 1_t or 0_t an all-one or an all-zero row vector of length t, respectively. Given a signal x, we partition the n signal coordinates x_1, . . . , x_n into q pools of size s = n/q. We denote by x_l the lth pool of coordinates, i.e., x_l = [x_{(l−1)s+1}, . . . , x_{ls}]^T. We denote by m_l^(2) the number of measurements made for the lth pool in the second stage, and write m^(1) and m^(2) for the total number of measurements in the first and second stage, respectively.

In the first stage, for each pool l ∈ [q], we make one measurement y_l^(1) in which all s coordinates of x_l are active, i.e., y_l^(1) = 1_s x_l. Thus, the total number of measurements in the first stage is m^(1) = q, and the sensing matrix in the first stage, A^(1), is a q × n matrix whose lth row is a_l^(1) = [0_{(l−1)s}, 1_s, 0_{(q−l)s}]. Suppose we observe the noisy measurements z_1^(1), . . . , z_q^(1), and let L = {l ∈ [q] : z_l^(1) = 0} be the index set of the negative pools. Assume, without loss of generality (w.l.o.g.), that L = [q] \ [t] for some 0 ≤ t ≤ q. That is, the first t pools are all positive, and the last q − t pools are all negative. For any l ∈ [t], we need to make additional measurements for the lth pool in the second stage, i.e., m_l^(2) > 0, because such a pool contains at least one nonzero coordinate. For any l ∈ [q] \ [t], the lth pool contains only zero coordinates, and we do not need to make any additional measurements for any such pool in the second stage, i.e., m_l^(2) = 0. Below, we focus on the positive pools, i.e., the pools 1, . . . , t.
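Before turning to the second-stage design for the positive pools, here is a minimal sketch of the first stage just described (assuming numpy and the noisy_measurements helper sketched earlier; the function name is ours):

```python
import numpy as np

def first_stage(x, q, measure):
    """First stage of 2-STAP/2-STAMP (identical to Dorfman's first stage):
    split the n coordinates into q disjoint pools of size s = n/q and make one
    noisy measurement per pool with all s coordinates of that pool active."""
    n = len(x)
    s = n // q
    A1 = np.zeros((q, n))
    for l in range(q):
        A1[l, l * s:(l + 1) * s] = 1.0      # lth row activates the lth pool
    z1 = measure(A1, x)                      # e.g., noisy_measurements from above
    positive_pools = np.flatnonzero(z1 > 0)  # pools that need a second stage
    return A1, z1, positive_pools
```

For example, first_stage(x, q=31, measure=noisy_measurements) reproduces the 961-person, 31-pool setup used in the simulations reported later.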
Intuitively, the larger m_l^(2) is, the smaller the (conditional) false negative/positive rate will be, but the larger the average number of measurements will be. Since it is not known how to optimize m_l^(2) theoretically, we resort to a heuristic approach for choosing m_l^(2). In particular, we present two variants of the 2-STAP scheme: 2-STAP-I and 2-STAP-II. In 2-STAP-I, for all positive pools, the number of measurements and the pooling scheme in the second stage are the same, regardless of the observed measurements for these pools in the first stage. That is, m_l^(2) and A_l^(2) are the same for all l ∈ [t]. In 2-STAP-II, on the other hand, for each positive pool, the number of measurements and the pooling scheme in the second stage are chosen based on the number of nonzero coordinates in x_l, denoted by k_l. Note that k_l may not be known a priori, but an estimate k̂_l of k_l can be computed using the observed measurement for the lth pool in the first stage, as follows. Let p(k) be the probability that x_l has k nonzero coordinates and s − k zero coordinates, and let p(z_l^(1) | k) be the probability density of z_l^(1) given that x_l has k nonzero coordinates and s − k zero coordinates.

If the sparsity parameter p is known, then for any k, p(k) and p(z_l^(1) | k) can be computed exactly or approximately (depending on the distribution of the values of the nonzero coordinates in the signal and the distribution of the noise). In this case, given a noisy measurement z_l^(1), a maximum-a-posteriori (MAP) estimate of k_l is given by k̂_l = argmax_k p(k) p(z_l^(1) | k). If p is not known, one can first compute a maximum-likelihood (ML) estimate p̂ = 1 − (1 − t/q)^{1/s} of p, and then use p̂, instead of p, to compute p(k) and p(z_l^(1) | k) for any k, and then compute a MAP estimate k̂_l of k_l. (Note that the average number of positive pools is q(1 − (1 − p)^s), and t is a realization of the number of positive pools. Setting these two quantities equal to each other and solving for p, we get the estimate p̂ of p as defined above.)

Given m_l^(2), the optimal design of A_l^(2) is not known. In this work, for each l ∈ [t], we randomly choose A_l^(2) from the set of m_l^(2) × s binary matrices (with distinct rows and distinct columns) with a pre-specified row/column weight profile. The weight profile should be chosen to obtain a good trade-off between the computational complexity (of the sensing and recovery algorithms) and the false negative/positive rates. The weight profiles of the matrices used in our simulations can be found in Appendix II.

The recovery algorithm in the second stage consists of three steps: the COMP algorithm, MAP decoding, and list generation. For each l ∈ [t], suppose the noisy measurement vector z_l^(2) is observed in the second stage, and let z_l = [z_l^(1), (z_l^(2))^T]^T be the overall noisy measurement vector corresponding to the lth pool, with A_l the corresponding (1 + m_l^(2)) × s sensing matrix whose first row is 1_s and whose remaining rows form A_l^(2). (Note that z_l is a vector of length m_l = 1 + m_l^(2), and x_l is a vector of length s.) We estimate the support set S_l of x_l as follows.

1) COMP Algorithm: First, we use the COMP algorithm to find a (superset) estimate Ŝ_l of S_l. In particular, the COMP algorithm recovers Ŝ_l from z_l given A_l. Let I_l = {i ∈ [m_l] : z_{l,i} = 0} be the index set of the zero measurements for the lth pool. We denote by x_l^* the sub-vector of x_l restricted to the coordinates indexed by Ŝ_l; denote by A_l^* the sub-matrix of A_l restricted to the rows indexed by [m_l] \ I_l and the columns indexed by Ŝ_l; and denote by z_l^* the sub-vector of z_l restricted to the coordinates indexed by [m_l] \ I_l. In the next step, we estimate the support set S_l^* of x_l^* from z_l^*, given A_l^*.

2) MAP Decoding: Given the estimate k̂_l of k_l (the number of nonzero coordinates in x_l^*), let k_min = max{k̂_l − 1, 1} and k_max = min{k̂_l + 1, s_l^*}, where s_l^* = |Ŝ_l|. For any k_min ≤ k ≤ k_max, and for any k-subset T of Ŝ_l, we compute a score f(T) by finding the x_l^* with support set T for which the conditional probability density of x_l^* given z_l^* is maximum; f(T) is the value of this maximum. Thus, if the sparsity parameter p is known, f(T) (for any T) can be approximated by numerically solving a (potentially non-linear and/or non-convex) optimization problem (depending on the distributions p_x and p_ε). (In our simulations, the "fmincon" function in MATLAB was used to compute an approximation of f(T).) If p is not known, f(T) can be approximated similarly as above, except using the ML estimate p̂ everywhere instead of p.

3) List Generation: Let f^* = max_T f(T), where the maximization is over all T defined as above. We find all T, say T_1, T_2, . . . , T_ℓ, such that f(T) ≥ α · f^* for a given 0 < α ≤ 1, and use T_1 ∪ T_2 ∪ . . . ∪ T_ℓ as the estimate of the support set S_l^* of x_l^*. The threshold α controls the trade-off between the (conditional) false negative and false positive rates: decreasing α enlarges the returned estimate, which reduces false negatives at the cost of more false positives.
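Returning to the estimate k̂_l used both for choosing m_l^(2) and A_l^(2) in 2-STAP-II and for the MAP decoding step, one simple way to approximate the likelihood p(z_l^(1) | k) when no closed form is available is Monte Carlo integration over the pool's total viral load. The sketch below is ours and uses the viral-load and noise distributions of the simulation section (uniform loads on [1, 1000] and log-normal noise); it relies on the fact that z = S · ε, with S the sum of the k nonzero loads, so the density of z given S is p_ε(z/S)/S:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)

def lognormal_pdf(e, sigma):
    """Density of the multiplicative noise eps (log-normal, mu = 0)."""
    return np.exp(-(np.log(e)) ** 2 / (2 * sigma ** 2)) / (e * sigma * np.sqrt(2 * np.pi))

def estimate_k(z1, s, p, sigma_eps=0.1 * np.log(1.95), n_mc=20000):
    """MAP estimate of the number of nonzero coordinates in a positive pool of
    size s, given its first-stage noisy measurement z1.  The likelihood
    p(z1 | k) is approximated by Monte Carlo over S = sum of k uniform loads."""
    best_k, best_score = 1, -np.inf
    for k in range(1, s + 1):
        prior = comb(s, k) * p ** k * (1 - p) ** (s - k)      # p(k)
        S = rng.uniform(1.0, 1000.0, size=(n_mc, k)).sum(axis=1)
        lik = np.mean(lognormal_pdf(z1 / S, sigma_eps) / S)   # ~ p(z1 | k)
        score = prior * lik
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

When p is unknown, the ML estimate p̂ = 1 − (1 − t/q)^{1/s} described above can be passed in place of p.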
The proposed scheme in this section, which we refer to as the 2-STAMP scheme, is a generalization of the 2-STAP scheme. The first stage of the 2-STAMP scheme is the same as that of the 2-STAP scheme, but in the second stage we make measurements on mixtures of positive pools, instead of making measurements on separate pools only. For ease of exposition, we explain a special case of the 2-STAMP scheme in which up to two pools can be mixed together. The same idea can easily be extended to mixing a larger number of pools.

The main idea behind mixing pools in the second stage is as follows. Consider two positive pools that we expect to contain a relatively small number of nonzero coordinates. By mixing these pools together and sensing the mixed pool as a whole (instead of sensing the pools individually), we can save a few measurements while keeping the implementation/computational complexity of both the sensing and recovery algorithms affordable. The remaining pools, which are expected to contain a relatively large number of nonzero coordinates, are sensed individually, so as to prevent the sensing and/or recovery algorithms from becoming too complex implementation-wise or computationally.

At the end of the first stage, suppose the noisy measurements z_1^(1), . . . , z_q^(1) are observed and, w.l.o.g., the pools 1, . . . , t are positive. Based on how many nonzero coordinates each positive pool is expected to contain (e.g., using the estimates k̂_l), the positive pools are partitioned into parts of size one (pools to be sensed individually) and size two (pools to be mixed together); let r be the number of parts in the partition, of which the first τ parts are singletons. For ease of exposition, we assume that the last part is {t − 1, t}. We define r mixed pools x̃_1, . . . , x̃_r, where x̃_l is the single pool in the lth part for l ∈ [τ], and the concatenation of the two pools in the lth part for l ∈ [r] \ [τ]. (Note that the lth mixed pool has size s or 2s for l ∈ [τ] or l ∈ [r] \ [τ], respectively.) Also, we define the r noisy measurement vectors (corresponding to the first stage) for the mixed pools by z̃_1^(1), . . . , z̃_r^(1), where z̃_l^(1) collects the first-stage measurements of the pools in the lth part. The second-stage sensing and recovery then proceed as in the 2-STAP scheme, applied to each mixed pool.

The key differences between the proposed schemes and the Tapestry scheme are listed below:

• Tapestry is a single-stage scheme, and hence all measurements can be made in parallel. The proposed schemes are, however, two-stage schemes; and notwithstanding that all measurements within each stage can be made in parallel, the measurements in the second stage can only be made after those in the first stage.
• When compared to Tapestry, in the tested cases the proposed schemes achieve a better trade-off between the average number of measurements and the (conditional) average false negative/positive rates. This comes from two facts: (i) the sensing algorithm of the proposed schemes is more flexible than that of the Tapestry scheme, because the total number of measurements in the former can vary across signal realizations, whereas the latter is oblivious to the signal realization and always uses the same number of measurements; and (ii) the measurements in the proposed schemes are localized to small pools. This makes it possible to implement recovery algorithms that are carefully designed for the multiplicative noise model with reasonable computational complexity. In contrast, the recovery algorithms discussed in the Tapestry scheme were borrowed from the CS literature, where the noise model is assumed to be additive. Unlike the recovery algorithms used for Tapestry, the recovery algorithm proposed in this work also takes into account the signal and noise distributions and the sparsity parameter or its estimate.
• The sensing algorithm of the proposed schemes can potentially have a substantially lower computational and implementation complexity than that of the Tapestry scheme.
This follows from two facts: (i) the total number of nonzero entries in the overall sensing matrix of the proposed schemes can be much smaller than that in the Tapestry scheme; and (ii) the nonzero entries in each row of the sensing matrix in the Tapestry scheme are spread out everywhere, whereas the nonzero entries in each row of the sensing matrix of the proposed schemes are localized. In particular, in the first stage each measurement is localized to a pool of consecutive coordinates, and the measurements in the second stage for each pool are over the coordinates in that pool only.

In this section, we present our simulation results. As a case study, in these simulations we have considered a population of n = 961 people to be tested for COVID-19 and assumed that the prevalence is p = 0.01. (In the simulations, we considered both cases where p is known or unknown a priori, and we did not observe any significant difference in the performance of either of the proposed schemes.) The viral load and noise distributions were taken to be the same as those in [11]. In particular, in each simulation trial, a signal of length 961 was generated (independently from the other trials) as follows: k coordinates were randomly chosen to take a nonzero value from the interval [1, 1000] according to a (continuous) uniform distribution, and the remaining n − k coordinates were all set to zero. In the simulations, the measurement noise ε was also assumed to have a log-normal distribution with parameters µ_ε = 0 and σ_ε = 0.1 ln(1.95).

Table I summarizes the results for the proposed schemes along with the results for the Tapestry scheme for the same problem model (i.e., the same population size and the same viral load and noise distributions), where each measurement is made on a pool of 31 people (see [11]). In this table, m_min, m_max, m_std, and m_ave represent the minimum, maximum, standard deviation, and average of the number of measurements used over 100 simulation runs, respectively. The sensitivity and specificity results are rounded to two decimal places, for fair comparison with the results reported in [11]. More detailed results for the 2-STAP-I, 2-STAP-II, and 2-STAMP schemes are presented in Tables II, III, and IV, respectively. In these tables, (i) α is the threshold used in the list generation step of the recovery algorithm (see Section VI-D3), which controls the trade-off between sensitivity and specificity; and (ii) the conditional negative (or positive) predictive value is the average of the ratio of the number of true negatives (or positives) to the total number of true and false negatives (or positives). Note that the negative predictive value represents the probability that a person is truly non-infected given that the recovery algorithm identifies them as non-infected, and the positive predictive value represents the probability that a person is truly infected given that the recovery algorithm identifies them as infected.

Comparing the results of Tapestry and the two variants of 2-STAP in Table I, it can be seen that for k ∈ {5, 10}, 2-STAP-I requires a smaller number of measurements on average for the same (or even higher) sensitivity and specificity (see also Table II). For k = 15, 2-STAP-I uses about 10 more measurements than Tapestry on average, but it achieves a substantially higher specificity, by about 2%, for almost the same sensitivity (see also Table II). It can also be seen that for all k ∈ {5, 10, 15}, 2-STAP-II compares favorably with Tapestry; detailed results are reported in Table III.
However, using Tapestry, for k = 10, one can achieve a sensitivity and a specificity between 98.50% and 99.49% with 93 measurements (about 20% more measurements than 2-STAP-II). These improvements in performance are mainly due to the fact that 2-STAP is an adaptive scheme (although with a very small degree of adaptivity, i.e., using only one round of feedback), whereas Tapestry is a non-adaptive scheme. In particular, identifying all negative pools (which contain a relatively large fraction of the population for sufficiently low prevalence) at the end of the first stage, and using a relatively small number of additional measurements only for each positive pool in the second stage, enables us to achieve a better trade-off between the average number of measurements, sensitivity, and specificity. Further comparisons among the schemes can be seen in Table I.

APPENDIX I

To keep the discussion simple, suppose that an individual person is to be tested for COVID-19 using this technology. The sample collected from the person to be tested is dispersed into a liquid medium, and the reverse transcription (RT) process is applied to convert the RNA molecules of the SARS-CoV-2 virus (the coronavirus that causes COVID-19) in the liquid (if the person is infected) into cDNA. After primers that are complementary to the cDNA of the viral genome are added, these primers attach themselves to the cDNA of the viral genome, and together they undergo an exponential amplification process in the RT-qPCR machine [11]. This process consists of a maximum of C_max cycles. The output of the RT-qPCR process is the cycle count C at which the concentration of DNA first exceeds a pre-specified threshold D_min, or C_max if the threshold is never exceeded. The thresholds D_min and C_max are often chosen so that: (i) if a person is not infected, the DNA concentration does not exceed D_min over the course of C_max cycles, and (ii) if a person is infected, the DNA concentration exceeds D_min at some point, say cycle C, over the course of C_max cycles. Note that for a fixed D_min, the larger the viral load of an infected person, the smaller C will be.

Ideally, the concentration of DNA molecules doubles in every cycle, i.e., after C cycles the concentration of DNA molecules is x·2^C, where x is the viral load of the person to be tested. In reality, however, the amplification process may not be ideal. To reflect the randomness in the process, we use the same model as the one suggested in [11] and assume that the concentration of DNA molecules after C cycles is given by x·b^{C+∆} for some positive constant b (close to 2), where x is the viral load of the person to be tested, and ∆ is a Gaussian random variable with mean zero and variance σ_∆^2. The multiplicative term ε = b^∆ can be viewed as the noise in the amplification process. The random variable ε has a log-normal distribution with parameters µ_ε = 0 and σ_ε = σ_∆ ln b. Note that the closer b is to 2 and σ_∆ is to 0, the weaker the noise will be and the closer the process will be to ideal. For any x > 0, let C_x be the number of cycles it takes for the concentration of DNA molecules to become approximately equal to D_min, i.e., x·b^{C_x+∆} ≈ D_min. The cycle count measurement C_x can be converted to an equivalent measurement z given by z = D_min · b^{−C_x}. It can then be seen that z = x·ε, which gives us the measurement model with multiplicative noise in (1).
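A small sketch of this conversion (our own illustrative values for D_min and C_max; b and σ_∆ are chosen to match the simulation noise σ_ε = 0.1 ln(1.95); the function names are ours) shows that the transformed output z = D_min · b^{−C_x} behaves exactly like the viral load scaled by log-normal noise:

```python
import numpy as np

rng = np.random.default_rng(2)

def rtqpcr_cycle_count(x, b=1.95, sigma_delta=0.1, d_min=1e9, c_max=40):
    """Simulated RT-qPCR output: the (real-valued) cycle count at which the
    DNA concentration x * b**(C + Delta) first reaches D_min, capped at C_max."""
    delta = rng.normal(0.0, sigma_delta)
    c_x = np.log(d_min / x) / np.log(b) - delta   # solves x * b**(C + Delta) = D_min
    return min(c_x, c_max)

def cycle_count_to_measurement(c_x, b=1.95, d_min=1e9):
    """Convert the cycle count to the soft measurement z = D_min * b**(-C_x),
    which equals x * b**Delta, i.e., the viral load times log-normal noise."""
    return d_min * b ** (-c_x)
```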
APPENDIX II: OTHER RECOVERY ALGORITHMS IN [11]

Recall the index set I = {i ∈ [m] : z_i = 0} of zero measurements and the superset estimate Ŝ(x) of S(x) produced by the COMP algorithm. To simplify the notation, denote S(x) and Ŝ(x) by S and Ŝ, respectively. Also, denote by x^* the sub-vector of x restricted to the coordinates indexed by Ŝ; denote by A^* the sub-matrix of A restricted to the rows indexed by [m] \ I and the columns indexed by Ŝ; and denote by z^* the sub-vector of z restricted to the coordinates indexed by [m] \ I. Let m^* = |[m] \ I| = m − |I| and n^* = |Ŝ|. Note that x^* is a vector of length n^*, A^* is an m^* × n^* matrix, and z^* is a vector of length m^*.

The rest of the recovery algorithms discussed in [11] consist of two phases. In the first phase, all of these algorithms use COMP for an initial signal-support recovery and reduce the instance (x, A, z) to the instance (x^*, A^*, z^*); in the second phase, each of these algorithms employs a different technique for signal recovery. The second phase of these algorithms is briefly described below. To simplify the notation, in the following we omit the superscript "*", and denote x^*, A^*, z^*, m^*, n^* by x, A, z, m, n, respectively.

Non-Negative LASSO (NN-LASSO): NN-LASSO finds an estimate x̂ of x by seeking to minimize the L2-norm of the residual (additive) error vector z − Ax̂, subject to (i) a constraint on the L1-norm of x̂, i.e., ‖x̂‖_1 ≤ λ, and (ii) a non-negativity constraint on all coordinates in x̂. The choice of the threshold λ depends on the sensing matrix A, the noisy measurement vector z, and the distribution of the noise ε (for details, see [11]).

Non-Negative Orthogonal Matching Pursuit (NN-OMP): NN-OMP is a greedy algorithm for non-negative sparse signal recovery that finds an estimate x̂ of x by seeking to minimize the L0-norm of x̂, i.e., ‖x̂‖_0, subject to (i) a constraint on the L2-norm of the residual (additive) error vector z − Ax̂, i.e., ‖z − Ax̂‖_2 ≤ ε_0, and (ii) a non-negativity constraint on all coordinates in x̂. After initializing r^(0) to z and S^(0) to the empty set, in each iteration l the algorithm first updates the support set, S^(l) = S^(l−1) ∪ {j^(l)}, where the j^(l)th column of A has the maximum correlation with the residual error vector r^(l−1); that is, j^(l) = argmax_{j∈[n]} A_j^T r^(l−1), where A_j denotes the jth column of A. Next, the algorithm updates the residual error vector r^(l) = z − Ax̂, where x̂ minimizes ‖z − Ax̂‖_2 subject to x̂_j ≥ 0 for all j ∈ S^(l) and x̂_j = 0 for all j ∈ [n] \ S^(l). Once the stopping criterion ‖z − Ax̂‖_2 ≤ ε_0 is satisfied, the algorithm terminates and returns x̂ as an estimate of x. The choice of the threshold ε_0 depends on A, z, and the distribution of ε (see [11]).
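A compact sketch of this NN-OMP loop (assuming numpy and scipy; the non-negative least-squares subproblem is solved with scipy.optimize.nnls, and the function name and iteration cap are ours):

```python
import numpy as np
from scipy.optimize import nnls

def nn_omp(A, z, eps0, max_iter=None):
    """Greedy NN-OMP: grow the support one column at a time (most correlated
    with the residual), refit by non-negative least squares on the selected
    columns, and stop once the residual norm drops below eps0."""
    m, n = A.shape
    max_iter = max_iter or n
    support, r = [], z.copy()
    x_hat = np.zeros(n)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= eps0:
            break
        j = int(np.argmax(A.T @ r))           # column most correlated with residual
        if j not in support:
            support.append(j)
        coeffs, _ = nnls(A[:, support], z)    # non-negative refit on the support
        x_hat = np.zeros(n)
        x_hat[support] = coeffs
        r = z - A @ x_hat
    return x_hat
```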
Sparse Bayesian Learning (SBL): In SBL, for an estimate x̂ of x, the jth coordinate of x̂ is assumed to be an independent Gaussian random variable with mean zero and variance σ_j^2, and the coordinates of the residual (additive) error vector z − Ax̂ are assumed to be independent Gaussian random variables with mean zero and variance σ^2. Under these assumptions, the likelihood of x̂ is given by p(x̂; {σ_j}) = ∏_{j=1}^{n} (2πσ_j^2)^{−1/2} exp(−x̂_j^2/(2σ_j^2)), and the conditional likelihood of z given x̂ is given by p(z | x̂; σ) = (2πσ^2)^{−m/2} exp(−(1/2)σ^{−2}‖z − Ax̂‖_2^2). Let I be the m × m identity matrix, and let Σ_z = σ^2 I + A diag({σ_j^2}) A^T, where diag({v_j}) is a square diagonal matrix whose (j, j)th entry is v_j. Marginalizing the joint distribution p(z, x̂; {σ_j}, σ) = p(x̂; {σ_j}) p(z | x̂; σ), the likelihood of z is given by p(z; {σ_j}, σ) = (2π)^{−m/2} |Σ_z|^{−1/2} exp(−(1/2) z^T Σ_z^{−1} z). SBL uses the Expectation-Maximization (EM) algorithm, an iterative technique, to find {σ_j} and σ that maximize p(z; {σ_j}, σ). After initializing {σ_j} to {σ_j^(0)} and σ to σ^(0), in each iteration l the EM algorithm updates σ_j^(l) and σ^(l) based on the posterior statistics Σ_x and x̂, where (Σ_x)_{j,j} is the (j, j)th entry of the matrix Σ_x = ((σ^(l−1))^{−2} A^T A + diag({(σ_j^(l−1))^{−2}}))^{−1} and x̂ = (σ^(l−1))^{−2} Σ_x A^T z.

The iterations continue until x̂ converges. (The EM algorithm is guaranteed to converge to a fixed point, which may or may not be a local optimum.) Upon convergence, the algorithm terminates and returns x̂. Since x̂ may have some negative coordinates, as suggested in [11] one can use [max{x̂_1, 0}, . . . , max{x̂_n, 0}]^T as an estimate of x = [x_1, . . . , x_n]^T. A smarter yet more elaborate approach, which imposes the non-negativity constraint as part of the optimization problem, is to use algorithms such as Rectified SBL [12], which assume that the coordinates of x̂ follow a different distribution than the Gaussian distribution used in SBL.

[Appendix tables: the explicit second-stage binary sensing matrices A_l^(2), with the pre-specified row/column weight profiles used for 2-STAP-I, 2-STAP-II, and 2-STAMP in the simulations; the 0/1 entries are not reproduced here.]

REFERENCES

[1] R. Dorfman, "The detection of defective members of large populations."
[2] "Pooling of sera for human immunodeficiency virus (HIV) testing: an economical method for use in developing countries."
[3] "Cost savings and increased efficiency using a stratified specimen pooling strategy for Chlamydia trachomatis and Neisseria gonorrhoeae."
[4] "Pooling nasopharyngeal/throat swab specimens to increase testing capacity for influenza viruses by PCR."
[5] "Tests in short supply? Try group testing."
[6] "Efficient high throughput SARS-CoV-2 testing to detect asymptomatic carriers."
[7] "Evaluation of COVID-19 RT-qPCR test in multi-sample pools."
[8] "Sample pooling as a strategy to detect community transmission of SARS-CoV-2."
[9] "Combinatorial Group Testing and Its Applications."
[10] "Group Testing: An Information Theory Perspective," now Publishers.
[11] Ghosh et al., "Tapestry: A single-round smart pooling technique for COVID-19 testing."
[12] "Rectified Gaussian scale mixtures and the sparse non-negative least squares problem."