key: cord-0620397-so7gr7yq
authors: Wang, Hsin-Po; Gabrys, Ryan; Vardy, Alexander
title: Tropical Group Testing
date: 2022-01-12
journal: nan
DOI: nan
sha: d5c3c8f621e1097453b9d289bafc37e6a93f5214
doc_id: 620397
cord_uid: so7gr7yq

Polymerase chain reaction (PCR) testing is the gold standard for diagnosing COVID-19. PCR amplifies the virus DNA 40 times to produce measurements of viral loads that span seven orders of magnitude. Unfortunately, the outputs of these tests are imprecise and therefore quantitative group testing methods, which rely on precise measurements, are not applicable. Motivated by the ever-increasing demand to identify individuals infected with SARS-CoV-19, we propose a new model that leverages tropical arithmetic to characterize the PCR testing process. Our proposed framework, termed tropical group testing, overcomes existing limitations of quantitative group testing by allowing for imprecise test measurements. In many cases, some of which are highlighted in this work, tropical group testing is provably more powerful than traditional binary group testing in that it require fewer tests than classical approaches, while additionally providing a mechanism to identify the viral load of each infected individual. It is also empirically stronger than related works that have attempted to combine PCR, quantitative group testing, and compressed sensing.

C OVID-19 PANDEMIC has highlighted the critical role that widely-accessible testing can have in controlling the spread of infectious diseases. Efficient testing schemes have the potential to simultaneously reduce the time to diagnosis while improving both the reliability and accuracy of the testing procedure. This subject has attracted significant attention in the open literature [AE22] ; however, existing works do not accurately model the semiquantitative information available at the output of the polymerase chain reaction (PCR) testing methods used to detect the presence of SARS-CoV-19.

PCR tests output cycle threshold (Ct) values, which, as a result of the testing mechanism itself, are typically represented as semiquantitative measurements in the log domain [Fun+20; Arn+20] ). In this instance, the term quantitative refers to the fact that the tests' readings are non-binary and semi means that the readings are noisy or potentially inaccurate. Previous semiquantitative approaches are ill-suited for modeling the output of a PCR test as previous works mostly rely on the assumption that test measurements are reported on a linear (rather than a log) scale [Per20; DLO20; Aus20; Gho+21; She+20; YMX20; Yi+20; HN21; Coh+21; Pet+21; PBJ20; Lin+21; Zab+21] As an illustration of the potential problem of modeling the PCR test outputs on a linear scale, consider the Ct value of a test as the dB value of a sound wave or the pH value of a liquid. When adding a 50 dB white noise with a 30 dB one, we get a 50.04 dB white noise that is indistinguishable from 50 dB. Due to the wide range in viral load between infected individuals, the same phenomenon for Ct values has

The authors are with University of California San Diego, CA, USA. This work was supported by NSF grants CCF-2107346 and CCF-1764104. Emails: {hsw001, rgabrys, avardy} @eng.ucsd.edu Table I FOUR GROUP In order to address the masking issue and also to take advantage of the quantitative outputs available from PCR, we propose introducing delays during the DNA amplification process in order to encode extra information. The basic idea will be to generate tests where each of the samples within a test can be inserted at different times. As a simple example of how this would work, suppose we design a test that consists of a single sample from an infected individual that has a Ct value of X. Then, if we delay inserting the sample by ∆ cycles, the output of the resulting test would be X + ∆.

We use tropical multiplication x ⊙ y := x + y and tropical addition x ⊕ y := min(x, y) to model the behavior of the Ct values that are provided as output of each of the PCR tests. See Table I for a comparison of this versus other models. According to our model, the extra information can be retrieved by matching the pattern of the Ct values against the pattern of the delays. See Figure 1 for an illustration of this process. Delaying and matching, hand in hand, leave the masking issue nowhere to stand.

We propose a new model based upon two simple techniques -delaying and matching-for handling quantitative measurements that behave like dB values, pH values, or Ct values. With cleverly-crafted delays, we can attack a handful of scenarios as listed below.

• When there is a single (D = 1) infected person in a population of size N , we can identify her in just T = 2 nonadaptive tests, where N can be arbitrarily large (Theorem 8). When delays are limited to ℓ cycles, we show more generally N ≈ T ℓ T −1 (Theorem 10).

• For the case of D = 2 infected persons within a population of size N , we give two constructions. Tubes to the right: Specimens will be remixed in two tubes, A and B, at the same time the PCR machine is amplifying DNA. The schedule of mixing and amplifying is shown by the arrows. We double the colored area to represent the fact that the virus DNA is replicated (in reality the liquid would not expand in volume). Assume one of X, Y, and Z is infected. If the test results show that the DNA is twice as concentrated in tube A than in tube B, person X is infected. If tubes A and B contain roughly the same concentration of DNA, person Y is infected. Otherwise, person Z is infected.

-The first construction uses T ≈ 2 √ N nonadaptive tests (Theorem 13). In this construction, every person is present in only two tests.

-The second construction uses T ≈ log 2 (N ) + log 2 (log 2 N ) nonadaptive tests only (Theorem 17). When the delays are constrained, we can limit it to ℓ ≈ log 2 (N )/ log 2 (log 2 N ) cycles, and achieve T ≈ 1.01 log 2 N (Theorem 22). This construction outperforms the information-theoretical bound of binary group testing.

• For general D, we give one necessary and two sufficient conditions of the existence for group testing schemes (Section VI).

• When adaptive testing is allowed, T = 4 tests are sufficient to find D = 2 infected persons among arbitrarily many persons (Theorem 35).

• In general, T = 3D + 1 adaptive tests are sufficient to locate D infected persons among arbitrarily many persons (Theorem 40). For this approach, one does not need to know D beforehand. When delays are limited to ℓ cycles, we show that T ≈ 4D log ℓ N tests suffice (Theorem 41).

• We present simulations that confirm that in many cases matching and delaying outperform related works (Appendix D).

-Without delaying, matching alone has a comparable sensitivity-specificity tradeoff as existing works. -The tradeoff improves as the range of the Ct value increases, i.e., matching works better when the masking issue is supposedly more destructive. -Adding random delay improves the tradeoff even further; the greater the variance of the delays, the better the performance.

All construction proposed in this paper report the identities of the infected people as well as estimate the extent of infection; we are not sacrificing quantitativity in favor of sensitivity and specificity. The reported Ct values can help further diagnoses [Puj+20; Wal+20].

Section II reviews PCR and group testing and states the assumptions and goals of tropical group testing. Section III develops optimal strategies to identify a single (D = 1) infected person. Section IV considers the case of D = 2 infected persons with minimum pipetting efforts. Section V presents a testing scheme for D = 2 infected persons with nearly optimal number of tests. Section VI presents necessary and sufficient conditions on non-adaptive testing strategies that hold for general D. Section VII and Section VIII present our proposed adaptive strategies for D = 2 and general D, respectively. For a detailed breakdown of the organization of the paper and the problems tackled in the respective sections, please see Table II .

In this section, we review the working principles of the PCR process. After that, we review the taxonomy of group testing and provide a brief history of quantitative group testing. Next, we introduce the notion of delays and how this notion is formalized using the tropical semiring. Finally, we conclude this section with our problem statement.

A. The PCR process PCR stands for polymerase chain reaction. It is performed by a machine that holds a collection of tubes, each tube tube warmed to T containing some specimens. A PCR machine first heats the tubes up to a temperature T h • C; the heat decouples every double-helical DNA molecule into two single-stranded DNA templates. The machine then cools the tubes down to another temperature T c • C; at this temperature, the primers (short, single-stranded DNA segments) will attach to the DNA templates, labeling the starting point of DNA replication. Following that, the machine warms the tubes up to an intermediate temperature T i • C, which is the working temperature for the polymerases, the enzymes that will be completing the single-stranded DNA templates to form double-stranded DNA molecules. A cycle means that the tubes undergo T h • C, T c • C, and T i • C once, which implies that the concentration of DNA is doubled. Repeating this process for r cycles increases the concentration of DNA by 2 r -fold. See Figure 2 for an illustration.

In a quantitative PCR 1 test, some fluorescent dyes are added into the tubes; these dye molecules emit light when attaching to DNA. As the process of DNA-amplification continues, more and more of these fluorescent molecules will attach themselves to the newly-created DNA content. Eventually, the tubes will emit sufficient light that triggers a sensor. When this happens, the current cycle count is reported as the cycle threshold (Ct) 1 It is also called real-time PCR but still abbreviated as qPCR, not RT-PCR. RT-PCR refers to another technique called reverse transcription PCR. The combination of the two is abbreviated as RT-qPCR, which is the kind that is used in this pandemic. value of the specimen. Accordingly, Ct value = ⌊− log 2 (viral load) + constant⌋.

If all tubes turn out positive or some tubes take too long (40 cycles in practice) to emit a sufficient amount of light, the machine is turned off. For those tubes that did not trigger the sensor, negative results are reported.

For 

• Known v. unknown number of defectives. The classification of the various scenarios studied throughout this paper is listed in Table II .

We next discuss group testing models that involve nonbinary test outcomes, which involves quantitative group testing as well as our tropical group testing model.

The earliest form of non-binary group testing dates back to the 1940s, when balance scales were evaluated as a tool to single out counterfeit coins [Hwa87; GN95; Kho13] . A variant of this coin-weighing problem considers the usage of pointer scales to count how many counterfeit coins are in the queried pool. This problem later became known as quantitative group testing and many of the techniques and results have found application in, say, additive multiple access channels [Geb+19; FL20]. On a parallel track, threshold group testing considers the setup where each test outputs a "yes" when the the number of counterfeit coins exceeds a certain threshold θ, and it outputs a "no" otherwise [Dam06; D'y+13; Che13] .

A unification of quantitative and threshold group testing, which has received recent attention in [EM14b; EM14a; EM16; CGM21; Gab+21] due to its applications to genomic sequence processing, is known as semiquantitative group testing. In semiquantitative group testing, the positive reals are partitioned into intervals

and each test outputs a number that indicates in which interval the true value lies in. Previous works considered the setup where the endpoints {θ j } j represent an arithmetic progression. As will be discussed later in the next subsection, motivated by the PCR testing method, we consider a semiquantitative setup where the endpoints {θ j } j are a geometric progression. See Table I for a summary of existing semiquantitative group testing regimes.

In this work, we consider a special group testing scenario whereby one adds specimens into each tube at different cycles. Consider the following example procedure that is also illustrated in Figure 1 :

• First, we place empty tubes A and B into the PCR machine.

• Next, we put specimens X and Z into tubes A and B, respectively.

• Run the machine for 1 cycle.

• Put specimen Y into tubes A and B.

• Run the machine for 1 cycle.

• Put specimens Z and X into tubes A and B, respectively. • And finally, run the machine for another 38 cycles or until the detection of light signal, whichever is earlier. Unlike traditional group testing-which is only concerned with how to distribute specimens into tubes-this paper is also concerned with the question of when specimens should be placed into their respective tubes.

In Figure 1 , since specimen Y is put into the tubes after one cycle has elapsed, we say that specimen Y is delayed by 1 cycle. More generally, if a specimen is inserted into a tube after δ cycles have elapsed, we say that the specimen is delayed by δ cycles. If a specimen is never put into a tube, we say that the corresponding delay is infinity.

We now recall a mathematical framework that will be used to describe the behavior of Ct values under pooling and delaying. The real numbers with (positive) infinity, R ∪ {∞}, with operators ⊕ and ⊙, as defined by

is called the tropical semiring or the min-plus algebra 2 . The tropical semiring captures the behavior of addition and multiplication through the lens of log q , for some large q, and finds applications in algebraic geometry and combinatorics [MS15; Jos22] . As an example, the core of the Floyd-Warshall algorithm (ibid. or [Cor+22] ) is tropical matrix multiplication, A ⊙ B, whose (i, k)th entry is defined to be

As will be justified by simulations in Appendix D, we use tropical arithmetic to model the Ct values of pooled and delayed specimens.

Assumption 1 (Tropical model). The mixture of specimens with Ct values x 1 , x 2 , . . . , x N , each delayed by δ 1 , δ 2 , . . . , δ N cycles, respectively, has Ct value

When the context is clear, we write δ ⊙ x to denote a tropical vector-vector multiplication. When there are multiple tests, we write S ⊙ x to denote a tropical matrix-vector multiplication. We call S a schedule matrix.

With this assumption we can claim, formally, the goal of this paper.

The goal of this paper is to construct tropical group testing schemes that allow us to diagnose a population efficiently. The following two definitions state our goal precisely.

Definition 2 (Nonadaptive testing). A (T, N, D)-tropical code is a schedule matrix S ∈ ({0} ∪ N ∪ {∞}) T ×N such that, for any distinct vectors x, y ∈ ({0} ∪ N ∪ {∞}) 1×N , each having at most D finite entries,

A tropical code is said to be within maximum delay ℓ if S ∈ {0, 1, . . . , ℓ, ∞} T ×N .

Definition 3 (Adaptive testing). An R-(T, N, D)-tropical protocol is a series of R functions, S (1) , S (2) , . . . , S (R) , that feed on past results and output variable-height schedule matrices

. . . Remark 4. Our setup is more demanding than the traditional group testing setup in that we will be requiring not only the identities of the infected individuals but also the extent of infection of each infected individual. Remark 5. Our setup is also more resilient to the fuzziness innate to PCR testing in that (i) our decoder makes decisions based on integer (instead of real number) Ct values and (ii) every time two specimens are mixed together, the one with fewer virus particles are "forgotten". This harsh rule forces us to find workarounds that must not rely on the type of arithmetic that "subtracts 50 dB from 50.04 dB to obtain 30 dB". 

This section considers the simplest case whereby at most one person is infected. It demonstrates how tropical group testing can find this infected person using only two tests, regardless of how many people are taking the tests. 

That is, 3

• put tubes A and B into the PCR machine; • put specimens X and Y into tube A; • put specimens Y and Z into tube B; • run the machine for 7 cycles; • put specimens Z and X into tubes A and B, respectively; and finally

• run the machine for 33 more cycles or until the light sensor reports detects the virus, whichever is earlier. Instead of delaying by 7 cycles, one can delay by any nonzero cycles, including infinity. A larger delay is safer when Ct values are noisy; so one should consider maxing out the available delays.

Decoding: If all three individuals are healthy, then both tubes will report negative results. If X is infected, then we will see a − b = −7 and we can infer her Ct value x = a. If Y is infected, we will see a−b = 0 and infer her Ct value y = a. Similarly, if Z is infected, we will see a − b = 7 and infer z = b. Figure 3 picturizes how to interpret the test results.

The next proposition summarizes the preceding discussion.

Proposition 7. Schedule matrix (1) is a (2, 3, 1)-tropical code.

The difference of delays, for instance 0 − 7 = −7 for specimen X, is called the diff-lay and will play a central role in the sequel. The next subsection generalizes the (2, 3, 1)tropical code by generating many more diff-lays.

B. Two tests on many people-general (2, N, 1)-tropical code One can define a (2, 4, 1)-tropical code as We can test as many people as we want so long as the delays can keep up. Say there is a cap on delays, which we denote by ℓ. Then it is possible and optimal to test 2ℓ + 3 persons at once via the following schedule matrix.

Notice how each column has a unique diff-lay and they exhaust all possible diff-lays from −∞, −ℓ, −ℓ + 1 to ℓ − 1, ℓ, ∞. Alternatively, one might prefer the following schedule matrix

because the pipetting works are spread out over time (at most four pipets per cycle). The following theorem formalizes this idea.

Theorem 8 (One patient, two tests). Any schedule matrix S ∈ {0, . . . , ℓ, ∞} 2×N that contains no infinite column [ ∞ ∞ ] and satisfies

is a (2, N, 1)-tropical code within maximum delay ℓ. Every such S must satisfy N 2ℓ + 3.

Proof. Let x ∈ ({0} ∪ N ∪ {∞}) N have at most one finite entry. Iff x is entirely infinite (everyone is healthy), S ⊙x will be entirely infinite (both tubes are negative). Next, suppose x j is finite for some j. Let

The diff-lay would then be a − b = (S 1j + x j ) − (S 2j + x j ) = S 1j − S 2j and by that diff-lay we can uniquely determine j.

Once j is known, we can compute her Ct value x j = a − S 1j or x j = b − S 2j . N must be less than or equal to 2ℓ + 3 because all possible diff-lays are −∞, −ℓ, −ℓ+1, . . . , ℓ−1, ℓ, ∞. Without uniqueness of diff-lays, columns having the same diff-lay are interchangeable.

Corollary 9. Matrices (2) and (3) are both (2, 2ℓ + 3, 1)tropical codes within maximum delay ℓ that attain the equality N = 2ℓ + 3.

What is good about schedule matrix (2) is that there are algebraic formulas that compute who is infected j = a − b + ℓ + 2 and how sick she is x j = min(a, b) (unless infinity involves). For schedule matrix (3), the algebraic formulas read j = a − b + ℓ + 2 and x j = ⌊(a + b − ℓ)/2⌋ (unless infinity involves).

In Theorem 8, ℓ scales linearly in N . As for how to handle the D = 1 case when ℓ ≪ N , Appendix A proves the following generalization of Theorem 8.

Theorem 10 (One patient). There is a (T, N, 1)-tropical code within maximum delay ℓ iff N (ℓ + 2) T − (ℓ + 1) T .

In the upcoming sections, we consider the setup where more than a single person can be infected.

In the next two sections, we set the number of infected persons to be at most D = 2. For this section, we impose an extra constraint that each person should participate in exactly two tubes. This constraint is to minimize the pipetting works that consume labor and suffer from reproducibility issues.

First, let us walk thorough an N = 4 example, after which we will give a general construction.

The goal of this subsection is to identify two infected persons using four tests on four persons, i.e., to find a (4, 4, 2)tropical code. Note that there exists a trivial (4, 4, 4)-tropical code that tests each person individually. Our goal here, a (4, 4, 2)-tropical code, is not meant to be efficient, but to become the building block of a design based on bipartite graphs.

Call the four tubes A, B, C, and D; call the four persons W, X, Y, and Z. Let the lower letters a, b, c, d, w, x, y, and z be the corresponding Ct values.

Encoding: Prepare the pools as the following.

This can be graphically paraphrased by Figure 4 (left), in which a vertex is a tube and an edge is a person. An edge's specimen is first put in the arrow tail and then added into the arrow head after 7 cycles. Decoding: When no one is infected, all four tubes report negative results. When one person is infected, some two tubes will report positive results. For instance, if A and B are positive, then X is infected and her Ct value is x = a. When two persons are infected, there are two possibilities. Figure 4 . Left: A directed 4-cycle with weighted arrows to indicates the diff-lays; a person's (an edge's) specimen is first put in the tube (vertex) at the arrow tail and then added into the tube at the arrow head after four cycles. Right: The direction of an arrow is part of the sign of the diff-lay. A negative diff-lay can be understood as the opposite arrow with a positive diff-lay. So in this case, this directed K 22 represents the same configuration as the directed 4-cycle to the left dose.

First possibility-neighbor patients: Two neighboring persons are infected iff exactly three tubes report positive results. For instance, when we see that A, B and C are positive, it must be that X and Y are infected. Their Ct values can be calculated by (x, y) = (a, c − 7).

Second possibility-diagonal patients: Two diagonal persons are infected iff all four tubes report positive results. We cannot easily tell whether X and Z are sick or W and Y are sick because, as subsets, W ∪ Y = X ∪ Z. That said, we can compute the alternating sum a − b + c − d. If X and Z are infected, then

If W and Y are infected, then

To rephrase it, the alternating sum of the Ct values along the cycle will be either 14 or −14. Its sign tells us if X and Z are infected or W and Y are infected. When we see that X and Z are infected, their Ct values are (x, z) = (a, c). Otherwise, the Ct values of W and Y are (w, y) = (d, b).

The next lemma formalizes the construction.

Lemma 11 (Nonzero 4-cycle). Consider a configuration like Figure 4 (left) where 7's are replaced with general diff-

Then it forms a (4, 4, 2)-tropical code within maximum delay max(|δ W |, |δ X |, |δ Y |, |δ Z |).

Proof. If X and Z are infected, then

if W and Y are infected, then

Given that δ W +δ Y = −δ X −δ Z , the alternating sum a−b+c−d will be distinguishable in two cases. Other cases (no patients, one patient, and neighboring patients) are straightforward. Figure 5 . A diff-lay configuration of K 3,3 (which contains K 2,3 ). Arrows are +7. Undirected edges are 0 (specimen put in both tubes upon machine startup). One can verify that all 4-cycle sums are nonzero. Figure 6 . A diff-lay configuration of K 4,4 (which contains K 3,4 ). Arrows are diff-lay 7; undirected edges are 0 (specimen put in both tubes upon machine startup). One can verify that all 4-cycle sums are nonzero.

Knowing how to deal with 4-cycles, we now show that a tropical code cannot contain 3-cycles.

Lemma 12 (No 3-cycle). There does not exist a tropical code where three persons participate in three tests in total, each person is in two tests, and each test contains two persons.

Proof. Suppose part of the schedule matrix that concerns the three persons and three tests is

where 0 δ, ε, ζ, η, ϑ, κ ℓ. Then we cannot distinguish

and hence this cannot be a tropical code.

Lemmas 11 and 12 imply that, in order to find a tropical code that tests as many people as possible, we should look at graphs with girth 4.

By Mantel's theorem, the 3-cycle-free graphs that maximize the number of edges are complete bipartite graphs with equal or almost equal partition sizes [AZ18] . These graphs have the potential to give rise to good tropical codes.

Before actually constructing any tropical code out of a complete bipartite graph K p,q , we demonstrate how to encode the delays. Let B be a p × q matrix. Consider the edge that connects u ∈ [p] to the left to v ∈ [q] to the right. If the specimen is first put in left u and, after δ cycles, in right v, let B uv be δ. If the specimen is first put in right v and, after ε cycles, in left v, let B uv be −ε. This is how we can represent the diff-lays on a bipartite graph using a weighted biadjacency matrix B.

As an example,

represents Figure 4 (right). In a biadjacency matrix representation, the cycle-sum condition becomes whether every Figure 5 gives a valid assignment. For K 4,4 , Figure 6 gives a valid assignment.

This begs the question of whether there exist a systematic way to fill-in larger biadjacency matrices to meet said condition. The answer is positive.

Theorem 13 (Two patients, min pipetting). Let T 2. Let p T /2 be an odd prime. Then there exist a graph-based (T, ⌊T /2⌋⌈T /2⌉, 2)-tropical code within maximum delay (p − 1)/2.

as the biadjacency matrix of a weighted directed bipartite graph. Or use part of it when p > T /2. To minimize the delay. use integers between ±(p−1)/2 to represent the residue classes modulo p. To verify that the cycle sum along any 4-cycle is nonzero, note that every 2 × 2 sub-matrix has the structure [ tb te ub ue ]. Hence the cycle sum is congruent to

According to Theorem 13, it follows that asymptotically there exist tropical codes such that T ≈ 2 √ N and ℓ ≈ √ N . We note that the tropical group testing schemes proposed in this section already represent an improvement over binary group testing. A binary group testing scheme, when each person participates in two tubes, would avoid 3and 4-cycles.

And it is known that the number of edges of a 4-cycle-free graph is sub-quadratic [BJ17] , whereas our proposed schemes allows N to scale quadratically in terms of T .

This section again considers the D = 2 case. However, in this section we allow people to participate in more than two tests. The goal of this section is to show that under this setup, there exist tropical codes that require less tests than the information-theoretical bound for binary group testing.

We first go over an (11, 66, 2)-example. Then we state a general sufficient condition of the existence of codes. We provide a construction that satisfies this condition. Finally, we briefly discuss the possibility of using concatenation to conserve the delays.

A. An (11, 66, 2)-example

Let us take T = 11 tests as an example. Let F ⊆ 2 [T ] be a block design such that, for any two blocks B, Z ∈ F , we have |Z \ B| 2. It can be shown that |F | 66 and the equality holds when F is the (unique up to isomorphism) (4, 5, 11)-Steiner system [Ber82, Section 6]. For more on Steiner systems, see the standard textbook [CD06] and a recent breakthrough [Kee19] . For now, fix F to be the (4, 5, 11)-Steiner system and N = 66.

Let j : F → [66] be a bijection that associates blocks with id numbers. Let k : F → [37] be a coloring of blocks 4 such that k(B) = k(Z) whenever |B ∩ Z| 2. Let x j be the Ct value of the jth person. Let the schedule be

Let c t be the Ct value of the tth tube. Regarding the performance of F and S we have the following two observations that supports the injectivity of tropical-multiplication of S from the left.

Proposition 14. For any three distinct individuals A, B, Z ∈ F , we can differentiate the case where individuals A, B are infected from where A, Z are infected.

Proof. First notice that If B \ A = Z \ A, then we can easily distinguish between the two cases based upon the set of tests that contain infected samples. Suppose therefore that

On the other hand, when A, B are infected,

Proposition 15. For any four distinct individuals A, B, Y, Z ∈ F , we can differentiate the case where individuals A, B are infected from where Y, Z are infected.

Proof. Suppose that Y, Z are infected and we hypothesize that A, B are infected. The goal is to reject this hypothesis using the diff-lays. We say a tube t is dominated by an individual B if c t = S tj(B) + x j(B) , i.e., the contribution of B is what makes c t c t . Let I AY be the set of tubes that we think are dominant by A but are actually dominated by Y . Define I AZ , I BY , and I BZ similarly. Then |I AY ∪ I AZ ∪ I BY ∪ I BZ | |A ∪ B| = |A \ B| + |B| 2 + 5 = 7. This implies that one of I AY , I AZ , I BY , and I BZ has cardinality 2 or higher. Suppose |I AY | 2 and {t, u} ⊆ I AY , then

Now that we can distinguish A from Y , we can reject the hypothesis that {A, B} are the infected persons.

The moment we have rejected all incorrect hypotheses using Propositions 14 and 15, whatever remains must be the truth. That yields the following result.

Proposition 16. Schedule matrix (4) is an (11, 66, 2)-tropical code within maximum delay 36.

Proof. We claim that the following decoder works. Given any c ∈ ({0} ∪ N ∪ {∞}) T , if the number of positive tubes is 5, we know those 5 tubes are the only infected block. If more than 5 tubes are positive, then we know there are two patients. We blindly guess two blocks A, B ∈ F and check if {A, B} is compatible with c. If so, output the guess; if not, start over and make a new guess.

Here is why the decoder, albeit inefficiently, works. Proposition 14 shows that if |{A, B}∩{Y, Z}| = 1, where Y, Z ∈ F are the actual patients, there will be a contradiction. Proposition 15 shows that if |{A, B} ∩ {Y, Z}| = 0, there will also be a contradiction. Hence the claimed decoder will terminate iff |{A, B} ∩ {Y, Z}| = 2, i.e., our guess matches the reality. There are a finite number of combinations to be guessed so the decoder must eventually guess correctly. Once we know who are the patients it is straightforward to infer their Ct values. We conclude that S is a valid tropical code.

One immediately sees that Proposition 16 can be generalized to host more tests and larger population.

Theorem 17 (Two patients, max participants). Let F ⊆ 2 [T ] be a block design with N blocks. Assume |Z \ B| 2 and |B| 3 for all distinct blocks B, Z ∈ F . Let j : F → [N ] be a bijection. Let p be an integer whose prime divisors are T . Assume there exists k :

is a (T, N, 2)-tropical code within maximum delay p − 1.

Proof. Suppose there is one infected person, Z ∈ F , and we guess there is one, B ∈ F . If B = Z, then the tubes in B \ Z will be negative while we expect them to be positive. Hence we can reject this guess. Suppose there is one infected person, Z ∈ F , but we guess there are two, A, B ∈ F . Without loss of generality, suppose A = Z. As |A \ Z| 1, tubes in A \ Z will be negative while we expect them to be positive. Hence we can reject this guess.

Suppose there are two infected persons, Y, Z ∈ F , but we guess there is one, B ∈ F . Without loss of generality, suppose B = Y . As |Y \ B| 1, tubes in Y \ B will be positive while we expect them to be negative. Hence we can reject this guess.

Suppose there are two infected persons, Y, Z ∈ F , and we guess there are two, A, B ∈ F . Then depending on |{A, B} ∩ {Y, Z}| = 1 or 0, we will see a contradiction due to the same reasoning in Proposition 14 or 15, respectively. Either case, we can reject incorrect hypothesis that {A, B} are infected.

The following lemma prepares block designs whose N is satisfactorily large. 

and |B \ Z| 2 for all distinct blocks B, Z ∈ F .

Choose w := ⌈T /2⌉ to maximize the number of blocks. The asymptote of N in terms of T is thus

Conversely, T ≈ log 2 (N ) + 1.5 log 2 (log 2 N ). In contrast, binary group testing has 2 T N (N −1)/2+N +1 because the number of cases cannot exceed the number of test outcomes. That implies T ≈ 2 log 2 (N ). We therefore conclude that tropical group testing beats the information-theoretical bound for binary group testing (by a factor of 1.99).

For small parameters, a worthy reference is Brouwer's table of constant-weight codes of minimum distance 4 [BConst] . See Table III for a copy of the first few terms. It can be seen that (T, N ) = (11, 66) is the first time constant-weight codes surpass the information-theoretical bound, which is the reason we chose this example.

The previous subsection shows that one can easily beat the information-theoretical bounds using delays growing linear in N . This subsection wants to limit the usage of delays without having to introduce too many additional tests.

To facilitate the construction, let us practice how to assemble a code out of existing codes. Heuristically speaking, our approach will allow us to generate new codes from previously constructed codes without increasing the required delay.

Let S be an (t, n, 2)-tropical code within maximum delay n. (For instance, schedule matrix (4) as a (11, 66, 2)-tropical code.) Consider the Kronecker product S ⊗ 1 1×n as a t×n 2 schedule matrix, where 1 1×n is the all-one matrix of dimension 1 × n. Notice that the first n columns of this matrix are identical, the next n columns of it are identical, and so on.

That is to say, this matrix first gathers every n persons into one pool and then tests the resulting n pools using S.

If everyone participating the tests is healthy, then all tests will turn out negative. Suppose one person, her id being 0 Z n 2 − 1, is infected. Then the tests will reveal ⌊Z/n⌋, the tens (the second least significant) digit of Z in the nary expression, and z * , the Ct value of Z. If two persons, 0 Y, Z n 2 − 1, are infected, then the tests will reveal {⌊Y /n⌋, ⌊Z/n⌋}, i.e., the tens digits of Y and Z. There are two cases.

• If the digit differ, ⌊Y /n⌋ = ⌊Z/n⌋, then S ⊗ 1 1×n sees two infected pools and will report y * and z * .

• If the digit agree, ⌊Y /n⌋ = ⌊Z/n⌋, then S ⊗ 1 1×n sees one infected pool and will report min(y * , z * ). A similar argument shows that 1 1×n ⊗ S reveals Y mod n and Z mod n, the ones digits of Y and Z, together with one (if the digits agree) or two (if the digits differ) Ct values.

We now investigate what happens when we combine S ⊗ 1 1×n and 1 1×n ⊗ S. Define S (2) as this 2t × n 2 schedule matrix S (2) := 1 1×n ⊗ S S ⊗ 1 1×n

and use it to perform the tests. If there are zero infected or if there is one, the decoding procedure is trivial. Therefore, assume that there are two infected persons, 0 Y, Z n 2 −1.

Let their Ct values be y * and z * . Let y 10 n + y 1 and z 10 n + z 1 , where 0 y 1 , y 10 , z 1 , z 10 n − 1, be the n-ary expansions of Y and Z, respectively. Then the first t tests can tell us {y 1 , z 1 } ⊆ {0, . . . , n − 1}. More precisely, the first t tests tell us {(y 1 , y * ), (z 1 , z * )} when y 1 = z 1 and tell us {(z 1 , min(y * , z * ))} when y 1 = z 1 . Similarly, the last t tests can tell us {(y 10 , y * ), (z 10 , z * )} when y 10 = z 10 and tell us {(z 10 , min(y * , z * ))} when y 10 = z 10 . Now there is only one bit of information left consider. In particular, we need to differentiate between the case where {Y, Z} = {y 10 n + y 1 , z 10 n + z 1 } and the case where {Y, Z} = {y 10 n + z 1 , z 10 n + y 1 }.

When y 1 = z 1 , we already can infer {Y, Z} = {z 1 + y 10 n, z 1 + z 10 n}. When y 10 = z 10 , similarly, we already can infer {Y, Z} = {y 1 +z 10 n, z 1 +z 10 n}. If |{y 1 , z 1 }| = |{y 10 , z 10 }| = 2 and Y and Z have different Ct values, we will see that y 1 and y 10 associate to a Ct value-y * -whilst z 1 and z 10 associate to another different Ct value-z * . That will help us link y 1 to y 10 and z 1 to z 10 and thus help us recover Y, Z. One difficult case remains to be addressed, namely |{y 1 , z 1 }| = |{y 10 , z 10 }| = 2 yet y * = z * .

To overcome the last case, we add one more test to make up the missing information: Let p max(3, n) be the smallest possible prime. Let 0 α 1 , α 2 , β 1 p−1 be distinct numbers modulo p. Compute, for any 0 a 1 , a 2 n − 1,

using modulo p arithmetics. That is, we find b 1 such that (α 1 , a 1 ) and (α 2 , a 2 ) and (β 1 , b 1 ) are point-evaluation pairs of some degree-one polynomial. In other words, they are colinear in the plane F 2 p . Treating b 1 as a function in a 1 , a 2 , we want to append b 1 (a 1 , a 2 ) at the bottom of the (1 + a 1 + a 2 n)th column of S (2) . That is, we stack S (2) on top of this 1 × n k matrix

We claim that the addition of Q (2) to S (2) results in a valid tropical code.

Proposition 19. If S is an (t, n, 2)-tropical code within maximum delay n, then

is an (2t+1, n 2 , 2)-tropical code within maximum delay p−1, where p is the least prime max(3, n).

Proof. When there is 1 infected person, there are two infected persons sharing a same digit, or there are two infected persons having different Ct values, the first 2t tests suffice. When the two infected persons have distinct digits yet the same Ct values, we utilize the last test in the following manner.

Let Y and Z be the infected persons, 0 Y, Z n 2 − 1. Let y 10 n + y 1 and z 10 n + z 1 be the n-ary expansion of Y and Z. The result of Q (2) , denoted by c 1 , is c 1 = min(b 1 (y 1 , y 10 ) + y * , b 1 (z 1 , z 10 ) + z * ). Therefore c 1 − z * ∈ {b 1 (y 1 , y 10 ), b 1 (z 1 , z 10 )}. Now there are five points on the F 2 p plane:

(α 1 , y 1 ) , (α 2 , y 10 ) , (β 1 , c 1 − z * ) , (α 1 , z 1 ) , (α 2 , z 10 ) .

We know the point (β 1 , c 1 − z * ) is either (β 1 , b 1 (y 1 , y 10 )) or (β, b 1 (z 1 , z 10 )). It must be colinear with two of the four points to its left. By looking at which two are colinear with it, we can correctly link y 1 to y 10 and z 1 to z 10 . This recovers Y and Z and finishes the proof.

Next we define the notion of BITE.

Definition 20. A (T, N, 2)-tropical code is said to beat the information-theoretical estimate (BITE) 5 if N 2 /4 > 2 T .

BITing is slightly stronger than simply beating the binary bound (by about one test), but it is an inductive invariant.

Proposition 21. If S is an (t, n, 2)-tropical code that BITEs, then

is an (n 2 , 2t + 1, 2)-tropical that BITEs.

Proof. The precondition on S is n 2 /4 > 2 t . What is desired to be shown is (n 2 ) 2 /4 > 2 2t+1 . The former implies the latter.

Propositions 19 and 21 generalize to the following theorem, of which the proof is given in Appendix B.

Theorem 22 (Two patients, no patience). Let S be a (t, n, 2)tropical code within maximum delay n, then there is an (n k , kt + 2k − 3, 2)-tropical code within q − 1 cycles, where q max(3k − 3, n) is a prime power. If S BITEs, so do the putative codes BITE for all k 1.

Remark 23. This Kronecker-based construction is inspired by [Ede04, Theorem 1] ( [Muk78] ) and the way it is used in the paper. This shares common elements with the gridbased construction [BK20] that is used to attack the pandemic [SAKH20; Täu20; Mut+21]. This also shares common elements with the fast decoder approach [NPR11] . Other fast decoder approach such as [Lee+19; Bon+21] also contain similar ideas.

Ideally we can use very large but almost equal q ≈ n ≈ 3k − 3 by consulting Theorem 22 and Lemma 18. Hence, as q goes to infinity, there exist (T, N, 2)-tropical codes within maximum delay q, where q ≈ 3 log 2 (N )/ log 2 (3 log 2 (N )) and T ≈ 1.01 log 2 (N ).

In this section, we move on to general D. Concerning the existence of tropical codes, we will prove one necessary condition and two sufficient conditions. Given a schedule matrix S ∈ ({0} ∪ N ∪ {∞}) T ×N , the underlying block design registers the tests each person participates in. It is defined to be a multiset as follows.

A nasty edge case is when two individuals participate in exactly the same subset of tests. By letting F be a multiset we have |F | = N . Also by distinct blocks B 1 , . . . , B D ∈ F , we mean that the B's originate from distinct individuals. But as subsets of [T ] they are not necessarily distinct.

Here is the necessary condition promised. We state the weak version following by the strong version.

Theorem 25. The underlying block design F of a (T, N, D)tropical code must be (D − 1)-disjunct.

Proof. Suppose F is not (D − 1)-disjunct, then there exist distinct blocks Z, B 1 , . . . , B D−1 ∈ F such that Z ⊆ B 1 ∪· · ·∪ B D−1 . This means that, when the B's are severely infected, we cannot tell if Z is slightly infected or not infected. Proof. We already see that F must be (D − 1)-disjunct. Suppose it is not (D − 1)-uniquely-disjunct, then there exist distinct blocks Y, Z, B 1 , . . . , B D−1 ∈ F such that Y \ (B 1 ∪ · · · ∪ B D−1 ) = Z \ (B 1 ∪ · · · ∪ B D−1 ), which contains but one vertex. This means that, when the B's are heavily infected, we cannot tell apart if it is Y or Z that is infected.

Here are the two sufficient conditions promised. Proof. Use the vanilla schedule

where j : F → [N ] is a bijection. To decode, reinterpret every test result as positive or negative. Use a binary group testing decoder to determine who are infected. For every infected individual Z, the "outcrop" Z \ (B 1 ∪ · · · ∪ B D−1 ) is never empty because D-separability implies (D − 1)-disjunction [CH07] . This further implies that there is at least one tube wherein Z is the only infected participant. The Ct value of this tube is the Ct value of Z.

Definition 30. A block design F is said to be D-doublydisjunct if |Z \ (B 1 ∪ · · · ∪ B D )| 2 for distinct blocks Z, B 1 , . . . , B D ∈ F . Theorem 31. If there is a (D−1)-doubly-disjunct block design F with N blocks on T vertices, then there exists a (T, N, D)tropical code.

Proof. Use schedule

where j is a bijection j : F → [N ]. Let B 1 , . . . , B D be the blocks that we think are infected. Let Z 1 , . . . , Z D be the blocks that are actually infected. For now, assume that they are all distinct, as otherwise the theorem statement will be similar (perhaps easier) to prove. Let I uv be the subset of tests that we think are dominated by B u but actually are dominated by Z v , where dominance means c t = S tj(B) + x j(B) . By that F is (D − 1)-doubly-disjunct, , connect u to the left to v to the right |I uv | times. Now G has 2D vertices and 2D edges, hence it contains a cycle (possibly a 2-cycle). Let this 2Ψ-cycle be

where the u's are to the left and the v's are to the right. Identify u Ψ+1 := u 1 . Every edge involved, be it u ψ v ψ or u ψ+1 v ψ , corresponds to a test in I u ψ v ψ or in I u ψ+1 v ψ , respectively. Hence we can talk about the their Ct values, denoted by c ψ or d ψ , respectively. Finally, examine the alternating sum along this cycle Ψ ψ=1 c ψ − d ψ .

In our mind, we expect that this is a sum of some of B's diff-lays. But in reality, this is a sum of some of Z's diff-lays.

Since the delays are distinct powers of two, there is no way a sum of delays equal to another sum of delays. This shows that we can always reject incorrect guesses.

Remark 32. We understand that using exponential delays is rhetorical as the delays already live in the logarithmic realm. This can be avoided. All we need is that the "cycle sums" do not vanish. And so it suffices to use random delays and bound from above the probability they vanish. Since only short cycles are those that are likely to vanish and since there are only polynomially many short cycles, we expect that a (T, N, D)tropical code exists within polynomial delay.

For works that discuss disjunct-ness vs other properties, see [CH07; Fan+21] . For how to design block systems with disjunct and/or separable properties, see Kautz-Singleton [KS64] and the follow-ups [DR82; PR11; Ina+19; BPS21].

In the next two sections, we turn our interest to adaptive strategies.

Recall that the D = 1 case was optimally solved in Section III. Recall also that Sections IV and V discussed some nonadaptive strategies of the D = 2 case. In this section, we explore adaptive strategies for the D = 2 case, We will give a two-round, four-test strategy that finds two infected persons among arbitrarily many. That is, we will construct 2-(4, N, 2)-tropical protocols for all N . It is rather surprising that, compared to Sections IV and V, allowing a second round saves such a large number of tests.

Let a population have Ct values

where N is the number of persons being tested. Begin with two tests that imitate matrix (2).

If tubes a and b are negative, no one is infected and we are done. Assume the opposite, that a and b are positive. We compute the pointer j := (a − b + N + 1)/2 and double-check the jth person

here 0 is at the jth column. If c is positive, we test

If c is negative, we test

We claim the following.

Proposition 33. Schedule matrices (5) to (8) form a 3-(5, N, 2)-tropical protocol.

Note that schedule matrix (7) is just (2) without the jth person so it can find us the second patient given that j is the first. It remains to show why schedule matrix (8) can find us two infected persons so quickly given that j is not one. A lemma is placed here, before the proof of the proposition, to help clarify what can we learn from the test results a and b.

Lemma 34. Given a and b as defined with schedule matrix (5) and j := (a − b + N + 1)/2. Then min i j x i + i = a and min k j x k + (N + 1 − k) = b.

This lemma encodes the core idea of this section: After a and b, we learn that j is likely to be infected. That is why we query c to check. And even if she comes out healthy, we still know that someone i < j to the left dominates a and someone k > j to the right dominates b. Now i needs one more test to locate; and k needs another test to locate. Those are what d − and e − do, respectively.

Proof of Lemma 34. By the configuration of the second test, b x k + (N + 1 − k), which implies x k b − (N + 1 − k) for all k. Suppose i is the index that attain the minimum a :=

This forces 2i 2j. As the index that attain the minimum must be j, we might as well restrict the domain of the minimum and write a = min i j x i + i. The second statement of the lemma holds by symmetry.

Proof of Proposition 33. Let c be negative and let d − , e − be defined with schedule matrix (8). From the configuration of d − we know min i<j x i = d − . By Lemma 34 we know min i j x i + i = a. We can replace the domain i j with i < j because j is confirmed to be healthy. So far we have collected the information

. . 

i := a − d − with Ct value x i = d − . By symmetry, the other infected person is k := N + 1 − (b − e − ) with Ct value x k = e − .

As it turns out, we can superimpose schedule matrices (7) and (8) in a judiciously way to optimize test c away. Let

(9) where γ is a gargantuan number, e.g., 888(N + a + b).

Theorem 35. Schedule matrices (5) and (9) give a 2-(4, N, 2)tropical protocol.

Proof. It suffices to show that we can distinguish the following four cases:

• j is indeed infected; everyone else is healthy. • j is indeed infected; the other patient is < j. • j is indeed infected; the other patient is > j. • j is healthy; one patient is < j and the other > j. They one-to-one correspond to the following regions of the configuration space of the test results (cf. Figure 7) : 

It is straightforward to verify these deviations.

Next, we proceed to how to identify more infected people.

Our next goal is to give a construction that finds all D infected persons in 3D + 1 tests regardless of how large D and N are. A pilot construction using 7D + 1 tests is specified in the next subsection.

Suppose that there are D persons infected among a population of N . Suppose also that we lack the knowledge of D.

To begin, query

If a is negative, we conclude that everyone is healthy. We had used 1 7 · 0 + 1 tests and that is within the budget. This is the leaf of our recursive algorithm whose definition continues below.

If a is positive, query

Compute the index j := (a − b + N + 1)/2. Then query

where the delay 0 is at the jth column. Test c tells us whether j is really sick or not. If c ends up positive, j is infected with Ct value x j = c. We then remove her from the population. Now that there are D−1 patients left in the remaining N −1 persons, we start over. Since the number of infected persons decreases, we expect that this recursive algorithm will eventually reach the leafs and return. If test c ends up negative, we have a strengthened Lemma 34 that applies to an unknown D.

Lemma 36. Let a and b be defined with schedule matrices (10) and (11) and j := (a − b + N + 1)/2. Then min i j x i + i = a and min k j x k + (N + 1 − k) = b. (Proof is identical to that of Lemma 34 but this time x can contain more than two patients.)

Being told that j is healthy, we know that the first j − 1 persons contain at least one infected person and so do the last N − j persons. We hereby split the population into two halves-[1, j−1] and [j+1, N ]-and apply the same algorithm to them separately. Since both halves contain less than D infected persons, the recursive algorithm will eventually reach the leafs and return. The only problem is, How many tests does it consume before termination?

For every three tests we spend on the searching matrices (10) and (11) and the confirmation matrix (12), either of the following happens.

• a and b indicate that j is suspicious and c confirms that she is indeed infected.

• a and b indicate that j is suspicious but c shows that j is healthy yet we can split the population into two halves, each containing fewer infected persons. There are D infected persons so the number of tests spent on the first case, searching and confirming, is exactly 3D. There are D − 1 "gaps" we can split the population at so the number of tests spent on the second case, splitting, is at most 3D − 3. In total, the cost is at most 6D − 3 tests. On top of that, every time we confirm an infected person j in some interval [i, k], the protocol will then query

and sometimes the test result says that everyone involved is healthy. This brings the total to 6D − 3 + D = 7D − 3 tests. The final number is 7D − 3 and it is 7D + 1.

Proposition 37. Schedule matrices (10) to (12) constitute a (7D + 1)-(7D + 1, N, D)-tropical protocol.

Much to our surprise, this protocol does not depend on how many people are being tested. Moreover, this protocol tells us the number of patients as a part of the output-we do not have to know and tell the protocol the number D before the protocol begins.

As it turns out, some tests in the (7D + 1)-protocol are redundant. The key is that the confirmation matrix (12) can be combined with the searching matrix (10) of the first j − 1 persons in the following way

This test behaves like a verification of whether j is really sick. At the same time it prefetches the result of the searching matrix in the next level of the recursion should the verification fail.

Lemma 38. Let a, b, and c ♯ be defined with searching matrices (10) and (11) and multitask matrix (13). Let j := (a − b + N + 1)/2. We have

Proof. The first two statements are by Lemma 36 and the third statement by the definition of c ♯ . Only the last one is nontrivial so let us prove it. Suppose i is the index that attains the minimum:

This finishes the proof.

These four inequalities enjoy a dichotomous behavior.

Lemma 39. Let a and b and c ♯ be defined with searching matrices (10) and (11) and verify-prefetching matrix (13). Let j := (a − b + N + 1)/2. Either the following four hold

or the following four hold

Proof. Due to Lemma 38, it suffices to prove that (i), (ii), (iii), and (iv) are equivalent to each other. (i) is equivalent to

and (iii) are equivalent to each other. If they hold, then due to the inequalities x j = (a + b − N − 1)/2 c ♯ x j , (iv) holds. Conversely, suppose that (iv) holds. Let i j be the index that attains the minimum:

Hence 2(j − i) = 0 and (i), (ii), and (iii) hold. This finishes the proof.

Thanks to the dichotomy related to c ♯ , we either confirm that j is indeed infected when we see (iv) holds (plus we know her Ct value x j = (a + b − N − 1)/2) or we know we can split the population at j when we see (iv) does not hold. This leads to a strategy that only queries 3D + 1 times.

In what follows, we use the word position as in the winning positions in the combinatorial game theory, especially in the theory of the game of Nim [BCG01] . In this context, executing a tropical protocol is like playing a game against mother nature. A position is a "current state" when we are halfway toward completely understanding everyone's Ct values. We win the game by picking out all infected individuals in a limited amount of moves. A move is analogous to one single test if we care about the total number of tests, or to a batch of parallel tests if we care about the number of rounds.

Denote by a tuple (D (1) , D (2) , . . . , D (Π) ) a position where

• there are Π piles of people;

• the πth pile contain N (π) persons whose Ct values are denoted by x (π) 1 , . . . , x (π) N (π) ; • the first pile contains D (1) 0 infected persons; • for each π 2, the πth pile contains D (π) 1 infected persons; and

• for each π 2, we know the search results a (π) := min j x (π) j + j and b (π) := min j x (π) j + (N (π) + 1 − j) of the πth pile. There are two moves that evolves a position into another position.

If Π = 1, we perform searching matrix (10) on the one and only pile. Denote the test result by a (1) . If a (1) is negative, the protocol terminates and reports everyone healthy. If a (1) is positive, perform searching matrix (11) and denote the result by b (1) . Of this pile of people we now know the "a" and "b"; we migrate these people to the second pile and leave the first pile empty. That is, we evolve the position (D (1) ) into the position (0, D (1) ).

If Π 2, we focus on the second pile. Of this pile we already know a (2) and b (2) . Compute j (2) := a (2) + b (2) − N (2) − 1. We perform the verify-prefetching matrix (13) with j := j (2) and denote the result by c (2) . By Lemma 38, c (2) is equal to or greater than (a (2) + b (2) − N (2) − 1)/2.

If c (2) is equal to (a (2) + b (2) − N (2) − 1)/2. then j (2) is infected with Ct value c (2) . The remaining of the second pile, with D (2) − 1 patients remained to be found, is merged with the first pile. Now the position (D (1) , . . . , D (Π) ) is evolved into (D (1) + D (2) − 1, D (3) , . . . , D (Π) ).

If c (2) is greater than (a (2) + b (2) − N (2) − 1)/2, then we know

For the first j (2) persons, we now know their "a" and "b"; for the last N (2) − j (2) persons, we now know their "b". It suffices to query the latter's "a":

Consequently, we know the results of the searching matrices for the first half and the second half. Suppose the first half contains D (2, ) infected persons and the second half contains D (2,>) infected persons. ( We know nothing about D (2, ) and D (2,>) beyond that they are positive and sum to D (2) .) Now the position is evolved from (D (1) , . . . , D (Π) ) into (D (1) , D (2, ) , D (2,>) , D (3) , . . . , D (Π) ).

Knowing how positions evolve, we estimate the cost. It is clear that when Π = 1, the T -formula of a singleton collapses to T (D (1) ) = 3 − 2 + 3D (1) = 3D (1) + 1. Hence the theorem will be proved if we can show that each position (D (1) , . . . , D (Π) ) requires at most T (that position) tests.

We employ induction on the T -values, the budgets, of the positions. We will show that whenever one position is evolved into another position, the number of tests spent is at most T (former position) − T (latter position). That way, the budget will be ever decreasing so it always satisfies the induction hypothesis afterwards. But never would the budget go below zero before we finish picking out all infected individuals.

Base case-Π = 1 and D (1) = 0: We spend one tests on a (1) , the "a" of the first and only pile. After getting a negative result we conclude that everyone is healthy. Since T (0) = 1, the cost meets the budget for the base case. Now suppose that all positions whose T -value fall below T can be cleared before the budget runs out. Suppose that (D (1) , . . . , D (Π) ) is a position with T -value T . We want to show that it can be cleared before the budget runs out.

Induction step, case one-Π = 1 and D (1) 1: We spend two tests on a (1) and b (1) , the "a" and "b" of the first pile. After that we relabel the first pile as the second pile, evolving (D (1) ) into (0, D (1) ). Since T (D (1) ) = 2 + T (0, D (1) ), the cost meets the budget for induction step, case one.

Induction step, case two-Π 2 and c (2) = (a (2) + b (2) − N (2) − 1)/2: We spend one test on c (2) and confirmed that j (2) is infected with Ct value c (2) . By doing so, we evolve Induction step, case three-Π 2 and c (2) > (a (2) + b (2) − N (2) − 1)/2: We spend two tests on c (2) and d (2) to complete our knowledge of the "a" and "b" of the first j (2) persons and the last N (2) − j (2) persons. We evolve (D (1) , . . . , D (Π) ) into (D (1) , D (2, j) , D (2,>j) , D (3) , . . . , D (Π) ) by doing so. Since T (D (1) , . . . , D (Π) ) = 2 + T (D (1) , D (2, j) , D (2,>j) , D (3) , . . . , D (Π) ), the cost meets the budget for induction step, case three. This is the last piece of the induction and hence completes the proof.

A generalization of Theorem 40 to a delay-limited situation is the following. We defer the proof until Appendix C.

Theorem 41 (Deep searching). Let T := 4D⌈log ℓ N ⌉ + 1. There is a T -(T, N, D)-tropical protocol within maximum delay ℓ.

For the D = 2 case with limited delay (Theorem 22), it is straightforward to handle one patient and two unequally infected patients. For two equally infected patients, we use Q (k) to obtain extra information. Can we simplify Q (k) ? The goal is to find a weaker notion of BITE that is still an inductive invariant.

For general D in nonadaptive case (Theorem 31), we used exponential ℓ but remarked that a polynomial ℓ should be possible. Is it? Also, do there exist structural constructions that, more or less, generalize Theorem 22 to general D?

For the adaptive strategy with limited delay (Theorem 41), the main term of T is 4D log 2 N . From an informationtheoretical perspective, the main term should be D log 2 N . Is this achievable? Are there tradeoffs between T the number of total tests and R the number of rounds?

Recall that PCR only runs for 40 cycles in real life. Thus, if there is a Ct 35 specimen delayed by 15 cycles, the expected Ct value is 50 but we only see 40. This is a false negative result. Can we increase the maximum delay to the extent where false negative starts showing up but we still benefit from it? Note that in this case, the decoder must perform more a complicated pattern matching, one that treats 40 as a wildcard that can possibly be 40, 45, 50, or infinity.

Hong et al. [Hon+21] suggested using factorizations of a complete hypergraph to generate balanced pooling designs. Here, balanced means that every person appears in the same number of tubes and every tubes receives (almost) the same number of persons. 6 They argued that those conditions make pooling more consistent. We, while agreeing with their argument, want to add that there are other ways to achieve the same goal. Kirkman systems, Steiner systems, BIBDs, and constant weight codes are candidates that sound equally good. To be more specific, we believe that one should (also) optimize for the probability that a block B is covered by the union of a small number of other blocks Z 1 , . . . , Z D (cf. superimposed codes, D-cover-free families).

It is worth mentioning that Hwang and Xu [HX87] once published a variant of group testing where there are two infected people, one heavily infected and the other lightly infected. The testing result is quantified by three possibilities: (i) The tested pool contains the heavily infected person. (ii) The tested pool contains the lightly infected person, but not the heavily one. (iii) The tested pool contains neither infected person. Notice that (i), (ii), and (iii) can be thought as Ct values 1, 2, and 3, respectively, while the infected persons have Ct values 1 and 2. Hwang and Xu's problem formulation already suggested that the presence of the heavily infected complicates the identification of the lightly infected. The group testing scheme proposed in Sections IV and V and Appendix B can be interpreted as a solution to a generalization of their setup. 6 In the block design context, every block having the same number of vertices is called uniform or proper; every vertex appearing in the same number of blocks is called balanced. In the hypergraph context, the former is called uniform; the latter is called regular. They want both properties. In this appendix, we factor in the restriction that one often cannot wait for arbitrarily long delays but can afford more tests. This appendix shows how to trade T for ℓ and proves Theorem 10.

The following is a three-test seven-person example. Name the persons T, U, V, W, X, Y, Z and let t, u, v, w, x, y, z be their Ct values. Encoding: For a finite ℓ, the columns of a schedule matrix are vectors in {0, 1, . . . , ℓ, ∞} 3 with at least one 0. The number of such lattice points is (ℓ + 2) 3 − (ℓ + 1) 3 = 3ℓ 2 + 9ℓ + 7. For ℓ = 0, 1, and 2, the number of lattice points are 7, 19, and 37, which are 2x, 6x, and 12x increases in throughput, respectively.

Fix an upper bound on delay ℓ 0. How fast can N grow if the number of tests T approaches infinity? Clearly we want to select, for each person, a delay column

such that the straight lines

are disjoint (far away) from each other. Note that every line contains one and only one delay columñ

that has at least one zero entry and no negative entries. This means that every line passes one and only one point in the "shell"

X has cardinality (ℓ + 2) T − (ℓ + 1) T ≈ T ℓ T −1 . We are ready to prove Theorem 10.

Proof of Theorem 10. To see N (ℓ + 2) T − (ℓ + 1) T ×1 , observe that every column vector in {0, 1, . . . , ℓ, ∞} T is congruent to a column vector in X modulo

To obtain a tropical code that meets the bound N = (ℓ+2) T − (ℓ + 1) T , use X per se or use a collection of column vectors that congruent to different column vectors in X.

Let us first define the schedule matrices. The proof follows. Fix a (t, n, 2)-tropical code S. Define S (k) to be this kt×n k matrix

where ω : F q → {0, . . . , q − 1} is a look-up bijection that applies to matrices entry-wisely,

α 1 , . . . , α k and β 1 , . . . , β 2k−3 are distinct elements in F q , and q is the smallest prime power max(3k − 3, n). We now use

to prove Theorem 22.

Proof of Theorem 22. First of all, if everyone is healthy or there is merely one infected person, the situation will be trivial.

Hereafter we assume that there are two infected people. Let their indices be Y = y 1 + y 2 n + · · · + y k n k−1 and Z = z 1 + z 2 n + · · · + z k n k−1 in their n-ary expansions. Let their Ct values be y * and z * , respectively. Then the first t tests teach us {(y 1 , y * ), (z 1 , z * )} or {(z 1 , min(y * , z * ))} if y 1 = z 1 or y 1 = z 1 , respectively. In general, the (jt − t + 1)th to the (jt)th tests teach us {(y j , y * ), (z j , z * )} or {(z j , min(y * , z * ))} depending on whether the digits differ or not. The next step is to sort y 1 , z 1 , . . . , y k , z k into two piles, y 1 , . . . , y k and z 1 , . . . , z k , so that we can recover Y and Z. This can be done when two patients assume different Ct values, y * = z * , in which case we know the digits of Y are those that associate to y * and the digits of Z are those that associate to z * . On the other hand, if we only see one Ct value the whole time, then y * = z * . We will utilize the last 2k − 3 tests as the checksums of a systematic Reed-Solomon code to help sorting. Here is how.

Without loss of generality, we may assume y * = 0 = z * . Then the decoding boils down to the following task: Suppose Y := ω −1 (y 1 , . . . , y k , u 1 , . . . , u 2k−3 )

and

are two codewords of a [3k−3, k, 2k−2]-Reed-Solomon code. Suppose that we know {y 1 , z 1 } to {y k , z k }. Suppose we also know min(u 1 , v 1 ) to min(u 2k−3 , v 2k−3 ). To recover Y and Z, make a guess of two codewords A := ω −1 (a 1 , . . . , a k , e 1 , . . . , e 2k−3 )

and

Let I AY be the set of coordinates i where a i = y i . Let J AY be the set of coordinates j where e j = min(e j , f j ) = min(u j , v j ) = u j . Define I AZ , J AZ , I BY , J BY , I BZ , and J BZ similarly. Then |I AY | + |I AZ | + |I BY | + |I BZ | 2k and |J AY | + |J AZ | + |J BY | + |J BZ | 2k − 3. Hence at least one of |I AY | + |J AY |, |I AZ | + |J AZ |, |I BY | + |J BY |, and |I BZ | + |J BZ | is k. Say |I BZ | + |J BZ | k. Then the Reed-Solomon code being [3k − 3, k, 2k − 2] forces B = Z. This then forces A = Y and we finish proving that schedule matrix (14) is a valid tropical code. Now assume that S BITEs. Then n/4 > 2 t . This implies n k /4 > 2 kt+2k−3 , hence schedule matrix (14) BITEs.

Let ℓ be the largest available delay. We want to prove that the number of tests needed is at most T = 4D⌈log ℓ N ⌉ + 1.

Here is an overview of the strategy.

Instead of matrix (2), we can only afford this schedule matrix 1 . .

where each unique column repeats ≈ N/ℓ times. In other words, we divide N people into ℓ piles, treat each pile as one person, and apply searching matrices (10) and (11). Now suppose that the diff-lay a − b points to the jth pile. That is, j := (a − b + ℓ − 1)/2. Similar to Lemma 39, there are three possibilities.

(i) The jth pile contains a patient and her Ct value is a.

(ii) The jth pile is healthy. Instead, the first j − 1 piles contains 1 patient and the last ℓ − j piles contains 1 patient. (iii) The jth pile does contain some patients but their Ct values are > a. Meanwhile, the first j − 1 piles contain 1 patient and the last ℓ − j piles contain 1 patient.

Apply the verify-prefetching matrix (13).

If c ♯ equals (a + b − ℓ − 1)/2, then (i) is the case and we can reduce our scope to the jth pile, which is one order of magnitude smaller. If c ♯ is greater then (a + b − ℓ − 1)/2, then (ii) is the case and the remaining population splits into two super-piles of piles

. .

. .

x N   each having at least one infected person. Now it suffices to apply the recursive algorithm to them separately. Overall, our strategy is a digit-by-digit ℓ-ary searching. In sunny days we can confirm a digit and reduce the candidates to a smaller population. In rainy days we split the population into two halves.

Proof of Theorem 41. Let there be N persons to be tested. Partition them into ℓ almost-equal piles, treat them as ℓ persons and apply Theorem 40. If there is no patient, the protocol will end the moment the first test comes out negative. Hereafter, we let D 1. Let there be D • infected piles. Then it takes 3D • + 1 4D test to single out the infected piles. To each and every infected pile, apply Theorem 40 recursively. That will single out some infected sub-piles, followed by some infected subsub-piles. And so on and so forth. It takes at most 4D⌈log ℓ N ⌉ tests to split the population down to atomic individuals. This finishes the proof.

Alongside the development of the theoretical aspects of tropical group testing, we devote the very last appendix to benchmarking the performance of matching and delaying in a practical setup. These simulations justify why tropical arithmetic-minimum and addition-are good approximation of the reality. They also clarify what design elements makes a scheme clinically-usable.

Let p ∈ [0, 1] be the prevalence rate. Let N denote the number of people to be tested. For every individual j ∈ [N ], we toss a Bernoulli coin with mean p. If the outcome of the toss is 0, then individual j is healthy and her Ct value x j is set to be 99; if the outcome of the toss is 1, then individual j is infected and her Ct value x j is drawn continuously uniformly from the interval [16, 32] [Jon+21; Hay+21; Jac+20; BMR21; BHF21]. Note that x j is not necessarily an integer. Under this setup, the viral load x j is set to be 2 −xj . Denote by

the column vector that represents the number of virus particles of the specimens from each of the N individuals.

Given a block design F and a bijection j : F → [N ], we will construct the schedule matrix S as we have always been:

where δ = ℓ · Bernoulli(1/2) ∈ {0, ℓ} is a random delay generated unbiasedly and independently. Let 2 −S be the result of, entry-wisely, 2 raised to the power of −S. That is,

One sees that we implement delaying by δ cycles by diluting by a factor of 2 δ . Now perform the PCR tests v := 2 −S 2 −x .

Its tth row, denoted by v t , is the viral load of the tth tube given the delay schedule S and the viral loads 2 −x . Let, entry-wisely, . Phase II-match: If the jth person is the main contributor of some two tubes t 1 , t 2 ∈ [T ], we will see ct 1 − S t 1 j = x j = ct 2 − S t 2 j . Conversely, whenever we see that the greatest two ct − S tj coincide, we can guess with confidence that x j = u j . In this figure, ct 1 − S t 1 j = 29 − 8 = 21 = ct 3 − S t 3 j , so x j is likely 21.

Phase I-underestimate: For any t ∈ [T ] and j ∈ [N ], the amount of virus in the tth tube is at least what was contributed by the jth person, i.e., v t 2 −Stj 2 −xj . This yields inequalities c t − log 2 (v t ) S tj + x j . We therefore define u j := max t c t − S tj to be an underestimate of x j . See Figure 9 for an example.

Phase II-match: For any person with Ct value x j and any tube t, either j has contributed the majority of the virus and hence c t ≈ S tj + x j , or someone else contributed significantly more and c t ≪ S tj + x j . Thinking backward, each c t − S tj is either x j or less than that. We thus look for person j where

For every such j, the decoder declares that j is infected and the inferred Ct value is u j . See Figure 10 for an example. Phase III-explain: Motivated by the SCOMP algorithm [ABJ14] , we want to make sure that all positive tubes are explained by some positive person. For any unexplained tube, we look at its participants and look for the one(s) that could x j1 24 guess "≈"

x j2 28 c t = 24

x j3 16 guess "≈" δ = 8 δ = 8 Figure 11 . Phase III-explain: If the tth tube is positive but not explained by any patients reported by the first two phases we will find in t's participants the most likely people and report them infected. In the figure, both j 1 and j 3 can contribute 2 −24 , which is what t has right now, so both are declared infected.

c t1 = 12 guess x j ≈ 12 c t2 = 11 c t3 = 13 δ = 8 δ = 8 Figure 12 . Phase IV-dark room: Fix a ρ; say ρ = 14. If we see u j < ρ, then the jth person could have had x j ∈ [ρ, 32] and we would not notice due to the masking effect. We declare that she is infected with Ct value u j . In general, the diagnose of each person is a function in ρ. Set a low ρ then she is infected; set a high ρ then she is healthy (unless the previous phases found clear evidence of infection).

2% 4% 6% 8% 10% 96% 97% 98% 99% 100% false positive rate (1 − specificity) true positive rate (sensitivity) p = 5% p = 6% p = 7% p = 8% p = 9% p = 10% have been the main contributor. In detail, recall that the tth tube has about 2 −ct and the jth person could have contributed at most 2 −Stj−uj . Therefore, the more negative the deficit c t − S tj − u j , the less likely j is responsible for tube t. More concisely, we look for the subset of suspects {j | c t − S tj − u j = max k c t − S tk − u k }.

We will report everyone in this subset infected, each with her own u j as the speculated Ct value. See Figure 11 for an example.

Phase IV-dark room: Consider the following scenario. A person is infected with Ct value 31. However, against our favor, the tubes she is in have Ct values 11, 12, and 13. By no means we can infer whether she is infected or not. For this type of "patients in the dark room", we set a bar ρ and report anyone whose underestimate u j is lower than ρ. This way, the set of people being diagnosed infected is a function in ρ. For lower ρ, fewer people are diagnosed infected, so the specificity is high, but the sensitivity is low. For higher ρ, it is the other way around. This ρ parametrizes an receiver operating characteristic (ROC) curve. Figure 13 shows how different prevalence rates translate into performances. false positive rate (1 − specificity) true positive rate (sensitivity) ℓ = 8 ℓ = 6 ℓ = 4 ℓ = 2 ℓ = 0 Figure 15 . Assume prevalence rate p = 10%, uniform Ct values on the interval [16, 32], 15 × 35 Kirkman triple system, and ℓ · Bernoulli(1/2) delay. We vary the limit of delay ℓ and plot the ROC curves. Figure 14 shows how different ranges of Ct values affect the performance. A larger range leads to a better performance. We infer that this is because more possible Ct values lead to fewer "collisions" and make matching easier. While this could be counterintuitive, it shows that the tropical framework is specialized at handling data that span a large range; the larger, the better. Figure 15 shows how the delay facilitates matching. Delaying further reduces the collision probability and improves the ROC tradeoff. Figure 16 shows how the distribution of the random delay is correlated to the performance. Bernoulli (uniform on {0, ℓ}) is apparently better than uniform (uniform on {0, . . . , ℓ}). We believe that this is because the former assumes a greater variance and makes collision rarer. Figure 17 shows how the size of the matrix can have impact on the performance. At the same code rate N/T = 7/3, a truncation of a larger Kirkman triple system performs better , and no delay (ℓ = 0). We consider Kirkman triple systems of different size (after truncation so that the code rate N/T = 7/3 is fixed) and plot the ROC curves.

because, presumably, the girth can be improved. That being said, there appeared to be a ceiling on how much the girth can help. All Kirkman systems used here are those returned by the kirkman_triple_system function in SageMath the mathematical software [Sage] . Figure 18 shows a comparison of tropical group testing to Tapestry [Gho+21] . Under the same number of patients (D = 10), the same matrix dimension (45 × 105), and no delay (ℓ = 0), we can also achieve 99% sensitivity and 95% specificity. What's more, our setting is more hazard. We assume uniform Ct value on [16, 32]; Tapestry assumes uniform viral load on [1, 32768]. We assume that c t is log 2 (v t ) rounded up to an integer; Tapestry assumes that c t is log 2 (v t ) shifted by 0.1Z log 2 (1.95) ≈ 0.096Z, where Z is a standard Gaussian. Figure 19 shows a comparison of Steiner systems to the block design used in P-BEST [She+20] . Per the report, P- false positive rate (1 − specificity) true positive rate (sensitivity) Figure 18 . Assume D = 10 patients within N = 105 persons (infection rate 9.52%), uniform Ct values on [16, 32], 45 × 105 Kirkman triple system (truncation of a 45 × 330 Kirkman triple system), and no delay (ℓ = 0). We plot 10 ROC curves. Each curve is 10,000 encoding-decodings, i.e., 450,000 tubes, 100,000 patients, and 1,050,000 test takers. Compare this to Tapestry's data point and its standard deviations (4.50% ± 2.41%, 99.30% ± 2.55%) ( (2, 4, 97)-S (2, 3, 49)-S K 17 P-Best Figure 19 . Assume prevalence rate p = 2%, uniform Ct values on [16, 32], and no delay (ℓ = 0). We consider (2, 4, 97)-Steiner system (aka 2-(97, 4, 1) design), (2, 3, 49)-Steiner system (aka 2-(49, 3, 1) design), complete graph on 17 vertices, and P-BEST [She+20] . They all have code rate N/T = 8. We plot their ROC curves.

BEST is applicable when the prevalence rate is < 1.3%. 11 We found that, under tropical group testing, the same matrix performs rather good til p = 2%. That being said, there are block designs at the same code rate (N/T = 8) that perform better. For instance, at p = 2%, the complete graph on 17 vertices, which is by definition a (2, 2, 17)-Steiner system, has a better sensitivity-specificity tradeoff with fewer vertices and simpler pipetting rules. The (2, 3, 49)-Steiner system (returned by the steiner_triple_system function in [Sage] ) has a better tradeoff when p ∈ [1%, 2%]. The (2, 4, 97)-Steiner system [RR12, Theorem 2.2] has a better tradeoff across all 11 They claimed so because their experiments and simulations show that P-BEST, with high probability, can identify 5 patients among 384 suspects. It it unclear from their paper what happens when there are more than 5 patients.

1% , and no delay (ℓ = 0). We consider Kirkman triple system on 183 vertices, complete 3-uniform hypergraph on 15 vertices, and complete graph on 61 vertices. The first two have code rate N/T = 30 + 1/3; the last one has code rate N/T = 30. We plot their ROC curves.

p ∈ [0%, 2%]. Figure 20 shows a comparison of three designs at p = 0.5%. Kirkman triple system on 183 vertices performs better than the other two at the cost of reasonable subpacketization-it uses 183 tests on 5551 persons. Complete 3-uniform hypergraph on 15 vertices, using 15 tests on 15

Assessment of Specimen Pooling to Conserve SARS CoV-2 Testing Resources

Group Testing Algorithms: Bounds and Simulations

Pooled Testing and Its Applications in the COVID-19 Pandemic

Group Testing: An Information Theory Perspective

SARS-CoV2 Testing: The Limit of Detection Matters

Pooling strategies for COVID-19 testing

Proofs from The Book. Sixth. See corrected reprint of the 1998 original [ MR1723092], Including illustrations by

Winning ways for your mathematical plays

Bounds for binary constant weight codes

6, 12) and related Steiner systems

Comparison of SARS-CoV-2 viral load in saliva samples in symptomatic and asymptomatic cases

A Bound on the Number of Edges in Graphs Without an Even Cycle

Geometric group testing

Group testing as a strategy for COVID-19 epidemiological monitoring and community surveillance

Sublinear-Time Non-Adaptive Group Testing With O(k log n) Tests via Bit-Mixing Coding

Optimal Non-Adaptive Probabilistic Group Testing in General Sparsity Regimes

Handbook of Combinatorial Designs

Semiquantitative Group Testing in at Most Two Rounds

Exploring the missing link among dseparable,d-separable and d-disjunct matrices

Improved Constructions for Non-adaptive Threshold Group Testing

Optimal group testing

Multi-Level Group Testing with Application to One-Shot Pooled

ICASSP 2021 -2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2021

Introduction to Algorithms

Threshold Group Testing

Combinatorial Group Testing and Its Applications. 2nd. WORLD SCIENTIFIC

The Mathematics of Mass Testing for COVID-19

The Detection of Defective Members of Large Populations

Bounds on the length of disjunctive codes

Superimposed Codes and Threshold Group Testing

Extensions of Generalized Product Caps

Group testing for non-uniformly quantized adder channels

Semiquantitative Group Testing

Code Construction and Decoding Algorithms for Semi-Quantitative Group Testing With Nonuniform Thresholds

Strongly separable matrices for nonadaptive combinatorial group testing

Quantitative Group Testing and the rank of random matrices

Direct Comparison of SARS-CoV-2 Analytical Limits of Detection across Seven Molecular Assays

Amplification Curve Diagnostics for Covid-19 Group Testing. 2021

Quantitative Group Testing in the Sublinear Regime

A Compressed Sensing Approach to Pooled RT-PCR Testing for COVID-19 Detection

A Compressed Sensing Approach to Pooled RT-PCR Testing for COVID-19 Detection

Coin-Weighing Problems

SARS-CoV-2 detection in formalin-fixed paraffin-embedded tissue specimens from surgical resection of tongue squamous cell carcinoma

Lower bounds for constant weight codes

Estimating epidemiologic dynamics from cross-sectional viral load distributions

Two-Stage Adaptive Pooling with RT-QPCR for Covid-19 Screening

HYPER: Group testing via hypergraph factorization applied to COVID-19

A Tale of Two Coins

Group testing to identify one defective and one mediocre item

On the Optimality of the Kautz-Singleton Construction in Probabilistic Group Testing

Viral load of SARS-CoV-2 across patients and compared to other respiratory viruses

A tractable non-adaptative group testing method for non-binary measurements

Estimating infectiousness throughout SARS-CoV-2 infection course

Essentials of tropical combinatorics

The existence of designs

Nonrandom binary superimposed codes

SAFFRON: A Fast, Efficient, and Robust Framework for Group Testing Based on Sparse-Graph Codes

Constructions and Comparisons of Pooling Matrices for Pooled Testing of COVID-19

Pooling of samples for testing for SARS-CoV-2 in asymptomatic people

Optimum Detection of Defective Elements in Non-Adaptive Group Testing

Introduction to Tropical Geometry. Graduate Studies in Mathematics

Lower bounds on mt(r, s)"

Evaluation of sample pooling for screening of SARS CoV-2

A pooled testing strategy for identifying SARS-CoV-2 at low prevalence

Efficiently Decodable Error-Correcting List Disjunct Matrices and Applications

Compressed Sensing and Group Testing, Part I (Fall 2011 Seminar)

Simpler and faster Covid-19 testing: Strategies to streamline SARS-CoV-2 molecular assays

Practical High-Throughput, Non-Adaptive and Noise-Robust SARS-CoV-2 Testing

Researchers are using algorithms to tackle the coronavirus test shortage: The scramble to develop new test kits that deliver faster results -[Spectral Lines

Improving the Reliability of Pooled Testing with Combinatorial Decoding and Compressed Sensing

Explicit Nonadaptive Combinatorial Group Testing Schemes

SARS-CoV-2 viral load predicts COVID-19 mortality

Steiner systems S(2, 4, v)-a survey

The Sage Developers. SageMath, the Sage Mathematics Software System (Version 9

Evaluation of Group Testing for SARS-CoV-2 RNA

Efficient high-throughput SARS-CoV-2 testing to detect asymptomatic carriers

Rapid, large-scale, and effective detection of COVID-19 via non-adaptive testing

SARS-CoV-2 detection, viral load and infectivity over the course of an infection

Continuous Fluorescence Monitoring of Rapid Cycle DNA Amplification

Evaluation of COVID-19 RT-qPCR Test in Multi sample Pools

Error Correction Codes for COVID-19 Virus and Antibody Testing: Using Pooled Testing to Increase Test Reliability

Low-Cost and High-Throughput Testing of COVID-19 Viruses and Antibodies via Compressed Sensing: System Concepts and Computational Experiments

Group Testing Large Populations for SARS-CoV-2