key: cord-0625721-v3hrebj4 authors: Wolff, Timo de; Pfluger, Dirk; Rehme, Michael; Heuer, Janin; Bittner, Martin-Immanuel title: Evaluation of Pool-based Testing Approaches to Enable Population-wide Screening for COVID-19 date: 2020-04-24 journal: nan DOI: nan sha: d2ed14bc0e1e302886110f07e1bbcc88cbc3e28d doc_id: 625721 cord_uid: v3hrebj4

Background: Rapid testing for an infection is paramount during a pandemic to prevent continued viral spread and excess morbidity and mortality. This study aimed to determine whether alternative testing strategies based on sample pooling can increase the speed and throughput of screening for SARS-CoV-2.

Methods: A mathematical modelling approach was chosen to simulate six different testing strategies based on key input parameters (infection rate, test characteristics, population size, testing capacity etc.). The situations in five countries (US, Germany, UK, Italy and Singapore) currently experiencing COVID-19 outbreaks were simulated to reflect a broad variety of population sizes and testing capacities. The primary study outcome measurements, which were finalised prior to any data collection, were time and number of tests required; number of cases identified; and number of false positives.

Findings: The performance of all tested methods depends on the input parameters, i.e. the specific circumstances of a screening campaign. To screen one tenth of each country's population at an infection rate of 1% (e.g. when prioritising frontline medical staff and public workers), realistic optimised testing strategies enable such a campaign to be completed in ca. 29 days in the US, 71 in the UK, 25 in Singapore, 17 in Italy and 10 in Germany (ca. eight times faster compared to individual testing). When infection rates are considerably lower, or when employing an optimal, yet logistically more complex pooling method, the gains are more pronounced. Pool-based approaches also reduce the number of false positive diagnoses by 50%.
Interpretation: The results of this study provide a clear rationale for adoption of pool-based testing strategies to increase the speed and throughput of testing for SARS-CoV-2. The current individual testing approach unnecessarily wastes valuable time and resources.

Evidence before this study
Pool-based or group testing of samples is a faster and more resource-efficient alternative to individual testing. Its mathematical basis was first described more than 70 years ago. Adoption in the medical field has been seen in transfusion medicine and screening for STDs. During a pandemic, rapid testing for an infection is paramount, yet resources are often severely limited, calling for more effective testing approaches. We searched journal reference lists, preprint servers, and PubMed for relevant literature pertaining to the use of pool-based testing during COVID-19 published until 17 April 2020. Search terms were pool test, group test, mass testing, and screening, in combination with SARS-CoV-2 and COVID-19. Non-peer-reviewed studies were judged for inclusion based on quality, assessed by two authors independently.

Recent publications presented 2-level pooling with optimised pool size, matrix-based approaches and binary splitting as potential testing strategies for mass screening during COVID-19. In this study, we simulate models of 2-level pooling, matrix-based approaches and two binary splitting approaches as well as additional methods. We benchmark and compare the various testing strategies, and demonstrate the optimal choice for various scenarios. This study is therefore the first of its kind to provide a theoretical framework and practical guidance on how to best deploy limited resources to maximise the number of people tested in the shortest time.
Implications of all the available evidence
Adopting pool-based testing significantly increases testing capacity during pandemics such as COVID-19 while also conserving precious resources and minimising health costs.

Funding: Young Academy of the German National Academy of Sciences; German Research Foundation.

Pandemics such as COVID-19 pose a significant public health threat, leading to morbidity, mortality, and rapid and significant strain on the health system. Faced with the global spread of a novel pathogen, the identification of cases and carriers and elucidation of patterns of transmission is paramount. The ability to rapidly and reliably diagnose those infected is critical to 1) identify and control clusters of infection; 2) prepare the health system for the patient numbers to be expected; 3) deploy medical countermeasures in a targeted way; and 4) assess the effectiveness of any public health measures and adapt them accordingly. The speed of testing is critical, but is limited by supply of and access to diagnostic tests, logistical challenges, and shortages in qualified personnel and/or laboratory facilities that could perform the necessary tests. In each of the above scenarios, maximising the number of people that can be tested in a given time is essential, and will save lives. For the US, this was recently highlighted as an urgent priority by Scott Gottlieb, the former FDA commissioner, and Paul Romer, the Nobel Prize-winning economist [1,2]. For the UK, universal weekly testing has been proposed as the only viable exit strategy from the current country-wide lockdown [3,4]. One potential approach to increase testing efficiency is pooling of different samples in one test, a well-validated method used e.g. in transfusion medicine for HIV testing that was recently experimentally deployed for SARS-CoV-2 in a small-scale study in California [5,6].

Using a simulation approach, this study aims to identify the most effective testing strategy by comparing six mathematical procedures for mass testing a given population for infection with SARS-CoV-2: individual testing; 2-level pooling; binary splitting; recursive binary splitting; Sobel-R1; and Purim, a matrix-based group test. The primary objective is to identify as many cases as quickly as possible with a given limited testing capacity. In other words, we aim to deploy the available tests as effectively as possible, increasing the identified cases per test (ICPT) while saving precious time and resources.

Assumptions. We simulate a screening campaign aimed at an entire population with an estimated infection rate (including unreported cases) of ir = 1%. We assume a testing capacity of c tests (e.g. PCR-based) per day, and that each test takes 5 hours to process in a clinical laboratory, resulting in a capacity of 5c/24 tests in parallel (for reasons of consistency, sample logistics are not included as an input parameter). In terms of test characteristics we assume a sensitivity of p = 0·99 based on data reported by LabCorp to the FDA and an estimated false-positive rate of q = 0·01 [7]. The test characteristics are assumed to remain constant after pooling (based on a maximum pool size of k = 32, which has been shown to provide reliable results for COVID-19) [8]. We simulate all six methods in repeats of ten to obtain robust statistical estimates for expectation values and standard deviation.

Testing strategies. We summarise the methods and illustrate them in Figure 1.

Individual testing: The conventional approach of testing every person in a given population individually.

2-level pooling: Following a recent preprint by Hanel and Thurner, samples are combined into pools of at most k; if a pooled test is positive, every individual in that pool is then tested separately [9]. This procedure was first introduced by Dorfman in 1943 and improved by Sterrett in 1957 [10,11].

Binary splitting: A well-known hierarchical multi-layer procedure; if the test of a pool of size k is positive, the group is split into two sets of size k/2, and a pooled test is performed on each of the two new sets [12]. This procedure is repeated recursively for those subsets with a positive (pooled) test until each individual case has been identified.

Optimised recursive binary splitting: A recent variation of binary splitting suggested by Cheng et al.; if at a given level of the hierarchy only one pool tests positive, a single case is identified via binary search and removed from the pool, and the procedure continues with the remaining subjects in the re-unified pool [13]. We improved the method by choosing optimal initial pool sizes based on the infection rate.

Purim: A matrix-based pooling approach suggested by Fargion et al. in a recent preprint. One-dimensional overlapping pools are arranged in a matrix, and only cross-sections of positive pool tests are tested individually [14].

Sobel-R1: A decision tree approach based on the assumption of a binomial distribution of the test results. Pool sizes are adapted so as to minimise the expected number of remaining tests [12]. If the infection rate is known, it is a stochastically optimal search variant and therefore serves as an upper bound.

For all pooling approaches we consider a maximum initial pool size of 32. However, the initial pool size can be optimised depending on the estimated rate of infection following, e.g., Hanel & Thurner and Xiong et al. (see Section 5 for further details) [9,15]. For Purim, we only consider its 2D method and neglect the 3D variant; the latter becomes impractical for low infection rates, requiring handling of up to 32^3 = 32,768 samples at the same time.

We have implemented all methods in Python. We generated random instances of populations using Python's numpy random number generator. The code is available at https://github.com/SC-SGS/covid19-pooling.
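To make the pooling logic concrete, two of the strategies described above can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (sensitivity p and false-positive rate q independent of pool size, independent test errors); it is not the authors' implementation, which lives in the repository linked above. The function names `run_test`, `two_level_pooling` and `binary_splitting` are hypothetical.

```python
import numpy as np


def run_test(pool, rng, p=0.99, q=0.01):
    """Simulate one (pooled) test on a boolean array of true infection states.

    Positive with probability p if any member is infected, else with probability q.
    """
    return rng.random() < (p if pool.any() else q)


def two_level_pooling(infected, k, rng, p=0.99, q=0.01):
    """2-level pooling: test pools of size k, then every member of a positive pool.

    Returns (number of tests used, list of indices reported positive).
    With k = 1 this reduces to individual testing.
    """
    tests, found = 0, []
    for start in range(0, len(infected), k):
        pool = infected[start:start + k]
        tests += 1
        if not run_test(pool, rng, p, q):
            continue
        if len(pool) == 1:  # the pooled test already was the individual test
            found.append(start)
            continue
        for i in range(start, start + len(pool)):
            tests += 1
            if run_test(infected[i:i + 1], rng, p, q):
                found.append(i)
    return tests, found


def binary_splitting(infected, k, rng, p=0.99, q=0.01):
    """Binary splitting: halve every positive pool and test both halves again."""
    tests, found = 0, []
    n = len(infected)
    stack = [(s, min(s + k, n)) for s in range(0, n, k)]
    while stack:
        lo, hi = stack.pop()
        tests += 1
        if not run_test(infected[lo:hi], rng, p, q):
            continue
        if hi - lo == 1:  # a single positive sample: report it
            found.append(lo)
        else:
            mid = (lo + hi) // 2
            stack += [(lo, mid), (mid, hi)]
    return tests, found


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    population = rng.random(50_000) < 0.01  # ir = 1%, illustrative population size
    for name, method, k in [("individual", two_level_pooling, 1),
                            ("2-level", two_level_pooling, 32),
                            ("binary splitting", binary_splitting, 32)]:
        tests, found = method(population, k, rng)
        true_positives = sum(bool(population[i]) for i in found)
        print(f"{name}: tests={tests}, ICPT={true_positives / tests:.4f}")
```

In this sketch the ICPT is the number of correctly identified cases divided by the number of tests used. Pool sizes are fixed at 32 here rather than optimised for the infection rate, so the numbers it prints are only indicative.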
Role of the funding source. TdW, DP and MIB are supported by the Young Academy of the German National Academy of Sciences. TdW is funded by the German Research Foundation (DFG) grant WO 2206/1-1 under the Emmy Noether Programme. DP and MR are funded by the Cluster of Excellence Data Integrated Simulation Science under Germany's Excellence Strategy (EXC 2075, DFG grant 390740016). The authors' funding sources did not have any involvement in study design; in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the paper for publication.

We employed six different methods to model the most effective strategy to screen a given population within the shortest time and with the smallest number of tests. We found that the performance of all tested methods strongly depends on the input parameters, i.e. the specific circumstances of a screening campaign. This includes the size and infection rate of a population, but also the test characteristics (esp. sensitivity). To enable the reader to find the best possible strategy for their specific needs, the models we developed can be fully customised and run via https://covid19.enfunction.com.

Figure 3a compares the effectiveness in terms of ICPT of all methods for different infection rates under the assumptions shown in the methods section. Here, the Sobel-R1 approach is the optimal method and provides a theoretical upper bound for the achievable ICPT. However, this method is restricted in its practical applicability (see below). For an infection rate of 1%, the hierarchical approaches of binary and recursive binary splitting show almost eight-fold and ten-fold ICPT increases, respectively, compared to the current status quo of individual testing, enabling the testing of large patient groups with a single test.
Disregarding Sobel-R1, we obtain the following results: for infection rates up to 6%, recursive binary splitting is the optimal method; the next best option is binary splitting up to 2·5%, and the matrix-based Purim method between 2·5% and 6%. For infection rates between 6% and 12%, Purim is the optimal method. For infection rates of 12% and higher, 2-level pooling with optimised pool sizes yields the highest ICPT. Of note, the current standard approach of individual testing is not the best choice in any of the modelled scenarios.

To better examine the potential of hierarchical approaches, we deployed our models to simulate the situation for five countries: the US, the UK, Germany, Italy and Singapore (see Table 1). We assume that as of early April ir = 1% of the population is infected. Figure 3b shows the overall time required to test the whole population in each of the five countries. Even in Germany, which has the highest relative testing capacity per day of these five countries, it would take 675 days with the current screening approach (individual testing) to test every individual; the US, on the other hand, would require 2,244 days. For the US, binary splitting reduces the time required to 285 days (about 9·5 months), and the optimised recursive binary splitting to 232 days. If only one tenth of the population needs to undergo screening (e.g. when prioritising frontline medical staff and public workers), such a campaign could be completed with binary splitting in about 29 days in the US, 71 in the UK, 25 in Singapore, 17 in Italy and 10 in Germany. When the infection rates are considerably lower, or when employing the optimal, yet logistically more challenging pooling method, the gains are more pronounced. With Sobel-R1, the same screening campaign could be completed in 21 days in the US, 52 in the UK, 18 in Singapore, 12 in Italy and 7 in Germany.
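The US figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the day counts stated in the text and assumes that campaign duration scales linearly with the number of people screened (constant daily capacity); the dictionary and function names are hypothetical.

```python
import math

# Days needed to screen the whole US population, as stated in the text.
US_DAYS = {
    "individual": 2244,
    "binary splitting": 285,
    "recursive binary splitting": 232,
}


def speedup(method, baseline="individual", days=US_DAYS):
    """How many times faster `method` is than `baseline`."""
    return days[baseline] / days[method]


def partial_campaign_days(full_days, fraction):
    """Days for screening a fraction of the population.

    Assumes linear scaling with population size; a started day counts as a full day.
    """
    return math.ceil(full_days * fraction)


if __name__ == "__main__":
    for name in ("binary splitting", "recursive binary splitting"):
        print(f"{name}: {speedup(name):.1f}x faster than individual testing")
    print("one tenth, binary splitting:", partial_campaign_days(285, 0.1), "days")
```

Under these assumptions binary splitting comes out close to eight times faster and recursive binary splitting close to ten times faster than individual testing, in line with the ICPT gains reported for an infection rate of 1%.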
As a case study, we conducted a comparison using the six approaches for three different infection rates for the US (see Table 2). It is important to note that a screening campaign based on a hierarchical approach on average identifies fewer cases than individual testing. Assuming a test sensitivity of 0·99, hierarchical testing carries a probability of 0·01 (1%) of missing a given case at each of several test stages. Figure 3c shows that the (compounded) expected rate of identified cases (true positives) therefore drops by between 1% and 5% in total. In contrast, the likelihood of incorrectly classifying a subject as infected is reduced from 1% to almost 0% (Figure 3d). This reduced number of false positives has significant economic and health system consequences, as it equates to ca. 1,000 people who would otherwise have been erroneously quarantined in a population of 100,000. At the same time, binary and recursive binary splitting correctly identify up to ten times more cases per test compared to the currently employed individual testing. Furthermore, they enable multiple screening campaigns within the time normally required for a single screening campaign in which each sample is tested individually.

Note that binary splitting requires up to six sequential testing steps for a single sample. Purim and 2-level testing can be carried out in two sequential testing steps. For recursive binary splitting and Sobel-R1, re-pooling can lead to large numbers of sequential stages (up to 17 and 23 hierarchical steps, respectively, in the worst-case scenario of multiple cases in a pool). For Sobel-R1 as the (theoretically) optimal method, at an infection rate of 1%, only 5% of our tests could be carried out with at most 5 hierarchical steps, whereas 95% could be carried out with at most 13 hierarchical steps.
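The compounding effect described above can be illustrated with a simple independence model: a case is only identified if it tests positive at every stage it passes through. The sketch below is an illustration under that assumption, not the authors' simulation, and need not reproduce Figure 3c exactly; the function names are hypothetical.

```python
def detection_probability(sensitivity, stages):
    """Probability that a true case tests positive at each of `stages` sequential tests.

    Assumes test errors are independent across stages.
    """
    return sensitivity ** stages


def false_positives_individual(population, infection_rate, false_positive_rate):
    """Expected number of uninfected people flagged when everyone is tested once."""
    return population * (1 - infection_rate) * false_positive_rate


if __name__ == "__main__":
    for stages in (1, 2, 5, 6):
        prob = detection_probability(0.99, stages)
        print(f"{stages} stage(s): P(detect) = {prob:.4f}, loss = {1 - prob:.2%}")
    print("expected false positives per 100,000:",
          round(false_positives_individual(100_000, 0.01, 0.01)))
```

With p = 0·99, one to five stages give a loss of roughly 1% to 5% under this model, and individual testing of 100,000 people at ir = 1% and q = 0·01 yields about 990 expected false positives, consistent with the "ca. 1,000" quoted above.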
To the best of our knowledge, this study presents the first comprehensive, comparative assessment of optimised testing strategies for effective large-scale screening for infection with SARS-CoV-2. Our simulations indicate that population-level diagnostics in a pandemic of the scale of COVID-19 will only be possible by making use of pool-based strategies; otherwise, testing even 10% of the UK population at the current testing capacity would take about one and a half years. Our study enables the following key conclusions for mass testing for COVID-19:

(1) Screening the entire population can be carried out several times faster via hierarchical or matrix-based approaches compared to individual testing in every scenario considered.
(2) Among the immediately applicable methods, we see that with respect to an optimal ICPT:
(a) Binary splitting is the best method for infection rates between 0% and 2·5%.
(b) Purim is the best method for infection rates between 2·5% and 12%.
(c) 2-level pooling is the best method for infection rates beyond 12%.
(3) Recursive binary splitting and the Sobel-R1 method would allow a significant improvement of the ICPT for low infection rates and in general, respectively. However, to be practically applicable, three conditions must be met. First, either the time required to process each individual test needs to be reduced or the possible storage time of samples needs to be increased by a factor of ca. two, given the average number of hierarchical steps expected. Second, sufficient sample material has to be available for a large number of repeat tests. Third, software support for lab technicians needs to be provided to guide lab operations and ensure that the method can be carried out swiftly and correctly.

Binary splitting requires up to six sequential testing steps for a single sample. With an assumed duration of 5 hours per test, individual results are available after 5 × 6 = 30 h and six subdivisions of the sample for a pool size of 32, which we consider a realistic scenario.
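The turnaround time of a hierarchical method follows directly from its number of sequential stages. As a rough sketch (ignoring sample logistics and laboratory scheduling, and using hypothetical function names), the stage count for binary splitting and the resulting worst-case time to an individual result can be computed as follows:

```python
import math


def binary_splitting_stages(pool_size):
    """Sequential test stages needed to go from one pool down to single samples.

    A pool of 32 is halved five times (32, 16, 8, 4, 2, 1), giving six stages.
    """
    return math.ceil(math.log2(pool_size)) + 1


def hours_to_result(stages, hours_per_test=5):
    """Worst-case time until an individual result, with one test run per stage."""
    return stages * hours_per_test


if __name__ == "__main__":
    stages = binary_splitting_stages(32)
    print("binary splitting, k = 32:", stages, "stages,", hours_to_result(stages), "h")
    print("two-stage methods (Purim, 2-level pooling):", hours_to_result(2), "h")
```

The same helper makes it easy to see how a shorter test duration or a smaller initial pool size would shorten the waiting time for an individual result.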
Purim and 2-level testing can be carried out in two sequential testing steps. Thus, these methods yield a conclusion after 10 h and can realistically be applied in practice as of now. Of note, all hierarchical models require adequate scheduling of the processes within the involved laboratories, as the performance of certain tests is conditional upon the outcome of previous tests. However, these scheduling problems are well-studied, standard optimisation problems.

In recent preprints published in the course of the COVID-19 pandemic, 2-level pooling with optimised pool size, matrix-based approaches for pooled screening and binary splitting were presented [9,14,16-20]. Our simulations include models of all of these central categories of approaches, benchmarking and comparing the various testing strategies, and demonstrating for which scenarios each method can become the optimal choice. Even though the testing strategy will differ depending on the stage of the pandemic (early phase of an outbreak, exponential phase, etc.), one of the most important priorities shared by all scenarios is to identify and isolate as many infected individuals per day as possible. Our model calculations prioritise identifying infected individuals as soon as possible to prevent further spread, i.e. optimising for speed at the expense of a certain degree of sensitivity (especially since regular re-testing will be necessary given the continuous risk of infection and potentially even recurrence, as highlighted by recent case reports).

Of course, differences in test characteristics (sensitivity, rate of false positives, and processing time) will have a significant impact on the outcomes. In particular, test sensitivities were recently reported to be in the range of 0·75 for sputum or nasal swabs [21]. However, based on the same study, the decreased sensitivity can mostly be attributed to the quality of the samples and to variations in the distribution of viral load in patients; in other words, the samples taken will or will not contain viral material that could be amplified during the PCR tests (with their sensitivity remaining constant at 0·99). Thus, if a sample contains no viral material, then any method will fail, whether individual or pooled. If a sample, however, contains viral material, pooling, which happens after sample taking, can be conducted and is advisable as discussed in this paper. For illustration, a 0·75 test sensitivity scenario is given in the supplementary material, showing that even under these assumptions, pooling is still advisable (in addition, all variations can be modelled via our online tool https://covid19.enfunction.com).

One of the biggest caveats of any modelling approach is the need to show that theoretical simulations can be successfully translated into public health measures. Hogan et al. and Yelin et al. recently presented their findings on the practical applicability of sample pooling in California and Israel, respectively, thereby providing important experimental validation for our modelling approach [6,8]. In fact, Yelin et al. even showed that a pool size of up to 64 still provides an acceptable sensitivity, potentially enabling larger pool sizes that would increase the benefits of hierarchical pooling approaches. On 13 April 2020, the Indian Council of Medical Research published new guidance recommending limited 2-level pooling to increase screening capacity; to the best of our knowledge, this is the first and only pooling approach formally adopted as of now, but certainly not the last [22].

Rapid identification of patients, asymptomatic carriers, and the modes of transmission of a given pathogen is a key goal of pandemic response that can then be embedded into a larger set of medical countermeasures [23]. This study provides a theoretical framework and practical guidance to frontline medical staff, public health authorities, and governments on how to best deploy limited testing resources to maximise the number of people tested in the shortest amount of time possible, in the case of COVID-19 as well as future pandemic outbreaks.

The authors would like to thank Professor Roy Kishony (Technion, Israel Institute of Technology) for valuable contributions on PCR test characteristics; Dr Shmona Simpson (Bill and Melinda Gates Foundation, USA) for valuable discussions regarding planning and implementation of pandemic response measures; Dr Stefan Zimmer (University of Stuttgart, Germany) for assistance in code review and discussions regarding optimised pooling approaches; Dr Nikita Kaushal (Nanyang Technological University, Singapore) for manuscript review; and Dr Semen Trygubenko (Arctoris, UK) for assistance in implementing the online tool that allows for user-directed modelling for different testing scenarios.

TdW, DP and MIB conceived the study and were in charge of overall direction, planning, and manuscript writing. TdW, DP, MR and JH wrote the code and MR performed the simulations. All authors provided critical feedback and helped shape the research, analysis and manuscript.

MIB is a shareholder and director of Arctoris Ltd. The remaining authors declare no potential conflicts of interest.

Given the nature of the work, the study was exempted from ethics approval.

With increasing infection rate, the optimal pool size decreases (with the exception of the Sobel-R1 method) until it approaches pool-size 1. Parameters: sensitivity p = 0·99, false positive rate q = 0·01, population 50,000, test duration 5 h, averaged over 10 runs.

The following Figures S1 and S2 complement our study with plots for a decreased test sensitivity of p = 0·75.

Figure S1. The best pool-size (lowest total time) depends on the infection rate, here for ir = 1%, 10%, 20%.
Parameters: sensitivity p = 0·75, false positive rate q = 0·01, population 50,000, test duration 5 h, averaged over 10 runs.

Figure S2. Screening the whole population. Parameters: sensitivity p = 0·75, false positive rate q = 0·01, test duration 5 h, averaged over 10 runs. Optimal (max.) pool-size each (cf. (2)); for ir = 1% as in (S2b)-(S2d) we obtain individual testing: 1; 2-level pooling: 12; binary splitting: 32; recursive binary splitting: 32; Purim: 31; Sobel-R1: 32.

1. A National COVID-19 Surveillance System: Achieving Containment. Duke-Margolis Center for Health Policy
2. Testing Is Our Way Out
3. Universal weekly testing as the UK COVID-19 lockdown exit strategy. The Lancet
4. Covid-19 mass testing facilities could end the epidemic rapidly
5. Detection of Acute Infections during HIV Testing in North Carolina
6. Sample Pooling as a Strategy to Detect Community Transmission of SARS-CoV-2
7. Accelerated Emergency Use Authorization (EUA) Summary COVID-19 RT-PCR Test. Laboratory Corporation of America
8. Evaluation of COVID-19 RT-qPCR test in multi-sample pools. Preprint, medRxiv
9. Boosting test-efficiency by pooled testing strategies for SARS-CoV-2. Preprint, arXiv
10. The Detection of Defective Members of Large Populations
11. On the detection of defective members of large populations
12. Group Testing To Eliminate Efficiently All Defectives in a Binomial Sample
13. A new strongly competitive group testing algorithm with small sequentiality
14. Purim: a rapid method with reduced cost for massive detection of CoVid-19
15. Determination of Varying Group Sizes for Pooling Procedure. Computational and Mathematical Methods in Medicine
16. Large-Scale, and Effective Detection of COVID-19 Via Non-Adaptive Testing
17. Evaluation of Group Testing for SARS-CoV-2 RNA. medRxiv
18. Efficient and Practical Sample Pooling for High-Throughput PCR Diagnosis of COVID-19. medRxiv
19. Increasing testing throughput and case detection with a pooled-sample Bayesian approach in the context of COVID-19
20. Pooling RT-PCR or NGS samples has the potential to cost-effectively generate estimates of COVID-19 prevalence in resource limited environments
21. Evaluating the accuracy of different respiratory specimens in the laboratory diagnosis and monitoring the viral shedding of 2019-nCoV infections. medRxiv
22. Advisory on feasibility of using pooled samples for molecular testing of COVID-19. Indian Council of Medical Research
23. Disease X: accelerating the development of medical countermeasures for the next pandemic. The Lancet Infectious Diseases