key: cord-0929934-v7nbck0n authors: Barak, N.; Ben-Ami, R.; Sido, T.; Perri, A.; Shtoyer, A.; Rivkin, M.; Licht, T.; Peretz, A.; Magenheim, J.; Fogel, I.; Livneh, A.; Daitch, Y.; Oiknine-Djian, E.; Benedek, G.; Dor, Y.; Wolf, D. G.; Yassour, M. title: Lessons from applied large-scale pooling of 133,816 SARS-CoV-2 RT-PCR tests date: 2020-10-20 journal: nan DOI: 10.1101/2020.10.16.20213405 sha: c9fa69afb11655fe0cfb384f808836a47d0a0f71 doc_id: 929934 cord_uid: v7nbck0n
Pooling multiple swab samples prior to RNA extraction and RT-PCR analysis was proposed as a strategy to reduce costs and increase the throughput of SARS-CoV-2 tests. However, reports on practical large-scale group testing for SARS-CoV-2 have been scant. Key open questions concern reduced sensitivity due to sample dilution; the rate of false positives; the actual efficiency (number of tests saved by pooling); and the impact of the infection rate in the population on assay performance. Here we report an analysis of 133,816 samples collected between April and September 2020 and tested in pools for the presence of SARS-CoV-2. We spared 76% of RNA extraction and RT-PCR tests, despite a frequently changing prevalence rate (0.5%-6%). Surprisingly, we observed pooling efficiency and sensitivity that exceeded theoretical predictions, a result of the non-random distribution of positive samples in pools. Overall, the findings strongly support the use of pooling for efficient, large-scale, high-throughput SARS-CoV-2 testing.
The ongoing COVID-19 pandemic, caused by SARS-CoV-2, has resulted in substantial clinical morbidity and mortality, creating an urgent need for comprehensive virological testing. Major diagnostic challenges have emerged, chief among them the need for high-throughput SARS-CoV-2 RT-PCR tests that detect not only symptomatic but also asymptomatic infectious viral carriers and that screen special or at-risk populations (such as health care personnel or nursing home residents), in order to contain viral spread and guide control measures.
These diagnostic challenges, together with the consequent shortage of laboratory equipment, reagents and resources, call for a more efficient testing strategy. One promising solution is sample pooling, or group testing, a well-developed field in mathematics that allows the identification of carriers in a population of N using fewer than N tests. Group testing can alleviate supply-chain bottlenecks and cut costs while increasing testing throughput. Sample pooling techniques differ in the number and size of the pools to which each sample is assigned. In Dorfman pooling (1), the simplest pooling scheme, each sample is assigned to a single pool, all pools contain the same number of samples, and samples are retested individually only if their pool tests positive. In other pooling methods, samples are assigned to multiple overlapping pools in order to eliminate, or at least reduce, the number of retested samples (2-5).
The commonly used diagnostic test for SARS-CoV-2 is based on detection of viral RNA in nasopharyngeal samples by RT-PCR amplification following RNA extraction. Pooling in this context could be employed at any stage of the diagnostic workflow, from pooled sample collection, to pooled RNA extraction and RT-PCR, to pooled final RT-PCR only (2, 6-14), with each approach having pros and cons with regard to test saving versus the logistics and delays associated with patient and sample re-testing. We and others have recently described the validation and early implementation of sample pooling for SARS-CoV-2 detection (2, 6-13, 15-17), and, perhaps reflecting increased confidence in this approach, the Food and Drug Administration (FDA) issued the first Emergency Use Authorization (EUA) for pooled testing of SARS-CoV-2 in July 2020 (18).
The majority of these studies have employed Dorfman pooling (with 4-32 samples per pool) and, while differing widely in protocols and in the stage at which samples are pooled, have suggested sufficient diagnostic accuracy despite an expected loss of sensitivity. When considering any SARS-CoV-2 pooling scheme, there are three crucial concerns. Efficiency: how many tests are spared in practice, and how does this depend on the prevalence rate? Sensitivity: can we detect samples with a low but clinically significant viral load despite sample dilution? Operational feasibility: can we technically and logistically implement the pooling scheme and quickly adapt it to changes in infection prevalence? These concerns cannot be fully addressed by the studies reported to date, as these were conducted as proofs of concept, comprising only hundreds to a few thousand tested samples examined over a short period with a relatively constant rate of positive samples (usually <1%).
Here, we describe lessons learned from a five-month period of testing 133,816 samples in 17,945 pools. Based on early evidence, theoretical considerations and practical limitations, we chose to implement adaptive Dorfman pooling with pool sizes of 5 and 8. We evaluate the theoretical and empirical efficiency and sensitivity of our pooling approach, and its adaptation to fluctuating rates of positive samples. Overall, we spared 76% of the PCR reactions compared with individual testing, with an acceptable reduction in sensitivity. To our knowledge, this is the most extensive analysis to date addressing the key considerations of efficiency, sensitivity and feasibility in routine, large-scale implementation of sample pooling for SARS-CoV-2 detection.
Between mid-April and mid-September of 2020, we tested 133,816 samples in pools. One challenge to the pooling scheme stemmed from the fluctuating rates of infection during the pandemic.
The infection prevalence rate among pooled samples changed considerably (even though the vast majority were obtained from asymptomatic individuals; Figure 1A), mandating a dynamic adaptation of the pooling scheme. In principle, at low prevalence, using fewer and larger pools leads to a gain in efficiency, since the majority of pools test negative. As prevalence increases, however, using a larger number of smaller pools becomes more efficient, as each positive sample triggers retesting of fewer samples (Supplementary Figure 1A). Thus, when the prevalence rate in pooled samples increased (from ~1% to ~6%), we switched from 8-sample pooling to 5-sample pooling, and employed a dynamic approach thereafter (alternating the pool size between 8 and 5) to maintain optimal pooling efficiency (Methods, Supplementary Figure 1A,B). In total, we tested 14,697 8-sample pools and 3,248 5-sample pools, of which 9.35% and 22.1%, respectively, tested positive.
[medRxiv preprint, this version posted October 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.]
A dominant consideration in planning and evaluating the pooling approach is efficiency, defined as the expected number of samples tested per RT-PCR reaction. In theory, efficiency depends mostly on the pool size and the prevalence rate (see Methods, Supp Figure 1). We calculated our empirical efficiency (the total number of tested samples divided by the total number of RT-PCR reactions actually performed) and found it to be 4.587 and 2.377 for the 8-sample and 5-sample pools, respectively. Strikingly, these values are better than the expected optimal efficiency values for both pool sizes under the observed prevalence rates of 1.7% and 5.7%, respectively (Table 1).
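The relationship between pool size, prevalence and efficiency can be illustrated with a minimal Python sketch. It assumes the textbook two-stage Dorfman model (independent positives, a perfectly accurate test, every positive pool retested in full); the Methods section is not part of this text, so this may differ in detail from the authors' calculation. The function names and the candidate pool-size range are illustrative choices, not taken from the paper.

```python
def dorfman_efficiency(pool_size: int, prevalence: float) -> float:
    """Expected number of samples resolved per reaction under Dorfman pooling.

    Assumptions (not from the paper): positives are independent, the test is
    perfect, and every sample in a positive pool is retested individually.
    """
    # Probability that a pool contains at least one positive sample.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # One pooled reaction shared by pool_size samples, plus one individual
    # reaction per sample whenever the pool is positive.
    reactions_per_sample = 1.0 / pool_size + p_pool_positive
    return 1.0 / reactions_per_sample


def optimal_pool_size(prevalence: float, candidates=range(2, 33)) -> int:
    """Pool size with the highest modelled efficiency among the candidates."""
    return max(candidates, key=lambda n: dorfman_efficiency(n, prevalence))


def empirical_efficiency(n_samples: int, n_reactions: int) -> float:
    """Samples tested divided by reactions actually performed."""
    return n_samples / n_reactions


if __name__ == "__main__":
    for n, p in [(8, 0.017), (5, 0.057)]:
        print(f"n={n}, p={p:.1%}: modelled efficiency {dorfman_efficiency(n, p):.3f}")
    for p in (0.01, 0.02, 0.05):
        print(f"p={p:.0%}: modelled optimal pool size {optimal_pool_size(p)}")
    # Totals reported in the text: 133,816 samples, 32,466 reactions.
    print(f"overall empirical efficiency {empirical_efficiency(133_816, 32_466):.3f}")
```

Under these assumptions the model gives roughly 3.95 for 8-sample pools at 1.7% prevalence and roughly 2.20 for 5-sample pools at 5.7%, both below the empirical values of 4.587 and 2.377 reported above. The independence assumption is the one that real-life batches of samples are most likely to violate.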
As discussed below, the reason for this supra-optimal efficiency is the non-random distribution of positive samples among pools in our real-life setting. As the prevalence of infection changes, so does the pooling efficiency. Indeed, we observed fluctuations in efficiency over time, with the empirical efficiency at times higher and at times lower than the theoretical efficiency (Supp Figure 1B). Nevertheless, across time and pool sizes we performed better than expected. Overall, we tested 133,816 samples using 32,466 RT-PCR tests, a global efficiency of 4.121, saving 101,350 (76%) reactions.
A major concern regarding sample pooling is the expected loss of sensitivity upon sample dilution. We evaluated the sensitivity of our large-scale 8-sample pooling approach by comparing the Ct value of each positive pool with the Ct values of the individually tested positive samples within that pool. Theoretically, an 8-sample pool with a single positive sample contains only one-eighth of that sample's viral load, which requires 3 additional PCR cycles (log2 of the dilution factor, log2 8 = 3) for detection (Methods). Since our PCR assay has a practical limit of sensitivity at 40 cycles, we expect pooled tests to detect samples with individual Ct values up to 37. Individual samples with [...] (Figure 2C). This can be caused either by an across-the-board increase in prevalence or by clusters of positive samples that are tested in the same pool.
We have employed and monitored large-scale, adaptive 8-sample and 5-sample pooling of nasopharyngeal sample lysates for detection of SARS-CoV-2 over a 5-month period.
Analysis of 133,816 pooled samples revealed a high empirical efficiency of sample pooling, outweighing a minor, clinically insignificant loss of sensitivity. We were able to spare 76% of RNA extraction and RT-PCR tests, even with a frequently changing prevalence rate (<1% to 6%).
[This preprint is made available under a CC-BY-NC-ND 4.0 International license.]
Adaptive pooling approaches can maximize resource saving under a fluctuating prevalence rate. The fraction of positive samples tested in pools (p) can vary over time (Figure 1A) due to multiple factors affecting the epidemic kinetics, including changes in public health mitigation measures (e.g., social distancing regulations, travel restrictions, lockdowns, school closures) (19). As a result, the pool size (n) required to achieve optimal efficiency shifts. For example, the optimal pool size for p = 0.02 (2%) is n = 8, but as p rises to 0.05 (5%), the optimal pool size shrinks to 5 (1) (Supp Figure 1B). Consequently, to improve efficiency, we adopted a dynamic strategy, alternating between pool sizes of 8 and 5 based on the positive rate observed over the preceding few days, as well as on epidemiological information about the source of the samples (e.g., switching to pools of 5 when receiving samples from a source suspected of having a higher probability of infection) (Figure 1A, Methods). Strikingly, we observed supra-optimal empirical pooling efficiency, exceeding the predicted efficiency, which could not be explained by the dynamic switching of pool sizes alone (see below).
When considering the clinical implementation of group testing, loss of sensitivity is a major concern.
The dilution of samples due to pooling may cause samples with a low viral load (manifested as a high Ct in individual testing) to go undetected. Our empirical results show the loss of sensitivity expected from sample dilution. Given the high sensitivity of current SARS-CoV-2 RT-PCR assays, and the accumulated evidence that low viral RNA levels (high Ct) carry a negligible risk of infectiousness, we believe that the loss of 3 Cts is a minor and clinically acceptable tradeoff, as recently suggested (20). Interestingly, our pooling scheme did uncover many positive samples with high Ct values (>37) that would be expected to be missed in pools, a real-life performance that exceeds theoretical expectations, similar to the trend observed for efficiency.
We propose that the better-than-expected performance of pooling in both efficiency and sensitivity is rooted in a single factor: the non-random distribution of positive samples in pools. In theory, increased prevalence results in decreased efficiency, because most models assume that samples arrive at the diagnostic lab at random. In reality, samples arrive in batches: from colleges, nursing homes, or healthcare personnel. We sort samples into pools as they arrive at the lab, such that family members and roommates are often placed in the same pool (Figure 2). A single strongly positive sample is sufficient to bring the viral load of the pool above the detection threshold. If the same pool contains additional low-viral-load samples that would otherwise have been missed upon dilution, these "benefit" from the higher-viral-load samples co-existing in the pool and are discovered when the pool is opened for individual testing.
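The dilution argument, and the way a weak positive can ride along with a strong one, can be made concrete with a small Python sketch. It is an idealized model rather than the authors' analysis: it assumes perfect doubling of template per PCR cycle, equal volumes from every sample, and no contribution from negative samples. The 40-cycle limit is the practical limit quoted in the text; the function names and example Ct values are hypothetical.

```python
import math
from typing import Optional, Sequence

CT_LIMIT = 40.0  # practical limit of sensitivity quoted in the text


def pooled_ct(positive_cts: Sequence[float], pool_size: int) -> Optional[float]:
    """Idealized Ct of a pool given the individual Cts of its positive members.

    Assumes template doubles every cycle, so relative viral load is 2**(-Ct),
    and that each sample contributes 1/pool_size of the pooled volume.
    Returns None when the pool contains no positive sample.
    """
    if not positive_cts:
        return None
    pooled_load = sum(2.0 ** -ct for ct in positive_cts) / pool_size
    return -math.log2(pooled_load)


def pool_detected(positive_cts: Sequence[float], pool_size: int,
                  ct_limit: float = CT_LIMIT) -> bool:
    """True if the modelled pool Ct falls within the assay's cycle limit."""
    ct = pooled_ct(positive_cts, pool_size)
    return ct is not None and ct <= ct_limit


if __name__ == "__main__":
    # A single positive in an 8-sample pool shifts by log2(8) = 3 cycles.
    print(pooled_ct([30.0], 8))
    # A lone weak positive (Ct 38) is pushed past the 40-cycle limit...
    print(pool_detected([38.0], 8))
    # ...but the same sample pooled with a strong positive (Ct 25) is found
    # once the pool is opened for individual retesting.
    print(pool_detected([38.0, 25.0], 8))
```

In this model the pool Ct is dominated by the strongest positive sample, so clustering of related positives in one pool raises the chance that weak positives are retested individually.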
Thus, non-random pool assignment, as well as an increased prevalence rate (which by itself increases the likelihood of having pools with multiple positive samples), contributes to the increased sensitivity. Non-random pool assignment and the adaptive pool-size approach together explain our better-than-expected efficiency.
One practical implication of our findings is the importance of using pre-existing knowledge about incoming samples. Using such information for informed co-assignment of samples suspected to be positive or negative can exceed the theoretical performance of pooling, which is typically calculated under the assumption of random assignment. We have encountered considerable logistic hurdles in obtaining a pre-test probability for each swab sample, but argue that success in such efforts could make pooling work extremely efficiently even in settings of very high prevalence.
Finally, a common concern with regard to pooling is the ease and simplicity of implementation. While some methods may be theoretically more efficient, they need to be manageable at large scale in a diagnostic lab. We have developed a pipeline that consists of guidelines on which samples to pool, hardware to pool the samples (liquid handlers), and software to pool and track the samples for the second stage of examining individual samples within a positive pool. Details of this process, including a video demonstrating the entire procedure, can be found in Supplementary Note 1.
The long-term containment of COVID-19 will likely involve early identification of outbreaks on the background of very low prevalence in the population.
Our empirical evidence from testing over 130,000 samples in pools strongly supports the feasibility and benefits of carefully conducted pooling for surveillance, control, and community re-openings.
Successful implementation of large-scale adaptive pooling for SARS-CoV-2 tests requires an appropriate IT infrastructure and automation of information flow.
In this supplementary note we describe the protocol and tools designed and used by the Hadassah Hebrew University COVID-19 diagnosis team. We address the main challenges and the solutions we used to overcome them. Our standard operating procedure (SOP) steps are stated and illustrated below, and a video demonstrating the complete pooling procedure can be found here.
3. Open and load the IS and the empty tubes onto a Liquid Handling (LiHa) robot (we use a Tecan Freedom Evo 100). Execute the Pool Protocol: the first 8 IS are pooled into the 1st PS, the next 8 IS into the 2nd PS, etc. (50 µl from each, to a total of 400 µl). Alternative, faster protocols are available, depending on the specifications of the LiHa robot and the number of IS.
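The sequential layout in the Pool Protocol step can be sketched in Python as follows. This is only an illustration of the bookkeeping, not the team's software: it reads IS and PS as individual-sample and pooled-sample tubes (the steps that define these abbreviations are missing from this text), and the class name, function names and the handling of a final partial pool are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PoolPlan:
    pool_index: int                   # 1-based position of the pooled tube (PS)
    sample_positions: Tuple[int, ...] # 1-based positions of the pooled IS
    volume_per_sample_ul: float       # volume transferred from each IS

    @property
    def total_volume_ul(self) -> float:
        return self.volume_per_sample_ul * len(self.sample_positions)


def plan_pools(n_samples: int, pool_size: int = 8,
               volume_per_sample_ul: float = 50.0) -> List[PoolPlan]:
    """Assign consecutive samples to consecutive pools, as in the SOP step.

    Assumption: a final group smaller than pool_size is still pooled; the
    text does not say how incomplete pools are handled.
    """
    plans: List[PoolPlan] = []
    for start in range(0, n_samples, pool_size):
        end = min(start + pool_size, n_samples)
        positions = tuple(range(start + 1, end + 1))
        plans.append(PoolPlan(len(plans) + 1, positions, volume_per_sample_ul))
    return plans


def samples_to_retest(plans: Iterable[PoolPlan],
                      positive_pools: Iterable[int]) -> List[int]:
    """Positions of individual samples belonging to pools that tested positive."""
    positive = set(positive_pools)
    return [pos for plan in plans if plan.pool_index in positive
            for pos in plan.sample_positions]


if __name__ == "__main__":
    plans = plan_pools(20)
    for plan in plans:
        print(plan.pool_index, plan.sample_positions, plan.total_volume_ul)
    print(samples_to_retest(plans, [2]))
```

Keeping the sample-to-pool mapping explicit is what allows the second stage to pull exactly the individual tubes that belong to a positive pool.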
The Detection of Defective Members of Large Populations, The Annals of
Efficient high-throughput SARS-CoV-2 testing to detect asymptomatic carriers
A Non-Adaptive Combinatorial Group Testing Strategy to Facilitate Healthcare Worker Screening During the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) Outbreak, medRxiv
Tapestry: A Single-Round Smart Pooling Technique for COVID-19 Testing
A strategy for finding people infected with SARS-CoV-2: optimizing pooled testing at low prevalence, arXiv
Evaluating the efficiency of specimen pooling for PCR-based detection of COVID-19
Assessment of Specimen Pooling to Conserve SARS CoV-2 Testing Resources
Pooling of SARS-CoV-2 samples to increase molecular testing throughput
Pooling of nasopharyngeal swab specimens for SARS-CoV-2 detection by RT-PCR
Pooled RNA sample reverse transcriptase real time PCR assay for SARS CoV-2 infection: A reliable, faster and economical method
Pooling of samples for testing for SARS-CoV-2 in asymptomatic people
Novel multiple swab method enables high efficiency in SARS-CoV-2 screenings without loss of sensitivity for screening of a complete population
Others, Swab pooling for large-scale RT-qPCR screening of SARS-CoV-2, medRxiv
Hebrew University-Hadassah COVID-19 Diagnosis Team, Large-scale implementation of pooled RNA extraction and RT-PCR for SARS-CoV-2 detection
Pooled Testing for SARS-CoV-2 in Hospitalized Patients
Sample Pooling as a Strategy to Detect Community Transmission of SARS-CoV-2
COVID-19) Update: FDA Issues First Emergency Authorization for Sample Pooling in Diagnostic Testing
Full genome viral sequences inform patterns of SARS-CoV-2 spread into and within Israel
Rethinking Covid-19 Test Sensitivity - A Strategy for Containment
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
A quantification of human cells using an ERV-3 real time PCR assay