key: cord-0968469-pid58g2r authors: Ben-Ami, Roni; Klochendler, Agnes; Seidel, Matan; Sido, Tal; Gurel-Gurevich, Ori; Yassour, Moran; Meshorer, Eran; Benedek, Gil; Fogel, Irit; Oiknine-Djian, Esther; Gertler, Asaf; Rotstein, Zeev; Lavi, Bruno; Dor, Yuval; Wolf, Dana G.; Salton, Maayan; Drier, Yotam title: Large-scale implementation of pooled RNA extraction and RT-PCR for SARS-CoV-2 detection date: 2020-06-23 journal: Clin Microbiol Infect DOI: 10.1016/j.cmi.2020.06.009 sha: 558f2ccfc8670c19d0fba3e4343a10bd7e566ec6 doc_id: 968469 cord_uid: pid58g2r OBJECTIVES: Testing for active SARS-CoV-2 infection is a fundamental tool in the public health measures taken to control the COVID-19 pandemic. Due to the overwhelming use of SARS-CoV-2 RT-PCR tests worldwide, availability of test kits has become a major bottleneck, while the need to increase testing throughput only rises. We aim to overcome these challenges by pooling samples together, performing RNA extraction and RT-PCR in pools. METHODS: We tested the efficiency and sensitivity of pooling strategies for RNA extraction and RT-PCR detection of SARS-CoV-2. We tested 184 samples both individually and in pools to estimate the effects of pooling. We further implemented Dorfman pooling with a pool size of 8 samples in large-scale clinical tests. RESULTS: We demonstrated pooling strategies that increase testing throughput while maintaining high sensitivity. A comparison of 184 samples tested individually and in pools of 8 samples, showed that test results were not significantly affected. Implementing the 8-sample Dorfman pooling to test 26,576 samples from asymptomatic individuals, we identified 31 (0.12%) SARS-CoV-2 positive samples, achieving a 7.3-fold increase in throughput. CONCLUSIONS: Pooling approaches for SARS-CoV-2 testing allow a drastic increase in throughput while maintaining clinical sensitivity. We report the successful large-scale pooled screening of asymptomatic populations. 
An emerging novel severe acute respiratory syndrome-related coronavirus, SARS-CoV-2, is the virus behind the global COVID-19 pandemic. Among the foremost priorities to facilitate efficient public health interventions is a reliable and accessible diagnosis of an active SARS-CoV-2 infection. The standard laboratory diagnosis of COVID-19 involves three main steps, namely, viral inactivation and lysis of the nasopharyngeal swab sample, extraction (or purification) of viral RNA, and reverse transcription (RT)-PCR. Due to the rapid spread of the virus and the increasing demand for tests, the limited availability of test reagents, mainly RNA extraction kits, has become (and will likely continue to be) a major bottleneck as the pandemic expands [1, 2]. Of particular importance is the ability to survey large asymptomatic populations: (1) to trace asymptomatic COVID-19 carriers, who are otherwise difficult to identify and isolate; (2) to ensure that key personnel (e.g. healthcare personnel) are not contagious; (3) to screen high-risk populations (such as those in nursing homes) to help protect them; (4) to accurately estimate the spread of infection and the effectiveness of community measures and social distancing; and (5) to allow and monitor a safe return to work. Efficient and higher-throughput diagnostic approaches are needed to support such efforts. While some of these applications (e.g. (4)) may be achieved with less sensitive detection approaches, most applications do require adhering to the current high standards of RT-PCR. Several attempts to address this challenge were recently reported, and can be categorized into three major approaches. The first approach is to replace PCR-based methods with other direct diagnostic methods such as loop-mediated isothermal amplification (LAMP) [3] [4] [5] [6] and CRISPR-based diagnostic tools [7] [8] [9].
The second approach involves serological surveys [10] [11] [12] [13], and the third approach increases the capacity of PCR methods, either by optimization and automation [1, 14, 15] or by reducing the required number of tests via pooling samples together, known as group testing. Group testing is a field of research at the intersection of mathematics, computer science and information theory, with applications in biology, communication and more. A group testing algorithm is a testing scheme that aims to minimize the number of tests conducted on a set of samples by testing pooled subsets of samples. If a pool of n samples tests negative, all samples in it must be negative, and therefore their status has been determined in only one test instead of n individual tests. Various group testing algorithms exist, with different assumptions and constraints [16, 17]. While many such algorithms, most notably binary splitting, may be very efficient in theory, they might be unsuitable because of practical limitations. Some key constraints are (1) a limit on the number of stages, due to the importance of delivering a test result quickly, exemplified by the urgent clinical context of COVID-19 diagnosis; (2) a limit on the ability to dilute samples and still safely identify a single positive sample in a pool; and (3) a preference for simple algorithms, which may minimize human error in a laboratory setting. While several pooling approaches for SARS-CoV-2 detection were recently suggested [2, [18] [19] [20] [21] [22]], these studies mostly discussed theoretical considerations. Here we describe and demonstrate practical pooling solutions that save time and reagents by performing RNA extraction and RT-PCR on pooled samples. We offer two such pooling approaches, based either on simple (Dorfman) pooling or matrix pooling [23, 24], and demonstrate their efficiency and sensitivity in the daily reality of COVID-19.
At the Hadassah Medical Center (HMC), two distinct populations are tested for SARS-CoV-2 at present. First, we receive samples from symptomatic patients, from the hospital and from the community. In these samples, about 10% of SARS-CoV-2 tests are positive. Second, we receive samples from prospectively screened asymptomatic populations such as hospital employees and workers in essential industries. In accordance with the Israeli Ministry of Health guidelines, all samples were collected using a single swab for combined deep nasal and oropharyngeal collection from the same patient. Nasopharyngeal swab samples were collected in 2 ml Viral Transport Medium (VTM) or collected directly into 2 ml Zymo lysis buffer. The first pooling strategy is a simple two-stage testing algorithm known as Dorfman pooling [25]. In the first stage, the samples are divided into disjoint pools of n samples each, and each such pool is tested. A negative result implies that all samples in the pool are negative, while a positive result implies that at least one sample in the pool is positive. In the second stage, the samples of each pool that tested positive are individually tested. To reduce the need to retest positive pools, we also tested a two-stage matrix pooling strategy [23, 24], where $n^2$ samples are ordered in an $n \times n$ matrix. Each row and each column are pooled, resulting in 2n tests, QIAsymphony DSP Virus/Pathogen kit on a QIAsymphony platform. We pooled equal volumes of sample lysate to a final volume of 400 µL. Positive pools were validated by individual tests as described above. Both Qiagen kits were used with Zymo lysis buffers, and therefore we skipped the lysis and Proteinase K step. RNA was eluted into 60 µL; 10 µL of RNA was used for a 30 µL reaction using the Real-Time Fluorescent RT-PCR kit (BGI).
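The two-stage Dorfman procedure described above can be sketched in a few lines of Python. This is an illustrative simulation only, not the laboratory workflow: the function name `dorfman_pooling` is hypothetical, and the sketch assumes an idealized, error-free test in which a pool is positive exactly when at least one of its members is positive.

```python
from typing import List, Tuple


def dorfman_pooling(samples: List[bool], n: int = 8) -> Tuple[List[bool], int]:
    """Run idealized two-stage Dorfman pooling.

    samples: ground-truth status per sample (True = positive).
    n: pool size.
    Returns the inferred status of every sample and the number of tests used.
    Assumption: the test is error-free, so a pool is positive
    if and only if at least one of its members is positive.
    """
    inferred = [False] * len(samples)
    tests = 0
    for start in range(0, len(samples), n):
        pool = samples[start:start + n]
        tests += 1  # stage 1: one test on the pooled lysate
        if any(pool):
            # stage 2: retest every member of a positive pool individually
            tests += len(pool)
            for offset, is_positive in enumerate(pool):
                inferred[start + offset] = is_positive
        # a negative pool clears all of its members with no further tests
    return inferred, tests


# Example: 16 samples, one positive, pools of 8.
example = [False] * 16
example[3] = True
status, tests_used = dorfman_pooling(example, n=8)
print(tests_used)  # 10: two pool tests plus eight individual retests
```

In this toy run, 16 samples are resolved with 10 tests; when every pool is negative, each pool of 8 costs a single test.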
To reduce the risk of contamination, the RNA extraction robots were irradiated with ultraviolet light daily, and separate rooms were set up for processing before and after PCR, with no mixing of personnel or machines between the two compartments. Note that analysis of pool results requires close attention to pools with indeterminate results, as these may contain individual positive samples. Therefore, all pools detected with Ct ≤ 39 were retested (see Table 2, batch 3), while maintaining standard criteria for the individual tests when retesting. We define the efficiency of a pooling algorithm as the total number of samples divided by the expected number of tests conducted on them. We assume all samples are independent and identically distributed, and denote by p the probability that a sample is positive (the prevalence of detectable COVID-19 in the relevant population) and by n the pool size. The efficiency of the algorithms described above depends on both p and n. The best theoretical efficiency is $(-p \log_2 p - (1-p) \log_2 (1-p))^{-1}$ [26]. The efficiency of Dorfman pooling is $(1/n + 1 - (1-p)^n)^{-1}$ [25]. We chose a pool size of n=8 samples as it allows a low false negative rate (Figure 1) and high efficiency for a wide range of COVID-19 prevalence (Table 1 and Supplementary Table S1). The prevalence of detectable COVID-19 in an asymptomatic population is estimated to be considerably below 1% [27], and indeed of the 26,576 samples tested in the present study only 0.12% were found positive. Therefore, efficiency is likely to be 5-7.5. For higher prevalence the efficiency of matrix pooling is somewhat higher (see Table 1 and supplementary note). We provide a tool (https://github.com/matanseidel/pooling_optimization) to help choose the approach and pool size based on the prevalence. These studies were part of the approved diagnosis optimization and validation procedures at the HMC, and therefore no additional Institutional Review Board approvals were required.
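As a rough numerical check of the efficiency figures quoted above, the following sketch evaluates the entropy-based bound and the standard expected-tests expression for two-stage Dorfman pooling: one pool test shared by n samples, plus one individual retest per sample whenever the pool is positive, i.e. 1/n + 1 - (1-p)^n tests per sample. The function names are hypothetical, and this is not the authors' pooling_optimization tool.

```python
import math


def dorfman_efficiency(p: float, n: int) -> float:
    """Samples per expected test for two-stage Dorfman pooling.

    Expected tests per sample: 1/n for the pool test, plus one retest
    with probability 1 - (1 - p)**n that the pool is positive.
    Assumes independent samples and an error-free test.
    """
    return 1.0 / (1.0 / n + 1.0 - (1.0 - p) ** n)


def best_theoretical_efficiency(p: float) -> float:
    """Inverse binary entropy of p, the information-theoretic bound (0 < p < 1)."""
    entropy = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return 1.0 / entropy


for prevalence in (0.0012, 0.01):
    print(prevalence,
          round(dorfman_efficiency(prevalence, 8), 2),
          round(best_theoretical_efficiency(prevalence), 1))
```

With n=8, the Dorfman expression gives roughly 7.4 at the observed prevalence of 0.12% and roughly 4.9 at a prevalence of 1%, in line with the 5-7.5 range stated in the text.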
A key requirement of pooled RNA extraction and RT-PCR tests is to retain sufficient sensitivity. Theoretically, This approach yielded highly accurate results, with no loss of diagnostic assay sensitivity: each of the pools that contained one or more positive samples was found to be positive, and all the pools that contained only negative samples were found to be negative (Figure 1). Of the 5 pools that each contained one individual sample with an "indeterminate" result, one was found to be negative per the definitions of individual tests, but still with Ct < 39, which allows pool retesting. In addition, we tested matrix pooling (see Methods) by pooling 75 samples into three 5 x 5 matrices, and identified all positive samples accurately (Figure 2). Importantly, the positive samples were detected in both the row and the column pools at a similar cycle in all three tested matrices, suggesting that the pooling scheme is robust. Given the successful validation of both pooling strategies, and the low prevalence in the asymptomatic population, we adopted a Dorfman pooling protocol of 1:8 and employed it for the routine testing of nasopharyngeal swab samples from screened asymptomatic healthcare personnel, employees of essential industries, and residents and employees of nursing homes. In the first three batches run at the HMC (Table 2) We demonstrate in a real-life situation the usefulness of pooled sampling starting at the early lysate stage. The simplicity of the method, its similarity to currently approved procedures, and the fact that it does not require special sample handling or additional information make it easily adoptable on a large scale. This saves time, work and reagents, allowing a considerable throughput increase in clinical diagnostic labs and opening the door for efficient screening of large asymptomatic populations for the presence of SARS-CoV-2 infection.
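The row-and-column logic of matrix pooling can be illustrated with a short sketch. It assumes an idealized, error-free test and uses a hypothetical function name; it only computes which cells lie at the intersection of a positive row pool and a positive column pool. How ambiguous candidates are resolved in a second stage is an assumption here (individual confirmation), not a detail taken from the text.

```python
from typing import List, Set, Tuple


def matrix_pool_candidates(grid: List[List[bool]]) -> Tuple[Set[Tuple[int, int]], int]:
    """Return candidate positive cells and the number of pool tests (2n).

    grid: n x n ground-truth matrix (True = positive sample).
    A cell is a candidate if both its row pool and its column pool
    test positive (idealized, error-free pooled tests).
    """
    n = len(grid)
    row_positive = [any(grid[r][c] for c in range(n)) for r in range(n)]
    col_positive = [any(grid[r][c] for r in range(n)) for c in range(n)]
    candidates = {
        (r, c)
        for r in range(n)
        for c in range(n)
        if row_positive[r] and col_positive[c]
    }
    return candidates, 2 * n


# Example: a 5 x 5 matrix with a single positive sample.
grid = [[False] * 5 for _ in range(5)]
grid[1][3] = True
candidates, pool_tests = matrix_pool_candidates(grid)
print(candidates, pool_tests)  # {(1, 3)} 10
```

With a single positive in the matrix, the intersection identifies it directly from the 10 pool tests. With two positives in different rows and columns, four cells become candidates, so some follow-up testing would still be needed.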
An important consideration before implementing group testing is the expected rate of false positive and false negative results. In our experience with over 26,500 samples from asymptomatic individuals, we did not encounter any false positives in the pools (see Figure 1 and Table 2). False negatives are in principle more worrisome when testing in pools, because samples that failed at the RNA extraction step will be missed (whereas our individual testing includes amplification of a human transcript serving as an internal control for proper RNA extraction and RT-PCR of each sample). To define the magnitude of this potential problem, we examined a set of 13,781 individual tests done at our center, which were all expected to show a signal for a human gene serving as internal assay control. Amplification of the human gene failed in 52 samples (0.38%). Thus, we estimate that our current protocol of pooled sampling carries a risk of missing 0.38% of the positive samples. In a population of 1,000,000 individuals tested, of which 1,000 are positive (rate of 0.1%), this predicts that approximately 4 positive individuals will be missed when using pools. We posit that this is a tolerable situation, particularly given the potentially much higher rates of false negative results due to swab sampling and other errors upstream. Large-scale implementation of the pooling scheme should be done carefully to ensure that pre-analytical influences (e.g. inadequate sampling, transport time, temperature influences) do not lead to significant loss of signal, which may further increase the risk of false negatives. The increase in throughput applies to RNA extraction and RT-PCR, but not to viral inactivation and lysis or to reporting of results. Reporting at the HMC is automated and was adapted to the pooling scheme, and therefore does not require additional work.
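The arithmetic behind the false negative estimate can be reproduced as follows. This is a minimal sketch using only the counts reported above; the variable and function names are illustrative, and it assumes that extraction failure is independent of whether a sample is positive.

```python
def expected_missed_positives(failed_controls: int,
                              individual_tests: int,
                              population: int,
                              prevalence: float) -> float:
    """Expected positives missed because RNA extraction failed unnoticed.

    Assumes the internal-control failure rate applies equally to
    positive samples and that failures are independent of infection status.
    """
    failure_rate = failed_controls / individual_tests
    positives = population * prevalence
    return positives * failure_rate


failure_rate = 52 / 13_781
missed = expected_missed_positives(52, 13_781, 1_000_000, 0.001)
print(round(failure_rate * 100, 2))  # 0.38 (percent)
print(round(missed, 1))              # 3.8, i.e. about 4 individuals
```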
Viral inactivation and lysis typically take 30 min, while RNA extraction and RT-PCR take 4-5 hours; therefore, a 7.3-fold increase in the efficiency of the pooled steps translates to a ~4.5-fold increase in the efficiency of the entire workflow, or more if the efficiency of viral inactivation is increased by other means (e.g. automation). The increase in efficiency allowed the HMC to survey healthcare personnel and multiple nursing homes, and to identify a nursing home with 16 positive individuals, helping to stop the spread at that center. Specifically, we have demonstrated that pooling lysates from 5 or 8 nasopharyngeal swab samples retains sufficient sensitivity of viral RNA detection, allowing identification of SARS-CoV-2-positive individuals, while increasing throughput 5-fold to 7.5-fold. The prevalence of COVID-19 in the tested population is not always known, which could affect the choice of optimal pool size. This could be addressed by external estimates, such as a previous run of individual samples, the rate of symptomatic patients, or alternative methods such as serological screening or monitoring of wastewater titers [28, 29]. Alternatively, it is possible to dynamically adapt pool sizes when the measured rate of positive samples differs from the expected rate. Finally, some group testing algorithms (reviewed in [17]) estimate the number of positive samples while using a relatively small (logarithmic) number of tests, and may be adapted to clinical constraints and parameters. If samples are not independent, and we have information regarding their dependency, we can further improve efficiency by grouping together dependent samples, that is, samples that are likely to be all positive or all negative, such as members of the same family, or samples that are likely to be all negative because they have a low risk profile. This will increase the number of negative pools, and therefore decrease the overall number of tests conducted.
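The step from a 7.3-fold gain in the pooled stages to a ~4.5-fold gain for the whole workflow can be checked with a simple model. The sketch below assumes that the stages run sequentially, that only RNA extraction and RT-PCR time shrinks with pooling, and that this time scales with the number of tests; the function name is hypothetical and this is not necessarily how the authors derived their figure.

```python
def workflow_speedup(lysis_hours: float,
                     extraction_pcr_hours: float,
                     pooled_speedup: float) -> float:
    """Overall speedup when only the extraction/RT-PCR stage is accelerated.

    Assumption: stages are sequential, and lysis time is unaffected by pooling.
    """
    before = lysis_hours + extraction_pcr_hours
    after = lysis_hours + extraction_pcr_hours / pooled_speedup
    return before / after


# 30 min of lysis, 4-5 hours of extraction and RT-PCR, 7.3-fold pooled gain.
for hours in (4.0, 4.5, 5.0):
    print(hours, round(workflow_speedup(0.5, hours, 7.3), 2))
```

Under these assumptions the overall gain falls between about 4.3-fold and 4.6-fold, consistent with the ~4.5-fold figure in the text.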
Future improvements in the sensitivity of the test, such as better sets of primers and improved sample collection, will allow sensitivity to be retained even when a larger number of sample lysates are pooled together. This will enable further gains in efficiency, especially when prevalence is low, by increasing the pool size.

All authors disclose no conflicts of interest.

Agnes Klochendler and Matan Seidel contributed equally to this work. Yuval Dor, Dana Wolf, Maayan Salton and Yotam Drier contributed equally to this work.

Maayan Salton and Yotam Drier. Acquisition, analysis, or interpretation of data Writing original draft, review and editing: Yuval Dor, Dana Wolf, Maayan Salton and Yotam Drier. Supervision: Yuval Dor, Dana Wolf, Maayan Salton and Yotam Drier.

An alternative workflow for molecular detection of SARS-CoV-2 - escape from the NA extraction kit-shortage
Evaluation of COVID-19 RT-qPCR test in multi-sample pools
Rapid Molecular Detection of SARS-CoV-2 (COVID-19) Virus RNA Using Colorimetric LAMP. Infectious Diseases (except HIV/AIDS) 2020
Rapid and visual detection of 2019 novel coronavirus (SARS-CoV-2) by a reverse transcription loop-mediated isothermal amplification assay. Clinical Microbiology and Infection
SARS-CoV-2 On-the-Spot Virus Detection Directly From Patients. Public and Global Health
Era of molecular diagnosis for pathogen identification of unexplained pneumonia, lessons to be learned
An ultrasensitive, rapid, and portable coronavirus SARS-CoV-2 sequence detection method based on CRISPR-Cas12
Novel Coronavirus SARS-CoV-2 Using a CRISPR-based DETECTR Lateral Flow Assay. Infectious Diseases (except HIV/AIDS) 2020
Profiling Early Humoral Response to Diagnose Novel Coronavirus Disease (COVID-19)
Evolving status of the 2019 novel coronavirus infection: Proposal of conventional serologic assays for disease diagnosis and infection monitoring
Molecular and serological investigation of 2019-nCoV infected patients: implication of multiple shedding routes
Evaluation of nine commercial SARS-CoV-2 immunoassays
Analytical sensibility and specificity of two RT-qPCR protocols for SARS-CoV-2 detection performed in an automated workflow
High-throughput extraction of SARS-CoV-2 RNA from nasopharyngeal swabs using solid-phase reverse immobilization beads. Infectious Diseases (except HIV/AIDS) 2020
Group Testing: An Information Theory Perspective
Evaluation of Group Testing for SARS-CoV-2 RNA. Infectious Diseases (except HIV/AIDS) 2020
Efficient and Practical Sample Pooling High-Throughput PCR Diagnosis of COVID-19. Public and Global Health
Pooled-sample analysis strategies for COVID-19 mass testing: a simulation study
Efficient high throughput SARS-CoV-2 testing to detect asymptomatic carriers. Infectious Diseases (except HIV/AIDS) 2020
Group Testing against COVID-19
Rapid identification of yeast artificial chromosome clones by matrix pooling and crude lysate PCR
Theoretical analysis of library screening using a N-dimensional pooling strategy
The Detection of Defective Members of Large Populations
Group testing with prior statistics
Suppression of COVID-19 outbreak in the municipality of Vo
SARS-CoV-2 RNA in wastewater anticipated COVID-19 occurrence in a low prevalence area
Early SARS-CoV-2 outbreak detection by sewage-based epidemiology