key: cord-0687493-7vizeuha
authors: Carpenter, Rob E.; Tamrakar, Vaibhav; Chahar, Harendra; Vine, Tyler; Sharma, Rahul
title: Confirming Multiplex Q-PCR Use in COVID-19 with Next Generation Sequencing: Strategies for Epidemiological Advantage
date: 2022-02-23
journal: bioRxiv
DOI: 10.1101/2022.02.22.481485
sha: da9d4296e76de2ce5c2aaacc20d823ecd6ef19de
doc_id: 687493
cord_uid: 7vizeuha

Rapid classification and tracking of emerging SARS-CoV-2 variants are critical for understanding the transmission dynamics and developing strategies for interrupting the transmission chain. Next-Generation Sequencing (NGS) is an exceptional tool for whole-genome analysis and deciphering new mutations. The technique has been instrumental in identifying the Variants of Concern and tracking this pandemic. However, NGS remains expensive and time-consuming for large-scale monitoring of COVID-19. This study analyzed a total of 78 de-identified samples that screened positive for SARS-CoV-2 from two timeframes, August 2020 and July 2021. All 78 samples were classified into WHO lineages by whole genome sequencing and then compared with two commercially available Q-PCR assays for spike protein mutation(s). The data showed good concordance between Q-PCR and NGS analysis for specific SARS-CoV-2 lineages and characteristic mutations. Deployment of Q-PCR testing to detect known SARS-CoV-2 variants may be extremely beneficial. These assays are quick and cost-effective, and thus can be implemented as an alternative to sequencing for screening known mutations of SARS-CoV-2 for clinical and epidemiological interest. The findings support the great potential for Q-PCR to be an effective strategy offering several COVID-19 epidemiological advantages.
as Variants of Concern (VOC; Table 1

Genomic changes in the receptor binding domain (RBD), a region of the spike protein that attaches SARS-CoV-2 to the outer surface of the host cell, have given the virus an increased capacity to strike in several outbreak phases in different parts of the world. 9 More recently, South Africa

Table 1. T95I G496S (S494P*) D614G D614G Δ157 G142D Q498R N501Y R158G Δ143-145 N501Y A570D L452R Δ211 Y505H D614G T478K L212I T547K P681H D614G ins214EPE D614G P681R G339D H655Y D950N S371L N679K S373P P681H S375F N764K K417N D796Y N440K N856K G446S Q954H S477N N969K T478K L981F
*Detected in some sequences but not all.

There continues to be a need for swift and cost-effective SARS-CoV-2 variant detection and monitoring. Genomic sequencing is the gold standard and the most reliable method for detecting such changes in the viral genome. The standard Sanger sequencing method 12 is highly accurate, but it can sequence only a small fraction of the genome. Sanger sequencing is also laborious, time-consuming, and expensive for large-scale sequencing projects that require rapid turnaround times. These attributes make Sanger sequencing less attractive for SARS-CoV-2 variant identification and monitoring. Targeted Next-Generation Sequencing (NGS) is also a reliable method to identify variant strains of pathogens, including viruses. 13 The principal advantage of NGS over other techniques such as Sanger sequencing or RT-PCR is that scientists and laboratorians do not require prior knowledge of existing nucleotide sequences. Moreover, NGS has higher discovery power and higher throughput. 13 In the current pandemic, NGS has been widely employed to detect and identify novel mutated viral variants of SARS-CoV-2. 14

CoV-2 lineage assignment using Q-PCR). The 78

However, the raw data from the 20 samples sequenced at Advanta Analytical Laboratories was analyzed for phylogenetic relationship and mutation discovery (Table 2).
These data revealed novel mutations belonging to existing prominent lineages along with convergent mutations of different lineages and one unique mutation (1).

We then turned our focus to testing the 67 Delta (B. Moreover, 5 of 67 samples were negative for T478K (GT Molecular), and 12 of 67 were negative for the P681R-specific PCR (Thermo Fisher) using Q-PCR (Table 3). Unfortunately, we could not verify the absence of these mutations because NGS data was not available for the 64 samples sequenced at Fulgent Genetics. Thus, the L452R mutation remained the most informative marker for Q-PCR-based detection of the Delta variant. All 11 samples sequenced as non-Delta variants were negative for all three Delta variant-specific mutations (Table 3). Interestingly, a mutation that classifies the Beta and Gamma variants (E484K) was identified, by both Q-PCR assays and by NGS, in one sample that was otherwise classified as a Delta variant by NGS and carries an L452R mutation. This mutation combination is suggestive of the continuous evolution of the coronavirus genome.

Of note, the 4 samples with lower viral amplification (3530) that were included in this study could be characterized by NGS and by both Q-PCR assays. Two of the four samples were identified as Delta (B.1.617.2) variants, and the remaining two were identified as non-VOC. Therefore, NGS and Q-PCR methodologies can potentially be used for SARS-CoV-2 variant detection from samples with lower viral amplification (1000-10 copies).

The novelty of this research is that it demonstrated that Q-PCR is as effective as NGS in detecting SARS-CoV-2 mutations. Two Q-PCR-based assays for the detection of SARS-CoV-2 variants were tested and compared with NGS data. Both assays were able to detect the L452R mutation, with 100% (67/67; GT Molecular) and 94% (63/67; Thermo Fisher) accuracy when compared to NGS.
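As a worked check of the figures above, the per-marker detection rates among the 67 samples that NGS classified as Delta can be recomputed from the reported counts. This is a minimal sketch, not the authors' analysis: the function and variable names are illustrative, and the T478K and P681R counts are derived here as 67 minus the reported number of negative samples.

```python
def detection_rate(detected: int, total: int) -> float:
    """Percentage of NGS-classified samples in which the Q-PCR assay detected the marker."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= detected <= total:
        raise ValueError("detected must lie between 0 and total")
    return 100.0 * detected / total


# (marker, vendor) -> (samples detected by Q-PCR, NGS-classified Delta samples).
# Counts are taken from the text; the T478K and P681R entries are computed as
# 67 minus the reported negatives (an assumption of this sketch).
delta_results = {
    ("L452R", "GT Molecular"): (67, 67),
    ("L452R", "Thermo Fisher"): (63, 67),
    ("T478K", "GT Molecular"): (67 - 5, 67),
    ("P681R", "Thermo Fisher"): (67 - 12, 67),
}

rates = {key: detection_rate(*counts) for key, counts in delta_results.items()}

for (marker, vendor), rate in rates.items():
    print(f"{marker} ({vendor}): {rate:.1f}%")
```

The L452R rows reproduce the 100% and 94% values stated in the text. The T478K and P681R rows are detection rates only, since the absence of those mutations could not be verified against NGS data.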
While NGS is an essential tool for sequencing the entire genome and identifying new mutations, this study suggests Q-PCR can aptly serve as an easy-to-deploy, cost-effective, and time-sensitive solution for the detection of known mutations in mass surveillance. Likewise, this approach has previously been applied for surveillance of leprosy and identification of zoonotic transmission in the United States. 17

The FASTQ sequence file was analyzed and visualized for evolutionary relationships through the open-source toolkit Nextstrain (https://clades.nextstrain.org/). The GISAID database for global SARS-CoV-2 sequence analysis, available from the Nextstrain server, was used to retrieve representative variant sequences. 1 The NCBI databank was used to retrieve the original Wuhan strain SARS-CoV-2 sequence. All the individual consensus genome sequence files were aligned using the Clustal-W multiple sequence alignment tool. 2 The phylogenetic analysis was carried out using the Clustal Omega server, and the phylogenetic tree was constructed using the MEGA X tool 3 with default parameters of the maximum likelihood method. Further analysis investigated the conservation of the spike protein in reference sequences versus clinical strains of SARS-CoV-2 from our study using bioinformatics tools. The protein sequences for the different ORFs were determined by annotation with the IBM Functional Genomics Platform. 4 T-COFFEE and PRALINE software 5,6 were used for the alignment of spike proteins from different isolates and for mutation position analysis.

Commercially available assays from two vendors (GT Molecular [Colorado, USA] and Thermo Fisher Scientific [Massachusetts, USA]) were evaluated for detection of known variants, and results were compared to the NGS-based variant detection of the same samples (Table 3). Assays were provided in two different kits containing the variant-specific reference standard and the mutation-specific primer-probe.
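The mutation position analysis on aligned spike proteins described earlier in this section can be illustrated with a short sketch. It assumes two protein sequences that have already been aligned to equal length with "-" as the gap character, and it numbers positions on the reference. The function name and the toy sequences are hypothetical; this is not the T-COFFEE or PRALINE workflow used in the study, and the toy strings are not real spike sequence.

```python
from typing import List


def call_mutations(ref_aligned: str, sample_aligned: str) -> List[str]:
    """List substitutions (e.g. 'L452R') and deletions ('del157') relative to the reference.

    Both inputs must come from the same alignment, so they have equal length.
    Insertions relative to the reference are skipped and do not shift numbering.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have equal length")

    calls = []
    ref_pos = 0  # 1-based position on the ungapped reference
    for ref_aa, sample_aa in zip(ref_aligned, sample_aligned):
        if ref_aa == "-":
            continue  # insertion in the sample; the reference position does not advance
        ref_pos += 1
        if sample_aa == "-":
            calls.append(f"del{ref_pos}")  # the paper's Table 1 writes this as Δ<pos>
        elif sample_aa != ref_aa:
            calls.append(f"{ref_aa}{ref_pos}{sample_aa}")
    return calls


# Toy example with made-up five-residue sequences (not real spike protein).
print(call_mutations("MKLRT", "MKRRT"))
print(call_mutations("MKLRT", "MK-RT"))
```

Numbering on the reference rather than on alignment columns keeps the reported positions comparable with the conventional names of spike mutations, such as those listed in Table 1.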
Amplifications were performed according to the manufacturer's instructions in four separate master mix preparations as described in Table 3. Briefly, RNA was reverse transcribed for 10 minutes at 53°C, followed by enzyme activation for 2 minutes at 95°C and 40 cycles of 15 seconds at 95°C for denaturation and 60 seconds at 52°C for annealing/extension. Reactions were performed by using qScript 1-Step Virus ToughMix

Genomic study of COVID-19 corona virus excludes its origin from recombination or characterized biological sources and suggests a role for HERVS in its wide range symptoms
The origin of COVID-19 and why it matters
CSSEGISandData. COVID-19 data repository by the center for systems science and engineering (CSSE) at Johns Hopkins University. GitHub. Accessed
Diagnostics for SARS-CoV-2 infections
The coronavirus E protein: assembly and beyond
Current status of laboratory diagnosis for COVID-19: a narrative review
Features, evaluation, and treatment of coronavirus (COVID-19)
A critical analysis of the impacts of COVID-19 on the global economy and ecosystems and opportunities for circular economy strategies
Classification of Omicron (B.1.1.529): SARS-CoV-2 variant of concern. World Health Organization
SARS-CoV-2 variants of concern
DNA sequencing with chain-terminating inhibitors
Targeted next-generation sequencing can replace Sanger sequencing in clinical diagnostics
Evolution and genetic diversity of SARS-CoV-2 in Africa using whole genome sequences
Guidelines for diagnostic next-generation sequencing
Transmission dynamics of an outbreak of the COVID-19 Delta variant B.1.617.2-Guangdong Province, China
Probable zoonotic leprosy in the southern United States
Zoonotic leprosy in the Southeastern United States
Presence of mismatches between diagnostic PCR assays and coronavirus SARS-CoV-2 genome
Mutations in animal SARS-CoV-2 induce mismatches with the diagnostic PCR assays
Missed detections of influenza A(H1)pdm09 by real-time RT-PCR assay due to haemagglutinin sequence mutation

Appendix A1 Primer and Probe design

N1 Gene:
2019-nCoV_N1-Forward Primer: GACCCCAAAATCAGCGAAAT
2019-nCoV_N1-Reverse Primer: TCTGGTTACTGCCAGTTGAATCTG
2019-nCoV_N1-Probe: FAM-ACCCCGCATTACGTTTGGTGGACC-BHQ1

N2 Gene:
2019-nCoV_N2-Forward Primer: TTACAAACATTGGCCGCAAA
2019-nCoV_N2-Reverse Primer: GCGCGACATTCCGAAGAA
2019-nCoV_N2-Probe: FAM-ACAATTTGCCCCCAGCGCTTCAG-BHQ1

Nextstrain: real-time tracking of pathogen evolution
Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega
MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets
Semi-supervised pipeline for autonomous annotation of SARS-CoV-2 genomes
fr: robust phylogenetic analysis for the non-specialist
PRALINE: a versatile multiple sequence alignment toolkit

Expanded Description of RNA Extraction, Library Preparation and Sequencing, NGS Data Analysis, Phylogenic Analysis, and COVID-19 Lineage Assignment Using Q-PCR

Each sample was extracted from nasopharyngeal or oropharyngeal swabs collected and transported to the lab in MANTACC Transport Medium or Viral Transport Medium (VTM) purchased from Criterion Clinical (https://criterionclinical.com/).
RNA extraction was carried out in a preamplification environment within a Biosafety Level 2 (BSL-2) facility. RNA isolation was performed as part of routine diagnostic testing using the Roche MagNA Pure 96 System and Viral NA Small Volume Kits. Briefly, samples were lysed with 340 µL of lysis buffer and 10 µL of proteinase K at 55°C for 10 minutes, followed by extraction on the Roche MagNA Pure 96 instrument. Extracted nucleic acids were immediately sealed with a PCR clean sealing film (Cat # T329-1 Simport Scientific Inc. QC J3G 4S5 Canada) and frozen at -80°C until sequencing.

The libraries were prepared using the Illumina COVIDSeq protocol (Illumina Inc, USA). Total RNA was primed with random hexamers, and first-strand cDNA was synthesized using reverse transcriptase. The SARS-CoV-2 genome was amplified using two sets of primers (COVIDSeq Primer Pool-1 & 2) in two multiplex PCR protocols to produce two sets of amplicons spanning the entire SARS-CoV-2 genome. The PCR-amplified product was then processed for tagmentation and adapter ligation using 24 IDT for Illumina Nextera UD Indexes Set A. Further enrichment and cleanup were performed as per the protocols provided by the manufacturer (Illumina Inc, USA). A