key: cord-0951153-lpoelp91 authors: Artesi, Maria; Bontems, Sébastien; Göbbels, Paul; Franckh, Marc; Maes, Piet; Boreux, Raphaël; Meex, Cécile; Melin, Pierrette; Hayette, Marie-Pierre; Bours, Vincent; Durkin, Keith title: A Recurrent Mutation at Position 26340 of SARS-CoV-2 Is Associated with Failure of the E Gene Quantitative Reverse Transcription-PCR Utilized in a Commercial Dual-Target Diagnostic Assay date: 2020-09-22 journal: J Clin Microbiol DOI: 10.1128/jcm.01598-20 sha: bc6cf7e6529af2516345930df7b00122fd8ef8d6 doc_id: 951153 cord_uid: lpoelp91 Control of the ongoing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic requires accurate laboratory testing to identify infected individuals while also clearing essential staff to continue to work. At the current time, a number of quantitative real-time PCR (qRT-PCR) assays have been developed to identify SARS-CoV-2, targeting multiple positions in the viral genome. While the mutation rate of SARS-CoV-2 is moderate, given the large number of transmission chains, it is prudent to monitor circulating viruses for variants that might compromise these assays. Here, we report the identification of a C-to-U transition at position 26340 of the SARS-CoV-2 genome that is associated with failure of the cobas SARS-CoV-2 E gene qRT-PCR in eight patients. As the cobas SARS-CoV-2 assay targets two positions in the genome, the individuals carrying this variant were still called SARS-CoV-2 positive. Whole-genome sequencing of SARS-CoV-2 showed all to carry closely related viruses. Examination of viral genomes deposited on GISAID showed this mutation has arisen independently at least four times. This work highlights the necessity of monitoring SARS-CoV-2 for the emergence of single-nucleotide polymorphisms that might adversely affect RT-PCRs used in diagnostics. Additionally, it argues that two regions in SARS-CoV-2 should be targeted to avoid false negatives. 
of the SARS-CoV-2 genome (10), with sharing of the resultant data (11) and phylogenetic analysis (12, 13). Laboratory testing for SARS-CoV-2 is a cornerstone of the strategy to mitigate its spread, as it facilitates the identification and isolation of infected individuals, while negative tests can allow essential personnel to continue to work (14). In the context of SARS-CoV-2, due to its high transmissibility (15), false negatives could have particularly adverse effects on efforts to control its spread. As qRT-PCR oligonucleotides rely on binding to small ~20-bp regions, mutations in these targets have the potential to impair efficient amplification or probe binding, thereby generating false negatives. In contrast to other RNA viruses, coronaviruses have a moderate mutation rate due to their ability to carry out RNA proofreading (16). Nevertheless, given the large number of ongoing transmission chains, it remains prudent to monitor the integrity of qRT-PCR assays. Here, we report the identification of a single-nucleotide polymorphism (SNP) in the E gene of SARS-CoV-2 that is associated with the failure of the qRT-PCR that targets the E gene in the cobas SARS-CoV-2 test (Roche). As this dual-target assay also detects a region in ORF1b, these samples were still correctly identified as SARS-CoV-2 positive. This observation highlights the necessity of targeting two regions in SARS-CoV-2 RT-PCR assays and shows the role sequencing can play in resolving and anticipating problems with the qRT-PCR assays in use.

RNA extraction and real-time PCR. The study was approved by the Comité d'Ethique Hospitalo-Facultaire Universitaire de Liège (reference number CE 2020/137). COVID-19 detection was routinely performed using the cobas 6800 platform (Roche). For this, 400 µl of nasopharyngeal swabs in a preservative medium (Amies or UTM) were first incubated at room temperature for 30 min with 400 µl of cobas PCR media kit (Roche) for viral inactivation.
Samples were then loaded on the cobas 6800 platform using the cobas SARS-CoV-2 assay for the detection of the ORF1ab and E genes. For qRT-PCR control and sequencing analysis, RNA was extracted from clinical samples (300 µl) on a Maxwell 48 device using the Maxwell RSC viral RNA kit (Promega) following a viral inactivation step using proteinase K according to the manufacturer's instructions. RNA was eluted in 50 µl of RNase-free water, and 5 µl was used for the RT-PCR. Reverse transcription and RT-PCR were performed on an LC480 thermocycler (Roche) based on the Corman et al. (9) protocol for the detection of RdRP and E genes using the TaqMan fast virus 1-step master mix (Thermo Fisher). Primers and probes (Eurogentec, Belgium) were used as described by the authors (9).

SARS-CoV-2 whole-genome sequencing. Reverse transcription was carried out using SuperScript IV VILO master mix, and 3.3 µl of RNA was combined with 1.2 µl of master mix and 1.5 µl of H2O. This was incubated at 25°C for 10 min, 50°C for 10 min, and 85°C for 5 min. PCRs used the primers and conditions recommended in the nCoV-2019 sequencing protocol (17). Primers from version 3 of the Artic Network were used and were synthesized by Integrated DNA Technologies. Samples were multiplexed using the Oxford Nanopore native barcoding expansion kits 1 to 12 and 13 to 24, in conjunction with a ligation sequencing kit. Sequencing was carried out on a MinION using R9.4.1 flow cells. Data analysis followed the nCoV-2019 novel coronavirus bioinformatics protocol of the Artic Network (17). The resulting consensus viral genomes have been deposited at the Global Initiative on Sharing All Influenza Data (GISAID) (11).

Sanger sequencing. Reverse transcription was carried out as described above. The primers nCoV-2019_87_LEFT and nCoV-2019_87_RIGHT from the Artic Network nCoV-2019 amplicon set (17) were used to amplify the region between positions 26198 and 26590.
The resultant PCR product was purified using Ampure XP beads (Beckman Coulter), sequenced using a BigDye Terminator cycle-sequencing kit (Applied Biosystems), and run on an ABI PRISM 3730 DNA analyzer (Applied Biosystems).

Phylogeny. SARS-CoV-2 genomes and the associated metadata were downloaded from GISAID (https://www.gisaid.org/) on 25 May 2020. Viral genomes marked by GISAID as complete (>29,000 bases) and high coverage (<1% Ns, <0.05% unique amino acid mutations, and no insertions/deletions unless verified by the submitter) were selected, leaving 20,386 viral genomes. We also downloaded viral genomes using a less stringent cutoff, requiring the virus to be complete (>29,000 bases) and excluding viruses with low coverage (>5% Ns); in this case, 29,699 viral genomes remained. Viruses carrying a variant at position 26340 were identified with SeqKit (18) using the following grep command and motif encompassing the variant (underlined): "seqkit grep -s -i -p TTACACTAGCTATCCTTACTG". The viruses containing the variant were added to the list of viruses to include in the Nextstrain build. Viruses from nonhuman hosts were excluded from the analysis. Nextstrain phylogenetic trees were generated for both data sets using the default configuration (https://github.com/nextstrain/ncov). The SARS-CoV-2 genomes were assigned to a lineage via pangolin (https://github.com/hCoV-2019/pangolin), which used the virus nomenclature proposed by Rambaut et al. (19).

Real-time PCR. The cobas system (Roche) implements a dual-target assay to detect SARS-CoV-2, with qRT-PCRs targeting both the ORF1ab region and the E gene (see Fig. S1 in the supplemental material). During the course of routine SARS-CoV-2 testing, we observed eight samples that were negative for the E gene qRT-PCR but positive for the ORF1ab qRT-PCR (Table 1).
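The relaxed genome filter and the motif search described in the Phylogeny methods can be approximated in Python roughly as follows. This is only a sketch under stated assumptions, not the pipeline used in the study: the helper names (`read_fasta`, `passes_relaxed_filter`, `carries_motif`) are hypothetical, the thresholds are taken from the text, and the search looks only at the strand as written in the FASTA record.

```python
from typing import Iterable, Iterator, List, Tuple

# Motif quoted in the text; it encompasses the variant at position 26340.
VARIANT_MOTIF = "TTACACTAGCTATCCTTACTG"


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []  # type: ignore[var-annotated]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def passes_relaxed_filter(seq: str, min_len: int = 29000,
                          max_n_frac: float = 0.05) -> bool:
    """Keep genomes longer than min_len whose N fraction is not above max_n_frac."""
    if not seq:
        return False
    n_frac = seq.upper().count("N") / len(seq)
    return len(seq) > min_len and n_frac <= max_n_frac


def carries_motif(seq: str, motif: str = VARIANT_MOTIF) -> bool:
    """Case-insensitive exact search for the motif on the given strand."""
    return motif.upper() in seq.upper()


def select_variant_genomes(lines: Iterable[str]) -> List[str]:
    """Return headers of genomes that pass the filter and carry the motif."""
    return [name for name, seq in read_fasta(lines)
            if passes_relaxed_filter(seq) and carries_motif(seq)]
```

In use, `select_variant_genomes(open(path))` would return the headers of candidate genomes, where `path` is a hypothetical local FASTA download; the hits would still need to be placed on a tree, as in the Nextstrain build described above.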
These represented 0.2% of the SARS-CoV-2 positive samples we identified between 23 March and 25 May 2020 (during this time period, 25,619 tests were carried out on the system, and 3,398 [13.3%] were positive). Four of these samples were retested using the Corman et al. (9) SARS-CoV-2 assay that targets the RdRP and E genes. In this instance, both the RdRP and E gene qRT-PCRs were positive in all four samples (Table 1). All came from Belgian health care workers in the same service, with sampling dates that ranged between 23 March and 17 April 2020 (Fig. 1A). As the samples were positive for the ORF1ab qRT-PCR, all samples were correctly classified as positive by the cobas system (Roche).

SARS-CoV-2 whole-genome sequencing. We speculated that these samples carried a common variant that interfered with the E gene qRT-PCR and carried out whole-genome sequencing of the viruses using the Artic Network protocol (17). The consensus genomes generated showed six individuals to be infected with a genetically identical virus (Fig. 1A). The remaining two viruses shared the same SNPs as the previous six but had accumulated additional mutations (suggesting continued spread of the lineage in the area). In two cases we also had a 2-week follow-up sample from the same patient; in each case, the consensus viral genomes generated were identical (Fig. S2). The six identical viruses (derived from different patients) deviated from the MN908947.3 reference isolated in Wuhan at only three positions (Fig. 1A). The first two SNPs were toward the 5′ end of the virus at positions 1,440 and 2,891, respectively. The third SNP, a C-to-U transition at position 26340, is within the E gene of the virus and was validated by Sanger sequencing in four samples (Fig. 1B). This SNP overlaps the E gene probe used in the Corman et al. (9) RT-PCR assay; however, as was mentioned above, it does not appear to affect the performance of this assay in our hands.
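To make the single-base difference concrete, the sketch below compares the variant motif quoted in the Phylogeny methods with the corresponding reference window and reports where they differ. The reference window used here is an assumption (the variant motif with the C-to-U change reverted, written as T/C in DNA notation) and should be checked against MN908947.3 before any real use; `mismatch_positions` is a hypothetical helper name.

```python
from typing import List

# Variant motif as quoted in the text.
VARIANT_WINDOW = "TTACACTAGCTATCCTTACTG"
# ASSUMPTION: reference window around position 26340, i.e. the same
# 21 bases with a C in place of the variant T. Verify against MN908947.3.
REFERENCE_WINDOW = "TTACACTAGCCATCCTTACTG"


def mismatch_positions(a: str, b: str) -> List[int]:
    """Return 0-based indices at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


positions = mismatch_positions(REFERENCE_WINDOW, VARIANT_WINDOW)
for i in positions:
    # Report each difference as reference base > variant base.
    print(f"index {i}: {REFERENCE_WINDOW[i]}>{VARIANT_WINDOW[i]}")
```

Under that assumption, an oligonucleotide designed against the reference window would carry a single internal mismatch against the variant. Whether one mismatch abolishes binding depends on the oligonucleotide design, reagents, and cycling conditions, which this comparison does not model.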
Unfortunately, the positions of the primers and probes utilized in the cobas E gene assay (Roche) are not publicly available; nevertheless, it is parsimonious to assume that this SNP is the cause of the failure of the E gene qRT-PCR implemented in the cobas system.

Phylogeny. Out of the 229 SARS-CoV-2 genomes we have sequenced at the time of writing, eight carry the SNP at position 26340. To see if the same variant was circulating more widely, we examined the SARS-CoV-2 sequences deposited in GISAID for a variant at the same position. When only complete, high-coverage genomes are considered (20,386 genomes), 18 were found to carry a C-to-U transition at position 26340 (0.09%). Eight of these were sequenced by us, seven were isolated in England, two in Switzerland, and one in Turkey. As can be seen in Fig. 2, viruses isolated in the same country cluster together; however, they do not cluster with viruses from other countries carrying the SNP at position 26340. We also classified the viral genomes according to the nomenclature proposed by Rambaut et al. (19). Table 2 shows that samples isolated in the same country belong to the same lineage, with no overlap in lineage between countries. As a consequence, it appears that this variant has arisen multiple times in different transmission chains (homoplasic site). Finally, we relaxed the filtering of viral genomes, selecting genomes of >29,000 bases in length and with less than 5% Ns (we no longer required the virus to be classified as high coverage). This added 9,313 genomes (29,699 in total) and revealed eight additional viruses carrying a C-to-U transition at 26340 (0.09%) (Fig. S3). Of these, six were isolated in England; four clustered with the previous English samples, while the other two fell in different parts of the tree. Of the remaining two viruses, one was isolated in Australia and the second was sequenced in Luxembourg.
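The variant frequencies quoted above follow directly from the counts given in the text, as the short check below illustrates. The `percent` helper is a hypothetical name, and the combined figure is derived here from those counts rather than reported in the text.

```python
def percent(count: int, total: int, digits: int = 2) -> float:
    """Return count/total as a percentage rounded to the given digits."""
    return round(100 * count / total, digits)


# Counts taken from the text.
high_coverage_total = 20386   # complete, high-coverage genomes
high_coverage_hits = 18       # genomes with the C-to-U transition at 26340
added_total = 9313            # genomes added by the relaxed filter
added_hits = 8                # additional genomes carrying the variant

print(percent(high_coverage_hits, high_coverage_total))   # 0.09
print(percent(added_hits, added_total))                   # 0.09
# Derived here, not stated in the text: frequency across both sets combined.
print(percent(high_coverage_hits + added_hits,
              high_coverage_total + added_total))          # 0.09
```

The two data sets sum to the 29,699 genomes reported for the relaxed filter, so the variant frequency stays at roughly 0.09% whichever denominator is used.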
Interestingly, the Luxembourg virus clustered with the samples identified by us and was assigned to the same B.3 lineage, suggesting it is part of the same cluster of infections.

As the positions of the primers and probes used in the cobas (Roche) E gene qRT-PCR have not been disclosed to us upon request, we cannot definitively conclude that the C-to-U transition at position 26340 of the SARS-CoV-2 genome causes the failure in the E gene qRT-PCR in the patients examined. However, given the available data, causality appears likely. The cobas E gene qRT-PCR may use a primer-probe combination that is more sensitive to the presence of the SNP than the Corman et al. (9) E gene assay. Alternatively, it may target the same positions, but differences in reagents used and cycling conditions may prevent binding of the probe in the presence of the SNP. The E gene qRT-PCR implemented in the cobas assay is intended to facilitate pan-Sarbecovirus detection (20). However, a U is found at the same relative position in the SARS-CoV-1 MA15 isolate P3pp5 (GenBank accession no. FJ882961.1) and in the bat coronavirus Cp/Yunnan2011 (GenBank accession no. JX993988.1). This suggests that similar variability occurs at this position in other coronaviruses, which could impair the effectiveness of the assay for pan-Sarbecovirus detection.

It should be stressed that despite the failure of the E gene qRT-PCR in these patients, the cobas assay correctly called these individuals as positive for SARS-CoV-2 due to the ORF1ab qRT-PCR. This highlights the prudence of targeting more than one position in the viral genome in a diagnostic assay. The Corman et al. (9) protocol recommends the use of its E gene assay as a first-line screening tool, with confirmatory testing using the RdRP gene assay (9). This SNP does not affect the Corman et al. (9) E gene qRT-PCR in our hands; however, our results highlight how a mutation in the virus can generate a false negative in a single qRT-PCR.
In most cases these mutations will be rare; however, as our examination of the GISAID data has shown, such mutations have the potential to arise independently in separate transmission chains. Recently, Vogels et al. (21) examined the efficiency as well as frequency of variants impacting a number of the qRT-PCRs commonly used for SARS-CoV-2 testing. They found a number of variants that fell within the primer and probe binding sites, with the majority present at a low frequency and involving only a single base. A prominent exception involved a GGG→AAC mutation at genome positions 28881 to 28883 that overlaps the first three bases of the 5′ end of the Chinese CDC N gene forward primer (7). This mutation is found in approximately 25% of the viruses on GISAID (accessed 25 May 2020). As the Chinese CDC assay also includes an ORF1ab qRT-PCR, viruses carrying this variant will still be detected, even if this variant impairs the N gene qRT-PCR. Nevertheless, given the high frequency of this variant, it would appear prudent to avoid using this qRT-PCR primer.

This work shows the danger of relying on an assay targeting a single position in the viral genome. It also highlights the utility of combining testing with rapid sequencing of a subset of the positive samples, especially in cases where one of the qRT-PCRs fails. The sequencing allowed us to pinpoint the likely reason behind the failure of the E gene qRT-PCR. The identification of viruses carrying additional mutations as well as the clustering of the Luxembourg virus with the Belgian viruses also suggests that only a fraction of the viruses carrying this variant came to our attention. This emphasizes that while the variant is at a low frequency globally, at the local level its frequency could be much higher. This example shows that it remains prudent to continue monitoring viral genomes for variants that can negatively impact this and other diagnostic assays.
Finally, it would be preferable if manufacturers were transparent about the primers and probes used, as this would allow problematic variants to be more readily identified from the available viral sequences.

Supplemental material is available online only. SUPPLEMENTAL FILE 1, PDF file, 2.3 MB.

1. China Novel Coronavirus Investigating and Research Team. 2020. A novel coronavirus from patients with pneumonia in China
2. A new coronavirus associated with human respiratory disease in China
3. Coronavirus disease 2019 (COVID-19) situation report-51
4. Coronavirus disease 2019 (covid-19) situation report 123
5. Identification of a novel coronavirus in patients with severe acute respiratory syndrome
6. Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
7. Specific primers and probes for detection 2019 novel coronavirus
8. COVID-19) real-time RT-PCR primer and probe
9. Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
10. CDC SARS-CoV-2 sequencing guide
11. GISAID: global initiative on sharing all influenza data-from vision to reality
12. Nextstrain: real-time tracking of pathogen evolution
13. A phylodynamic workflow to rapidly gain insights into the dispersal history and dynamics of SARS-CoV-2 lineages
14. Implementation of mitigation strategies for communities with local covid-19 transmission
15. High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2
16. Coronaviruses: an RNA proofreading machine regulates replication fidelity and diversity
17. SARS-CoV-2 resources and documents
18. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
19. A dynamic nomenclature proposal for SARS-CoV-2 to assist genomic epidemiology
20. Roche Holding AG. 2020. cobas SARS-CoV-2 user manual. Roche Holding AG
21. Analytical sensitivity and efficiency comparisons of SARS-COV-2 qRT-PCR assays

This work was supported by the Région Wallonne project WALGEMED (convention no. 1710180) and the FNRS (H.C.008.20). We thank the laboratories who submitted to and shared their sequences with GISAID. We also thank the members of the GIGA-Genomic platform for the Sanger sequencing. Thanks to Lize Cuypers (KU Leuven) for carrying out confirmation qRT-PCR assays and Josh Quick (University of Birmingham) for providing an aliquot of the V2 Artic Network primers. Finally, we thank the reviewers of the manuscript for their helpful comments and suggestions.