key: cord-0969538-2w6mi69s
authors: Gand, Mathieu; Vanneste, Kevin; Thomas, Isabelle; Van Gucht, Steven; Capron, Arnaud; Herman, Philippe; Roosens, Nancy H. C.; De Keersmaecker, Sigrid C. J.
title: Deepening of In Silico Evaluation of SARS-CoV-2 Detection RT-qPCR Assays in the Context of New Variants
date: 2021-04-13
journal: Genes (Basel)
DOI: 10.3390/genes12040565
sha: bd61a8ef684137dd62f2b590f67b93c7910e0577
doc_id: 969538
cord_uid: 2w6mi69s

For 1 year now, the world has been undergoing a coronavirus disease-2019 (COVID-19) pandemic due to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The most widely used method for COVID-19 diagnosis is the detection of viral RNA by RT-qPCR with a specific set of primers and probe. It is important to frequently evaluate the performance of these tests, and this can be done first by an in silico approach. Previously, we reported some mismatches between the oligonucleotides of publicly available RT-qPCR assays and SARS-CoV-2 genomes collected from GISAID and NCBI, potentially impacting proper detection of the virus. In the present study, 11 primers and probe sets investigated during the first study were evaluated again with 84,305 new SARS-CoV-2 unique genomes collected between June 2020 and January 2021. The already low inclusivity of the China CDC assay targeting the gene N has continued to decrease with new mismatches detected, whereas the other evaluated assays kept their inclusivity above 99%. Additionally, some mutations specific to new SARS-CoV-2 variants of concern were found to be located in oligonucleotide annealing sites. This might impact the strategy to be considered for future SARS-CoV-2 testing. Given the potential threat of the new variants, it is crucial to assess whether they can still be correctly targeted by the primers and probes of the RT-qPCR assays. Our study highlights that, considering the evolution of the virus and the emergence of new variants, an in silico (re-)evaluation should be performed on a regular basis. Ideally, this should be done for all the RT-qPCR assays employed for SARS-CoV-2 detection, including commercial tests, although the primer and probe sequences used in these kits are rarely disclosed, which impedes independent performance evaluation.

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the coronavirus disease 2019 (COVID-19). It emerged at the end of 2019 in Wuhan (China) and spread globally in 2020, leading to a massive pandemic that is still ongoing. This potentially life-threatening new coronavirus was estimated to have already caused 2,728,732 deaths in 192 countries and 123,968,726 confirmed COVID-19 cases (COVID-19 dashboard accessed 23 March 2021 [1]), putting health care systems under severe pressure [2] [3] [4] [5].

Regarding the burden of SARS-CoV-2, many countries have set in place control measures "of concern" by the scientific community and public health authorities because they are linked to multiple amino-acid changes in the S protein, with some of them (K417N, K417T, E484K, and N501Y) located in the Receptor Binding Domain (RBD), the main functional motif interacting with the human Angiotensin-Converting Enzyme 2 (ACE2) receptor for cell entry [36, 37] . These changes are thought to improve the interaction of SARS-CoV-2 with human cells, in line with the epidemiological data showing a sudden rise of COVID-19 cases in UK and SA, associated with the prevalence of their respective new variants [30] [31] [32] [33] 38] . In addition to this enhanced transmissibility, concerns exist regarding the immunological response and vaccine efficiency, as the S protein is the primary target of neutralizing antibodies and the currently distributed vaccines [38] [39] [40] [41] .

Given the potential threat of the new variants 20I/501Y.V1, 20H/501Y.V2, and 20J/501Y.V3, it is crucial to assess whether these can be correctly detected by the RT-qPCR assays currently used for COVID-19 diagnosis. As these 3 variants carry several mutations, if some of them are located in oligonucleotide annealing sites, the resulting mismatches can lead to test failure or loss of sensitivity [18] [19] [20] [21]. However, this kind of specificity (inclusivity) evaluation can be difficult to perform in the wet-lab, as SARS-CoV-2 is a new virus and no laboratory has a complete representative collection of circulating strains, including all new emerging variants. To overcome this limitation, bioinformatics tools were previously used to perform an in silico specificity evaluation, as a first step of a full evaluation process. This was done by taking advantage of the huge sequencing efforts that have made large amounts of Whole Genome Sequencing (WGS) data available in public databases such as NCBI and GISAID [42] [43] [44]. Ideally, this kind of in silico evaluation has to be performed on a frequent basis, considering that new genomes are continuously uploaded into the databases, and especially when new SARS-CoV-2 VOC are identified. In addition, not only the effects of the mutations defining specific VOC need to be investigated; the impact of all the mutations present in the variant genomes should also be taken into account, including nucleotide changes that were slowly acquired through time or emerged independently and are not representative of the variant population.

We previously used a BLAST-based user-friendly open-access bioinformatics tool named "SCREENED" (polymeraSe Chain Reaction Evaluation through largE-scale miNing of gEnomic Data) [45] to investigate mismatches between 30 primers and probe sets and large amounts of WGS data downloaded from the databases cited above [46] . For each oligonucleotide set and analyzed genomes, SCREENED generates mismatch scores and estimates the production of a positive or negative theoretical RT-qPCR signal according to the total number of mutations present in the annealing sites as well as their positions. In the present study, a selection of these primers and probe sets were evaluated again for their inclusivity using SCREENED, with 84,305 new SARS-CoV-2 unique genomes collected between June 2020 and January 2021. Additionally, a specific focus was put on the new variants 20I/501Y.V1, 20H/501Y.V2 and 20J/501Y.V3 by investigating the effect of their mutations on the evaluated assays.

On 7 January 2021, 31,244 and 199,600 SARS-CoV-2 genomes, coming from samples collected since 7 June 2020, were downloaded from the NCBI Virus (https://www.ncbi.nlm.nih.gov/labs/virus/, accessed on 7 January 2021; Supplementary Materials File S1) and GISAID EpiCoV (https://www.epicov.org, accessed on 7 January 2021; Supplementary Materials File S2) databases, respectively. When downloading from GISAID, only the complete genomes coming from human samples were selected, and the low coverage genomes were excluded. NCBI genomes were complete sequences with "SARS-CoV-2" as the species (taxid: 2697049) and Homo sapiens as the "host" (taxid: 9606).

For the GISAID genomes, the lineage assignment was extracted from the associated metadata in the database (Supplementary Materials File S2). For the NCBI genomes, lineage assignment was performed using the tool Pangolin (version 2.1.10; https://github.com/cov-lineages/pangolin, accessed on 7 January 2021; [47]) with pangoLEARN 02-01-2021 and default parameters (Supplementary Materials File S1). From the total number of downloaded genomes, 8860 belonged to lineage B.1.1.7, i.e., 20I/501Y.V1, and 366 to lineage B.1.351, i.e., 20H/501Y.V2. None of the downloaded genomes were determined to belong to lineage P.1, i.e., 20J/501Y.V3.

From the downloaded dataset, genomes showing more than one undetermined nucleotide "N" in their sequences were discarded, to retain only high-quality genomes (154,602). Finally, to avoid redundancies in the dataset, all the identical genomes were clustered together using CD-HIT-EST v4.6.8 (https://github.com/weizhongli/cdhit, accessed on 7 January 2021; [48, 49] ) with sequence identity cut-off equal to 1.0 (other parameters were left at default settings). Only the representative genomes (84,305; Supplementary Materials File S3) of each cluster were used for further analyses.
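The two filtering steps described above (discarding genomes with more than one undetermined nucleotide, then collapsing identical genomes) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function name `filter_and_deduplicate` is hypothetical, and exact string identity is used here as a simplified stand-in for the CD-HIT-EST clustering at a sequence identity cut-off of 1.0.

```python
from typing import Dict, Iterable, Tuple


def filter_and_deduplicate(
    records: Iterable[Tuple[str, str]], max_n: int = 1
) -> Dict[str, str]:
    """Keep high-quality genomes and one representative per identical sequence.

    records: (genome_id, sequence) pairs.
    max_n:   maximum number of undetermined nucleotides "N" tolerated;
             genomes with more than this number are discarded.

    Assumption: exact string identity replaces CD-HIT-EST clustering here.
    """
    representative_by_sequence: Dict[str, str] = {}
    for genome_id, sequence in records:
        sequence = sequence.upper()
        if sequence.count("N") > max_n:
            continue  # low-quality genome, discarded
        # The first genome seen with a given sequence becomes the representative.
        representative_by_sequence.setdefault(sequence, genome_id)
    return {gid: seq for seq, gid in representative_by_sequence.items()}


toy_records = [
    ("g1", "ACGTACGT"),
    ("g2", "ACGTACGT"),  # identical to g1, collapsed
    ("g3", "ACNNACGT"),  # two "N", discarded
    ("g4", "ACNTACGT"),  # one "N", kept
]
representatives = filter_and_deduplicate(toy_records)
print(sorted(representatives))
```

On the toy input, only `g1` and `g4` remain as representatives.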

To determine the theoretical production of RT-qPCR signals, SCREENED v1.0 [45] was used as described in our previous study, with identical settings. Briefly, SCREENED performs a two-step BLAST approach, first to retrieve from each genome the complete amplicon sequence targeted by the evaluated primers and probe sets, and secondly to produce mismatch statistics from the hybridization between these oligonucleotides and their corresponding annealing sites in the amplicon. For more details, we refer to [45, 46]. In the present study, if no mismatch was detected in the first 5 nucleotides of the primers' 3' end, if the total number of mismatches did not exceed 10% of the oligonucleotide length, and if at least 90% of the oligonucleotide sequence aligned correctly with its target, SCREENED considered that a positive RT-qPCR signal was produced. These criteria were selected according to what is generally described in the scientific literature for mismatches potentially affecting the performance of PCR-like methods [18] [19] [20] [21]. Considering the primers and probe sets investigated in this study, none reaching a length of 30 nucleotides except for the forward primer of Assay 8 S (Table 1), this meant that no more than 1-2 mismatches were tolerated. For the Assay 8 S forward primer, with a length of 30 nucleotides, no more than 3 mismatches were tolerated. Finally, greedy clustering of the amplicon was enabled as an option in SCREENED.

Table 1 notes. *: to allow comparisons with the study of Gand et al., 2020 [46], the assay numbering used in this previous study was conserved. **: starting and ending position of the sequence amplified by the corresponding forward and reverse primers, in the NCBI SARS-CoV-2 reference sequence NC_045512. ***: this primers and probe set is also used in the RT-qPCR test from Institut Pasteur Paris (France) [14]. Fw: forward primer; Rv: reverse primer; P: probe.
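The three decision criteria stated in this section can be sketched as a small Python function. This is an illustration of the rules as worded in the text, not SCREENED's actual code: the function name, its parameters, and the choice to apply the 3'-end rule to primers only (the text mentions "primers' 3' end") are assumptions.

```python
from typing import Sequence


def predicts_positive_signal(
    oligo_length: int,
    mismatch_positions: Sequence[int],
    aligned_length: int,
    is_primer: bool,
) -> bool:
    """Return True if the stated criteria predict a positive RT-qPCR signal.

    mismatch_positions: 0-based indices counted from the 5' end of the
    oligonucleotide (hypothetical convention chosen for this sketch).
    """
    # Criterion 1 (assumed to apply to primers only): no mismatch within
    # the last 5 nucleotides at the 3' end.
    if is_primer and any(pos >= oligo_length - 5 for pos in mismatch_positions):
        return False
    # Criterion 2: total mismatches must not exceed 10% of the oligo length
    # (integer arithmetic avoids floating-point rounding issues).
    if 10 * len(mismatch_positions) > oligo_length:
        return False
    # Criterion 3: at least 90% of the oligo must align with its target.
    if 10 * aligned_length < 9 * oligo_length:
        return False
    return True


# A 30-nt primer tolerates 3 mismatches away from the 3' end ...
print(predicts_positive_signal(30, [0, 1, 2], 30, is_primer=True))
# ... whereas a 22-nt primer (illustrative length) does not.
print(predicts_positive_signal(22, [0, 1, 2], 22, is_primer=True))
```

Under these rules, the 10% threshold allows 3 mismatches for a 30-nucleotide oligonucleotide, 2 for lengths of 20 to 29 nucleotides, and 1 for lengths of 10 to 19 nucleotides, which matches the "1-2 mismatches" stated above for oligonucleotides shorter than 30 nucleotides.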

As input, SCREENED used a FASTA file containing the 84,305 representative SARS-CoV-2 sequences (Supplementary Materials File S3) and a tab-delimited text file containing the sequences of the primers and probes to be evaluated and their corresponding amplicon sequence to be mined in the genomes (Supplementary Materials File S4).

As only SARS-CoV-2 genomes were used in our study for the evaluation of COVID-19 diagnostic RT-qPCR assays, every negative signal reported by SCREENED was considered as a theoretical False Negative (FN) result and used for in silico inclusivity evaluation as follows (1):

$$\text{Inclusivity (\%)} = \left(1 - \frac{\text{FN}}{\text{total number of genomes analyzed}}\right) \times 100 \qquad (1)$$
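As a numerical check of the inclusivity calculation, the following minimal Python sketch assumes that inclusivity is the percentage of analyzed genomes that did not yield a theoretical FN result. The function name is hypothetical; the input figures are those reported in this study for Assay 1 N (30,445 FN out of 84,305 genomes).

```python
def inclusivity_percent(false_negatives: int, total_genomes: int) -> float:
    """Percentage of genomes predicted to give a positive signal.

    Assumption: inclusivity = (1 - FN / total) * 100, which reproduces
    the percentages reported in the text.
    """
    if total_genomes <= 0:
        raise ValueError("total_genomes must be positive")
    if not 0 <= false_negatives <= total_genomes:
        raise ValueError("false_negatives must lie between 0 and total_genomes")
    return 100.0 * (1 - false_negatives / total_genomes)


# Assay 1 N: 30,445 theoretical FN results among 84,305 representative genomes.
print(round(inclusivity_percent(30_445, 84_305), 2))  # 63.89
```

The same function gives 99.99% for the 8 FN results of Assay 2 E, in agreement with the reported values.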

Only inclusivity was assessed here as exclusivity was already verified during our previous evaluation with genomes belonging to other coronaviruses and common respiratory viruses, as well as the human reference genome, and would not change for the same RT-qPCR assays evaluated [46] .

In May 2020, the in silico specificity of 30 primers and probe sets was investigated with SCREENED [46]. In the current study, 11 of these sets, listed in Table 1, were evaluated again for their inclusivity with a new dataset of 84,305 representative SARS-CoV-2 genomes (obtained from 230,844 SARS-CoV-2 sequences; see Materials and Methods section) coming from samples collected between 7 June 2020 and 7 January 2021. These 11 oligonucleotide sets were selected because they belong to RT-qPCR tests commonly used as reference methods for comparison studies (Assay 2 RdRp-P2, Assay 2 E, Assay 2 N, Assay 3 RdRp_IP2, Assay 3 RdRp_IP4, Assay 4 N-1, Assay 4 N-2, and Assay 4 N-3) [52] [53] [54] [55] [56] [57] [58] [59], or because they showed the best (Assay 8 S and Assay 9 ORF1a) and worst (Assay 1 N) specificity results during our first evaluation in May 2020 [46].

From the 84,305 representative SARS-CoV-2 genomes analyzed, 30,445 gave a negative theoretical RT-qPCR signal when evaluating the primers and probe set of Assay 1 N, resulting in an inclusivity of 63.89%. This low inclusivity was mostly due to a 3-nucleotide substitution (GGG to AAC) in the 5' end of the forward primer, as reported previously [46], sometimes in combination with other nucleotide changes. In total, 64 different combinations of substitutions and deletions in the forward primer sequence were reported by SCREENED as potentially leading to Assay 1 N failure (Supplementary Material File S5). Furthermore, the AAC substitution was found in all the analyzed 20I/501Y.V1 genomes, which represented 13% of the FN results obtained with Assay 1 N. In contrast, all other assays showed inclusivity results above 99%, as previously obtained [46], with only between 8 (Assay 2 E) and 833 (Assay 4 N-2) negative theoretical RT-qPCR signals. The 3 best inclusivity results were obtained for Assay 2 E (99.99%), Assay 8 S (99.97%), and Assay 3 RdRp_IP4 (99.95%) (Table 2). Nevertheless, amongst these 3 best assays, only Assay 2 E and Assay 3 RdRp_IP4 were estimated to correctly detect all the investigated variant genomes included in the analysis.

Table 2 (only the last row is recoverable here). Assay 9 ORF1a: 95 FN † (0% belonging to new variants); inclusivity 99.89% *; inclusivity 100% **.

Table 2 notes. †: number of representative genomes that produced a theoretical negative RT-qPCR signal according to the SCREENED settings (detailed in Section 2.4) and, consequently, considered as false negative (FN). The percentage of the genomes resulting in negative results and belonging to one of the new variants is indicated between brackets. All turned out to belong to the B.1.1.7 lineage (20I/501Y.V1). *: results obtained in the present study with 84,305 representative SARS-CoV-2 genomes collected between 7 June 2020 and 7 January 2021. **: results obtained in the previous study with 2569 representative SARS-CoV-2 genomes collected up to 7 April 2020 [46].

Except for the Assay 8 S forward primer, all the primers and probes evaluated here had a length below 30 nucleotides (Table 1) , which means that no more than 1-2 mismatches could be tolerated for these oligonucleotides according to the applied SCREENED criteria (see Section 2.4). The forward primer of Assay 8 S is composed of 30 nucleotides and 3 mismatches would still result in a positive RT-qPCR theoretical signal with SCREENED.

As it was already demonstrated that more than 2 mismatches can potentially impact the performance of PCR-based methods [19, 21] , the number of mismatches for the Assay 8 S forward primer was investigated in the detailed output data produced by SCREENED (data not shown). The detailed SCREENED results showed that no more than one mismatch was reported for this oligonucleotide sequence over all analyzed genomes, thus confirming the excellent inclusivity of this assay.

In addition to the production of mismatch statistics, SCREENED allowed clustering of the sequences amplified by the evaluated primer sets from all analyzed genomes. Table 3 shows the total number of clusters obtained per set and the distribution of the genomes over the first 3 clusters, ordered from largest to smallest. With these data, the level of conservation of the amplicons targeted by the different assays can be assessed. The highest number of generated amplicon clusters (496) was observed for Assay 1 N, with the analyzed genomes distributed over 3 main clusters, each containing between 26.9% and 30.3% of all the WGS data, thus illustrating the high level of diversity in this region of the SARS-CoV-2 genome. In contrast, for the other investigated assays, the majority of the analyzed genomes were clustered together in one main cluster, i.e., the first cluster, which contained between 91.1% (Assay 9 ORF1a) and 98.6% (Assay 8 S) of the genomes. It can be noticed that the second cluster of Assay 9 ORF1a contained 4.8% of the analyzed genomes, whereas no more than 1.8% were included in the second cluster of Assays 2 to 8, all targets considered. This Assay 9 ORF1a second cluster contained almost solely genomes sequenced in England and Wales during the end of 2020 and belonging to Pangolin lineage B.1.1.7, the specific lineage of SARS-CoV-2 variant 20I/501Y.V1 (Supplementary Materials File S6).

Table 3 notes. *: number of amplicon clusters produced by SCREENED for each evaluated primers and probe set. **: distribution of the number of amplicons among the 3 largest clusters for each evaluated primers and probe set.
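The cluster statistics discussed in this section (number of clusters and share of genomes in the 3 largest clusters) can be approximated with a short Python sketch. Grouping by exact sequence identity is an assumption made for simplicity and is not the greedy clustering performed by SCREENED; the function name and the toy amplicons are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def cluster_distribution(
    amplicons: Iterable[str], top: int = 3
) -> Tuple[int, List[Tuple[str, float]]]:
    """Group identical amplicon sequences and report the largest groups.

    Returns the number of distinct groups and, for the `top` largest ones,
    the representative sequence with its percentage of all amplicons.
    """
    counts = Counter(amplicons)
    total = sum(counts.values())
    if total == 0:
        return 0, []
    shares = [(seq, 100.0 * n / total) for seq, n in counts.most_common(top)]
    return len(counts), shares


# Toy data: a well-conserved amplicon with two minor variants.
toy_amplicons = ["ACGTAC"] * 8 + ["ACGTAT"] + ["ACTTAC"]
n_clusters, top_shares = cluster_distribution(toy_amplicons)
print(n_clusters, top_shares)
```

A single dominant group, as in the toy data, corresponds to a conserved target region, whereas several groups of similar size (as reported for Assay 1 N) indicate a diverse region.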

As 20I/501Y.V1, 20H/501Y.V2, and 20J/501Y.V3 were recently identified as SARS-CoV-2 VOC, at the time of analysis, they were not well represented in the dataset downloaded from GISAID and NCBI, which was used for the specificity evaluation performed in this study using SCREENED (Section 3.1). Genomes belonging to Pangolin lineage B. (Table 1) . When this was the case, their presence in the primers' and probes' annealing sites, in addition to their potential impact on the RT-qPCR outcome according to the criteria used for the SCREENED analysis (see Section 2.4), was investigated.

Six nucleotide changes, C3267T and C28977T from variant 20I/501Y.V1, G22813T and C28887T from variant 20H/501Y.V2, and C12778T and A22812C from variant 20J/501Y.V3, were found in sequences amplified by oligonucleotide sets evaluated in this study (Table 4). The 2 mutations C3267T and C12778T were found in the amplicons of Assay 9 ORF1a and Assay 3 RdRp_IP2, respectively, but with no impact on the tests' outcome because they were located in neither the primers' nor the probe's annealing sites. Interestingly, the substitution C3267T of variant 20I/501Y.V1 is the nucleotide change defining the amplicon representative sequence of the second cluster of Assay 9 ORF1a (Table 3), which is in line with the content of this cluster, made up mostly of genomes sequenced in the UK and belonging to lineage B.1.1.7 (Supplementary Materials File S6). The mutations A22812C and G22813T, responsible for 2 amino acid changes, i.e., K417T and K417N, located in the RBD of the S protein, were found in the probe annealing site of Assay 8 S. Finally, the nucleotide substitutions C28887T and C28977T were reported to be located in the forward and reverse primer sequences of Assay 1 N, respectively. Surprisingly, another study [42] evaluating the impact of variants' mutations on publicly available assays (including Assay 1 N) with BLAST and 20I/501Y.V1 genome EPI_ISL_744131, did not report the mismatch in the Assay 1 N forward primer due to the C28977T mutation. To better understand the reason for these inconsistent results, the same BLAST analysis was reproduced. This indicated an issue with the manual interpretation of the BLAST output in the other study [42], as in the reproduction of their analysis, we could not find a perfect match between the complete forward primer sequence of Assay 1 N and EPI_ISL_744131 (Supplementary Materials File S8). Considered alone, the mutations defining the new variants were, based on the SCREENED criteria, not estimated to impact the outcome of the evaluated RT-qPCR assays (Table 4). Nevertheless, it can be noticed that the 20I/501Y.V1 mutation C28977T, present in the reverse primer of Assay 1 N, was found to be combined with another non-variant-specific mutation in four B.1.1.7 genomes (Supplementary Materials File S5), resulting in FN results when evaluating Assay 1 N with SCREENED (Section 3.1). Moreover, during the SCREENED analysis (Section 3.1), some variant genomes led to FN results (Table 2) due to mutations other than those defining these new variants (Supplementary Materials File S7). For instance, the 3-nucleotide substitution (GGG to AAC) observed in the Assay 1 N forward sequence was found in all the 20I/501Y.V1 genomes but is not specific to this variant (Supplementary Materials Files S5 and S7).
This demonstrates that despite the fact that variant defining mutations were not estimated to impact RT-qPCR outcome on their own, the variants can carry additional mutations, all together potentially impacting the test's performance.
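Checking whether a variant-defining substitution falls inside an amplicon or an oligonucleotide annealing site amounts to a simple coordinate comparison, sketched below in Python. The site coordinates and mutation labels in this example are invented toy values, not the positions of any assay evaluated in this study, and the function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as "C105T" into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if match is None:
        raise ValueError(f"not a nucleotide substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def sites_hit(position: int, sites: Dict[str, Tuple[int, int]]) -> List[str]:
    """Names of the sites whose 1-based inclusive range contains the position."""
    return [name for name, (start, end) in sites.items() if start <= position <= end]


# Toy coordinates (hypothetical): one amplicon with its three annealing sites.
toy_sites = {
    "amplicon": (100, 200),
    "forward primer": (100, 121),
    "probe": (140, 165),
    "reverse primer": (179, 200),
}

for label in ("C105T", "G130T", "A250C"):
    _, position, _ = parse_substitution(label)
    print(label, sites_hit(position, toy_sites))
```

A mutation reported only for the amplicon (such as the toy `G130T`) lies between the annealing sites and would not create a mismatch with the primers or probe, which is the situation described above for C3267T and C12778T.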

As some RT-qPCR tests for SARS-CoV-2 detection were developed early in the pandemic based on WGS data available at that time, there is a need to periodically evaluate whether these assays are still performant to detect the virus that has evolved since its first occurrence 1 year ago. As this kind of specificity evaluation would not be feasible in the wet-lab due to the lack of a representative strains collection, in the present study, this was performed in silico for 11 primers and probe sets using the bioinformatics tool SCREENED and 84,305 representative SARS-CoV-2 genomes obtained from GISAID and NCBI. The WGS data used in this study were obtained from samples collected between 7 June 2020 and 7 January 2021. Therefore, this allowed comparison with our previous study that evaluated the same 11 primers and probe sets between April and May 2020 with WGS data available at that time and determined Assay 1 N and Assay 8 S as the least and most specific assays, respectively [46] .

The Assay 1 from China CDC targeting the gene N was once again the one showing the lowest inclusivity (63.89%), which continued to decrease since April (86.03%) and May (74.54%) 2020. This low score is again mostly due to a substitution of 3 nucleotides (GGG to AAC) in the 5' end of the forward primer. Nevertheless, although in our first study only 4 different combinations of nucleotide substitutions were reported in the forward primer of Assay 1 N, 64 combinations of substitutions and even deletions were identified in the same sequence with the new dataset until January 2021. This, in combination with the diversity observed in the amplicon sequences of this assay, clearly demonstrates how the accumulation of mutations in some parts of the SARS-CoV-2 genome can dramatically affect the specificity of an RT-qPCR test. In comparison, the 10 other evaluated primers and probe sets, including those from the widely used tests developed by the Charité Hospital, Institut Pasteur Paris and US CDC, retained their high inclusivity above 99%. The Assay 8 S, determined as the best assay during our first evaluation, showed the second highest inclusivity result (99.97%) after Assay 2 E (99.99%), with a high level of amplicon conservation, illustrated by one major cluster containing 98.6% of the amplified sequences.

The data generated with SCREENED, for the 11 primers and probe sets evaluated in this study, showed how the virus evolution can potentially impact the performance of RT-qPCR assays used for SARS-CoV-2 detection. Recently, this virus evolution took a new turn with the emergence of new SARS-CoV-2 variants. At the time we started this analysis, three main new variants, i.e., 20I/501Y.V1, 20H/501Y.V2, and 20J/501Y.V3, were reported to carry an abnormal number of mutations, with some resulting in an estimated enhanced transmissibility and concerns about the effects on the immunological response and vaccine efficiency [32] [33] [34] 38, 40] . Four of these mutations, among which two are of concern because they effectuate amino acid changes (K417T and K417N) in the RBD of the S protein, were found to be located in oligonucleotide sequences of the China CDC assay targeting gene N (Assay 1 N) and the Chan et al. assay targeting gene S (Assay 8 S) evaluated in this study. Although these mutations were not estimated to cause a total test failure, they might affect the sensitivity of the RT-qPCR. Furthermore, it cannot be excluded that the variants will continue to evolve and acquire additional nucleotide changes, which can impact the test's performance when combined to their lineage-specific mutations. This was already shown in the present study for four 20I/501Y.V1 genomes in which the C28977T variant mutation was combined with other nucleotide changes, leading to FN results. Additionally, some nucleotide changes previously acquired by the variant lineages (such as the GGG to AAC change in the Assay 1 N forward sequence), and not specific to those variants, can also lead to FN results. Therefore, when evaluating if primers and probe sets can still cover variant detection, it is also important to consider all the nucleotide changes that can be present in the variant population, as done in the present study. 
As soon as more variant genomes, especially those belonging to 20H/501Y.V2 and 20J/501Y.V3, become available in the databases, the current analytical procedure should be reproduced, preferably with dedicated datasets per variant and in a more dynamic set-up (e.g., dataset per variant, per month). Furthermore, with the efforts made to improve the global surveillance of circulating SARS-CoV-2 strains, an increasing number of VOC will be identified, and genomes belonging to these should be included in future in silico specificity evaluation studies as well.

The most accurate RT-qPCR test is required for proper detection of SARS-CoV-2, including its variants of concern, for COVID-19 diagnosis and SARS-CoV-2 surveillance. Concerning the assays for which a low inclusivity was obtained (Assay 1 N) or for which mutations belonging to variants were identified in the sequence of their oligonucleotides (Assay 1 N and Assay 8 S), correcting them could be considered to improve their specificity for SARS-CoV-2 detection. However, this would require modifying the concerned primers and probes with degenerate nucleotides (taking into account all the possibilities; Supplementary Material File S5), with the aim of being specific to both SARS-CoV-2 variants and historical strains, and validating these new assays experimentally. To avoid this extra work, and also to avoid targeting regions affected by mutations, we would rather recommend using Assay 2 E and Assay 3 RdRp_IP4, determined as the best ones based on the data obtained in this study. These 2 assays showed an excellent inclusivity, a high level of conservation in their amplicon sequence, and no variant mutations in the annealing sites of their primers and probes. Assay 3 RdRp_IP4 was initially designed to be strictly specific to SARS-CoV-2, whereas Assay 2 E has a broader intended specificity to Sarbecovirus and is usually used for confirmation of results [14, 46]. These 2 assays could be included in a dual-target RT-qPCR test for reliable detection of SARS-CoV-2, also considering that the use of 2 molecular markers in an RT-qPCR test is usually recommended to lower the probability of incorrect results in case of mutational drift in one of the targets. Moreover, RT-qPCR detection is also a good approach for SARS-CoV-2 variant surveillance and a good alternative to WGS, which is more expensive and time consuming.
This surveillance is definitely needed to monitor the spread of variants having mutations potentially impacting vaccine efficiency, which might increase once the vaccination reaches full speed. However, it would be more meaningful to develop RT-qPCR assays targeting some key mutations, e.g., E484K and N501Y, that have been already identified as being of concern based on epidemiological and experimental data [38, 40] , rather than identifying the variants themselves according to their lineage or region of emergence. This would also be more efficient as more variant lineages, and other VOC, are expected to emerge in the future [60] . To develop such tests, SCREENED can be employed to produce evidence-based data from thousands of SARS-CoV-2 genomes available in NCBI and GISAID, to evaluate the in silico specificity of the designed primers and probes targeting mutations of concern (or significant mutations).

In the present study, potential FN RT-qPCR results were predicted by SCREENED, based on mismatch scores, for oligonucleotide sets coming from publicly available assays, which we previously evaluated in the first phase of the pandemic. We re-evaluated some of these assays, as a proof-of-concept, to assess the impact of the evolution of the pandemic, and hence the evolution of the virus, on the inclusivity of those RT-qPCR assays. This was possible as for these RT-qPCRs, their full sequences have been described. However, COVID-19 diagnosis at large scale is usually done in many laboratories with commercial diagnostic kits. Considering the results obtained in our study for some publicly available assays, it would not be surprising that SARS-CoV-2 mutations can also impact the performance of commercial assays, as suggested by some data in the scientific literature. For instance, the presence of the SARS-CoV-2 mutations S:∆69-70 deletion (specific to the 20I/501Y.V1 variant) and E:C26340T was demonstrated to be strongly associated with detection failure of the TaqPath™ COVID-19 kit (Thermo Fisher) [30] and Cobas SARS-CoV-2 assay (Roche) [61] for one of their corresponding targets (S and E), respectively. Therefore, it is highly suspected that these mutations are responsible for mismatches between primers and probes of these kits, resulting in FN results. Unfortunately, these kinds of assumptions are difficult to demonstrate. Indeed, unlike the publicly available methods that are well described in the scientific literature, commercial kits are usually black boxes with only the targeted SARS-CoV-2 genes known, and neither the corresponding sequences of the oligonucleotides nor the exact location of their annealing sites in the viral genome are specified, even when asked to the kits' manufacturers [61] . Information on which and how many genomes were used to verify the specificity of the primers and probes during the validation process is also often very limited. 
Consequently, when there is a suspicion of FN results obtained with a commercial kit because of mutations in the SARS-CoV-2 genome, this cannot be completely verified, even though this could be meaningful to avoid further inaccurate diagnoses. Additionally, the lack of communication on the primers and probe sequences included in the kits makes the specificity assessment, such as performed with SCREENED in the present study, of these commercial methods by external and independent laboratories nearly impossible. It can also not be properly assessed whether a failure in the test might be the result of a modified inclusivity of those primers and probe sets because of the evolving virus or because of possible unwanted mismatches in the primers and probes introduced during the synthesis process. Although the latter is less likely, given the quality control systems in place at the commercial vendors, the inclusion of a positive control for each RT-qPCR assay will be required to elucidate this. Next to this, standardized and reference materials are not always available for the kit's manufacturers to properly evaluate the specificity of their products in the wet-lab. Fortunately, the S-drop out of the TaqPath ™ COVID-19 kit could be used as a proxy for the 20I/501Y.V1 variant detection, as still two other SARS-CoV-2 specific genes were detected by the kit. Nevertheless, considering the situation highlighted above, it would be recommended that commercial companies, if not publicly disclosing their primer and probe sequences, (continue to) collaborate more intensively with public health organizations (such as the WHO) to regularly evaluate if their assays are impacted by SARS-CoV-2 mutations. 
A central repository, containing the sequences of all the commercial and publicly available primers and probe sets used for SARS-CoV-2 detection, could be created and made available to a central RT-qPCR evaluation team composed of a panel of scientific experts with testing expertise worldwide. This team would be in charge of the regular evaluation of the sets, including the monitoring of mutations of concern in the oligonucleotide sequences. It could then be communicated to the clinical laboratories and public health community if some kits were or should be adapted to take the virus evolution into account for accurate detection.

In conclusion, the data presented in this study show the importance of regularly assessing the impact of SARS-CoV-2 evolution on the performance of RT-qPCR assays widely used as the gold-standard method for COVID-19 diagnosis, especially in the context of new emerging variants accumulating high numbers of mutations. This can easily be done in a first step by identifying potentially impacting mismatches using bioinformatics tools, such as SCREENED or others, and WGS data being deposited on a daily basis in publicly available repositories. Of course, this in silico approach does not take into account all the other in vitro parameters that can affect PCR-like reactions. However, this preliminary in silico analysis is valuable to know specifically what should be tested in the laboratory, in a second step, to confirm experimentally the effect of these mismatches on RT-qPCR performance. Nevertheless, only the RT-qPCR assays fully described in the scientific literature can be evaluated in this manner and not the commercial kits commonly used for COVID-19 testing at large scale because these remain black boxes. This situation is unfortunate, now more than ever, as so-called "third waves" are threatening several countries in Europe [62]. This makes the correct detection of all circulating SARS-CoV-2 strains, including their emerging variants with eventually more dedicated VOC detection assays, crucial to limit their spread in the population.

Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://www.ncbi.nlm.nih.gov/labs/virus/ (accessed on 7 January 2021; accession numbers available in Supplementary Materials File S1) and https://www.epicov.org (accessed on 7 January 2021; accession numbers available in Supplementary Materials File S2). The data presented in this study are available within the text and in supplementary materials. The detailed SCREENED output data are available upon request.

An Interactive Web-Based Dashboard to Track COVID-19 in Real Time

When Health Professionals Look Death in the Eye: The Mental Health of Professionals Who Deal Daily with the 2019 Coronavirus Outbreak

Supporting the Health Care Workforce During the COVID-19 Global Epidemic

Respiratory Support in COVID-19 Patients, with a Focus on Resource-Limited Settings

The Italian Health System and the COVID-19 Challenge

The SARS-CoV-2 and Mental Health: From Biological Mechanisms to Social Consequences

Coronavirus Disease 2019 (COVID-19) Pandemic and Economic Impact

The Social, Economic and Sanitary Impact of COVID-19 Pandemic

The Main Molecular and Serological Methods for Diagnosing COVID-19: An Overview Based on the Literature

Diagnostic Testing for Severe Acute Respiratory Syndrome-Related Coronavirus 2

Weathering COVID-19 Storm: Successful Control Measures of Five Asian Countries

Asia's COVID-19 Lessons for the West: Public Goods, Privacy, and Social Tagging

Detection Technologies and Recent Developments in the Diagnosis of COVID-19 Infection

WHO Molecular Assays to Diagnose COVID-19: Summary Table of Available Protocols

Laboratory Diagnosis of Emerging Human Coronavirus Infections-the State of the Art

Potential Preanalytical and Analytical Vulnerabilities in the Laboratory Diagnosis of Coronavirus Disease 2019 (COVID-19)

Molecular Diagnostic Technologies for COVID-19: Limitations and Challenges

Effects of Primer-Template Mismatches on the Polymerase Chain Reaction: Human Immunodeficiency Virus type 1 Model Studies

Single-Nucleotide Polymorphisms and Other Mismatches Reduce Performance of Quantitative PCR Assays

The Effects of Internal Primer-Template Mismatches on RT-PCR: HIV-1 Model Studies

Sequence Variation in Primer Targets Affects the Accuracy of Viral Quantitative PCR

Temporal Signal and the Phylodynamic Threshold of SARS-CoV-2

The Phylogenetic Relationship Within SARS-CoV-2s: An Expanding Basal Clade

Evolution Patterns of SARS-CoV-2: Snapshot on its Genome Variants

Insights into SARS-CoV-2 Evolution, Potential Antivirals, and Vaccines

Impact of Virus Genetic Variability and Host Immunity for the Success of COVID-19 Vaccines

Next-Generation Sequencing: An Eye-Opener for the Surveillance of Antiviral Resistance in Influenza

SARS-CoV-2 Evolution During Treatment of Chronic Infection

Emergence of Y453F and ∆69-70HV Mutations in a Lymphoma Patient with Long-Term COVID-19

Investigation of novel SARS-CoV-2 variant: Variant of Concern 202012/01

Investigation of novel SARS-CoV-2 variant: Variant of Concern 202012/01

Emergence and Rapid Spread of a New Severe Acute Respiratory Syndrome-Related Coronavirus 2 (SARS-CoV-2) Lineage with Multiple Spike Mutations in South Africa

Phylogenetic Relationship of SARS-CoV-2 Sequences from Amazonas with Emerging Brazilian Variants Harboring Mutations E484K and N501Y in the Spike Protein

SARS-CoV-2 Reinfection by the New Variant of Concern (VOC) P.1 in Amazonas, Brazil

Much More Than Just a Receptor for SARS-CoV-2

The outbreak of the Novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): A Review of the Current Global Status

Molecular Mechanism of the N501Y Mutation for Enhanced Binding between SARS-CoV-2's Spike Protein and Human ACE2 Receptor

Viral Targets for Vaccines Against COVID-19

Comprehensive Mapping of Mutations in the SARS-CoV-2 Receptor-Binding Domain that Affect Recognition by Polyclonal Human Plasma Antibodies

Structural Basis of Fitness of Emerging SARS-COV-2 Variants and Considerations for Screening, Testing and Surveillance Strategy to Contain their Threat

Summary of the Available Molecular Methods for Detection of SARS-CoV-2 During the Ongoing Pandemic

Analysis of the Potential Impact of Genomic Variants in Global SARS-CoV-2 Genomes on Molecular Diagnostic Assays

Will the emergent SARS-CoV2 B.1.1.7 Lineage Affect Molecular Diagnosis of COVID19?

Application of Whole Genome Data for in Silico Evaluation of Primers and Probes Routinely Employed for the Detection of Viral Species by RT-qPCR Using Dengue Virus as a Case Study

Use of Whole Genome Sequencing Data for a First in Silico Specificity Evaluation of the RT-qPCR Assays Used for SARS-CoV-2 Detection

A dynamic Nomenclature Proposal for SARS-CoV-2 Lineages to Assist Genomic Epidemiology

CD-HIT: Accelerated for Clustering the Next-Generation Sequencing Data

Cd-hit: A Fast Program for Clustering and Comparing Large sets of Protein or Nucleotide Sequences

Improved Molecular Diagnosis of COVID-19 by the Novel, Highly Sensitive and Specific COVID-19-RdRp/Hel Real-Time Reverse Transcription-PCR Assay Validated In Vitro and with Clinical Specimens

Genomic Characterisation and Epidemiology of 2019 Novel Coronavirus: Implications for Virus Origins and Receptor Binding

Société française microbiologie (SFM) Rapport tests moléculaires reçus par la SFM

Evaluation of the RealStar® SARS-CoV-2 RT-PCR kit RUO performances and limit of detection

Comparative performance of SARS-CoV-2 Detection Assays Using Seven Different Primer/Probe Sets and one Assay Kit

Comparison of SARS-CoV-2 Detection from Nasopharyngeal Swab Samples by the Roche Cobas 6800 SARS-CoV-2 Test and a Laboratory-Developed Real-Time RT-PCR test

Novel Rapid Identification of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) by Real-Time RT-PCR Using BD Max Open System in Taiwan

Comparison of Seven Commercial RT-PCR Diagnostic Kits for COVID-19

Rapid and Sensitive Detection of SARS-CoV-2 RNA Using the Simplexa™ COVID-19 Direct Assay

Comparison of Commercial Realtime Reverse Transcription PCR Assays for the Detection of SARS-CoV-2

Future Scenarios for the COVID-19 Pandemic

A Recurrent Mutation at Position 26340 of SARS-CoV-2 Is Associated with Failure of the E Gene Quantitative Reverse Transcription-PCR Utilized in a Commercial Dual-Target Diagnostic Assay

Envoy Fears Third Wave, Calls Europe Response "Incomplete"

The authors gratefully acknowledge the authors from the originating laboratories responsible for obtaining the specimens; the submitting laboratories where the genetic sequence data were generated and shared via NCBI; and the GISAID Initiative (Supplementary Material File S9), on which this research is based.

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.