key: cord-0980997-am0yjt2i
authors: Lopez, E.; Barthelemy, M.; Baronti, C.; Masse, S.; Falchi, A.; Durbesson, F.; Vincentelli, R.; de Lamballerie, X.; Charrel, R. N.; Coutard, B.
title: Endonuclease-based genotyping of the RBM, a first-line method for the surveillance of emergence or evolution of SARS-CoV-2 Variants.
date: 2021-06-21
journal: nan
DOI: 10.1101/2021.06.16.21259035
sha: 132b6508952176fa0df4aa66ad6ffcc6ce4808f6
doc_id: 980997
cord_uid: am0yjt2i

Since the beginning of the COVID-19 pandemic, variants have emerged. Whereas most of them have little or no selective advantage, some display increased transmissibility and/or resistance to the immune response. To date, most of the mutations involved in functional adaptation are found in the Receptor Binding Module (RBM), close to the interface with the human receptor ACE2. In this study, we thus developed and validated a fast and simple molecular assay allowing the detection and partial identification of mutations in the RBM coding sequence. After amplification of the region of interest, the amplicon is heat-denatured and hybridized with a reference amplicon. If a mutation is present, the resulting heteroduplex is cleaved by a mismatch-specific endonuclease and the cleavage pattern is analysed by capillary electrophoresis. The approach was first validated on viral RNA purified from different SARS-CoV-2 variants produced in the lab before being implemented for clinical samples. The results highlighted the performance of the assay for the detection of mutations in the RBM from clinical samples. The procedure can easily be set up for high-throughput identification of the presence of mutations and serve as a first-line screen to select samples for full genome sequencing.

In December 2019, individuals with pneumonia of unknown aetiology were recorded in the city of Wuhan, China.
The number of cases increased steadily in the following weeks, including in the countries surrounding China (Taiwan, Thailand, Malaysia, etc.), and then throughout the world via air travel, prompting the WHO to declare a Public Health Emergency of International Concern (PHEIC) on 30 January 2020. Shortly after, the disease was named "COVID-19", for "Coronavirus disease 2019". By April 2021, more than 176 million people had contracted the virus and 3.8 million had died.

The coronavirus in question, SARS-CoV-2, is an enveloped, positive-sense single-stranded RNA virus. The viral particle exposes at its surface the envelope glycoprotein Spike (S). The S protein is a multi-domain protein (Figure 1A) involved in host cell recognition, in particular via its receptor binding domain (RBD), which specifically binds the human angiotensin-converting enzyme 2 (hACE2). The S protein is considered a major determinant of viral infectivity and antigenicity [1] [2], and mutations in the coding sequence of the S protein are likely to affect the biology of SARS-CoV-2.

Since the emergence of SARS-CoV-2, several non-synonymous mutations with biological implications have been reported in the coding sequence of the S protein. For instance, the first one led to D614G, a substitution contributing to the enhancement of viral loads in the upper respiratory tract with possible increased transmission [3]. Later, several variants designated as Variants of Concern (VoCs) emerged and are now disseminating. Those variants may demonstrate increased transmissibility or disease severity, reduced seroneutralization by antibodies induced by previous infection or vaccination, or resistance to therapeutic treatments. Among the VoCs, the first one (501Y.V1, or B.1.1.7 lineage) was identified in the UK and showed enhanced human-to-human transmission and increased disease severity [4] [5].
Then, variants of the B.1.351 (501Y.V2), P.1 (501Y.V3) and B.1.617 lineages were isolated and characterized in South Africa, Brazil/Japan and India, respectively. Both the B.1.351 (501Y.V2) and P.1 (501Y.V3) variants show increased resistance to antibody neutralization [5]-[7]. VoCs share at least one non-synonymous mutation in the spike receptor binding motif (RBM). The RBM is the sub-domain of the RBD containing most of the hACE2-contacting residues and is also characterized by the presence of epitopes for neutralising antibodies (Figure 1B). Some of the characterized mutations are N501Y for 501Y.V1, and E484K/N501Y for 501Y.V2 and 501Y.V3. By themselves, these mutations can affect the binding of the S protein to hACE2 and/or the potency of neutralizing antibodies [8]-[12]. The genetic evolution of this region is specifically scrutinized to identify possible new VoCs. The rapid detection of VoCs is thus pivotal for mitigating transmission in hospital settings and for adjusting therapies to avoid reduced efficacy.

The detection of VoCs and the surveillance of the evolution of the SARS-CoV-2 population currently rely on two approaches: variant-specific real-time RT-PCR, which searches for mutations at given positions, and massive campaigns of Next-Generation Sequencing (NGS), whose data can contribute to public-health decision making [13]. On the one hand, real-time RT-PCR is fast and can be run on site, but it detects only known mutations and does not address newly emerging ones. On the other hand, NGS can detect any mutation along the genome, but the results are obtained in days rather than hours, delaying information required for medical decisions that depend on sequence identification. In addition, not all biological samples can be sequenced, and upstream screening is mandatory to select the most relevant biological samples to characterize.
Here we present the proof of concept for an alternative method allowing the surveillance of the genetic drift of SARS-CoV-2 in the RBM region, where mutations are likely to affect the dissemination, pathogenicity or antibody resistance of the virus. The technique, which relies on the amplification of the RBM coding sequence followed by an assay using a mismatch-specific endonuclease, has been validated on biological samples, demonstrating its feasibility.

The target in the SARS-CoV-2 Spike coding sequence ranges from nt position 1273 to 1587, including the primers, and is 315 nucleotides long. The corresponding amino acid sequence (amino acids 425 to 529) encompasses the RBM module (Figure 1).

VeroE6/TMPRSS2+ (CFAR#100978) cells were grown at 37°C with 5% CO2 in minimal essential medium (MEM) (Life Technologies) with 7.5% heat-inactivated foetal calf serum (FCS; Life Technologies) and 1% penicillin/streptomycin (PS, 5000 U.mL−1 and 5000 µg.mL−1 respectively; Life Technologies), supplemented with 1% non-essential amino acids (Life Technologies) and L-Glutamine (Life Technologies). SARS-CoV-2 strain BavPat1 was obtained from Pr. C. Drosten through EVA GLOBAL (https://www.european-virus- [...]

[...] (Invitrogen)) using 3.5 µL of RNA and 6.5 µL of RT-qPCR mix and standard fast cycling parameters, i.e., 10 min at 50 °C, 2 min at 95 °C, and 40 amplification cycles (95 °C for 3 s followed by 30 s at 60 °C) [14]. Quantification was provided by four 2-log serial dilutions of an appropriate T7-generated synthetic RNA standard of known quantities (10^2 to 10^8 copies). To determine the linearity of the designed system, real-time RT-PCR was performed with [...]

The SARS-CoV-2 BavPat1 strain was selected as the reference for this study.
A large-scale production of amplicons was carried out for this strain, following the RT-PCR protocol defined above.

Production of the PCR products from biological samples

Each sample used in this study was amplified following the RT-PCR protocol described above.

The presence of a mutation in the amplified target was detected using the Surveyor® Mutation Detection Kit (#706021, Integrated DNA Technologies). Two PCR products (a reference product and a sample to be analysed) were mixed in a 10 µL final volume and the mismatch-specific endonuclease cleavage was performed following the manufacturer's recommendations. The mixture was then loaded on a capillary electrophoresis system (Fragment Analyzer 5200, Agilent or GXII, Perkin Elmer) for analysis of the cleavage profile.

When needed to confirm the presence of mutations, the PCR products were sequenced using the Sanger method with the forward and reverse primers used for the RT-PCR (Genewiz).

Within the RBD region of the Spike protein, the RBM contains amino acid (AA) positions subject to mutations (e.g. 452, 484 and 501) that have functional relevance and are observed in VoCs (Figure 1A). Its coding sequence was thus well suited for a genotyping assay. A further prerequisite was that the targeted region be centred on the positions of interest and short enough to exclude the identification of mutations with little or no functional effect, i.e. synonymous mutations or non-synonymous mutations without functional consequence.
The size of the PCR product was thus set to 315 nt, allowing the detection of mutations in the region spanning AA positions 431 to 524 of the S protein.

To set up the assay, we first evaluated the sensitivity of the RT-PCR system in the range of 1 to 10^6 RNA copies/µL. The evaluation was done either by real-time RT-PCR with SYBR green (Figure 2A, 2B), or by end-point analysis of the PCR products by agarose gel electrophoresis (Figure 2C). The amplification is linear from 10 to 10^5 copies of RNA/µL, with a correlation coefficient R^2 = 0.9986 (Figure 2A), and has a limit of detection on agarose gel of 10 copies/µL (Figure 2C). However, given the quantity of material needed for the nuclease assay (>25 ng/µL), we arbitrarily set the threshold at 10^3-10^4 copies/µL, which corresponds to samples with Ct values between 28 and 30 in reference detection systems [17]. It should be noted that SYBR green inhibits the endonuclease used for the detection of mismatches, likely by altering the structure of the DNA helix, thus rendering the resulting PCR product unsuitable for subsequent capillary electrophoresis analysis (data not shown).

To establish the proof of concept, the first experiments were conducted using viral RNA derived from cell cultures infected with three well-characterized variants: SARS-CoV-2 BavPat1, 501Y.V1 and 501Y.V2. Relative to BavPat1, (i) 501Y.V1 carries the N501Y mutation (nt A1501T) and (ii) 501Y.V2 carries the E484K and N501Y mutations (nt G1450A and A1501T, respectively). The theoretical cleavage profiles for pairwise combinations of the three variants are presented in Table 1.
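The theoretical profiles can be approximated directly from the coordinates given in the text. The following Python sketch is illustrative only and is not the authors' procedure: the function names are hypothetical, and it assumes complete digestion with a cut placed immediately 3' of each mismatched nucleotide. It also ignores the uncut homoduplexes that re-form during hybridization, which is why a full-length peak remains visible in practice.

```python
# Illustrative sketch: predicted heteroduplex cleavage fragments.
# Assumptions (not from the paper): complete digestion, and a cut placed
# immediately 3' of each mismatched nucleotide.

AMPLICON_START = 1273  # first nt of the target in the Spike coding sequence (from the text)
AMPLICON_END = 1587    # last nt of the target (from the text)

# Nucleotide substitutions relative to BavPat1, as stated in the text.
VARIANT_MUTATIONS = {
    "BavPat1": set(),
    "501Y.V1": {1501},        # A1501T -> N501Y
    "501Y.V2": {1450, 1501},  # G1450A -> E484K, A1501T -> N501Y
}


def codon_number(nt_position: int) -> int:
    """Map a 1-based nt position in the coding sequence to its 1-based codon."""
    return (nt_position - 1) // 3 + 1


def mismatch_positions(variant_a: str, variant_b: str) -> list[int]:
    """Positions where the two amplicons differ (symmetric difference)."""
    return sorted(VARIANT_MUTATIONS[variant_a] ^ VARIANT_MUTATIONS[variant_b])


def predicted_fragments(variant_a: str, variant_b: str) -> list[int]:
    """Fragment lengths (nt), 5' to 3', after complete cleavage of the heteroduplex."""
    fragments = []
    left = AMPLICON_START
    for position in mismatch_positions(variant_a, variant_b):
        fragments.append(position - left + 1)  # fragment ends at the mismatch
        left = position + 1
    fragments.append(AMPLICON_END - left + 1)  # remaining 3' fragment
    return fragments


if __name__ == "__main__":
    for pair in [("BavPat1", "BavPat1"), ("BavPat1", "501Y.V1"),
                 ("BavPat1", "501Y.V2"), ("501Y.V1", "501Y.V2")]:
        print(pair, predicted_fragments(*pair))
```

Under these assumptions the BavPat1/501Y.V1 pair yields fragments of 229 and 86 nt. Sizes measured by capillary electrophoresis need not match such nominal lengths exactly, so a sketch like this is best used to reason about the number and relative order of the expected peaks.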
In practice, PCR products were mixed in pairs in approximately equimolar quantities before denaturation/hybridization and the mismatch-specific endonuclease assay. The results are presented in Figure 2D. The electropherogram (blue curve) corresponding to the hybridization of the BavPat1 amplicon with itself, with no mismatch expected, showed a single 330-nt fragment. When the BavPat1 amplicon was mixed with that of 501Y.V1 (Figure 2D, orange curve), three DNA fragments were observed at 330, 232 and 93 bp, close to the anticipated profile (Table 1).

To further assess the mutation detection method, 8 samples were tested after they were diagnosed as SARS-CoV-2 RNA positive using the routine diagnostic assay TaqPath™ real-time RT-PCR (ThermoFisher), known to discriminate the 501Y.V1 variant based on the deletion observed in the S coding sequence, which leads to the amplification of only two of the three targets of the test [18]. To keep the analysis easy and rapid, we recommend testing each sample against itself and against the selected reference with no prior quantification. In our case, the reference was the European lineage of SARS-CoV-2 (BavPat1 strain) since it was [...]

For the other samples (Figure 3, samples 1, 3, 4 and 7), described as non-501Y.V1 according to the TaqPath assay, no cleavage was expected when compared to the BavPat1 reference amplicon, and two cleaved products were expected when compared to the 501Y.V1 amplicon. However, at least one additional fragment, of about 135 bp, was observed against both reference samples (Figure 3, lanes 2, 3, 8, 9, 11, 12, 20 and 21). The PCR products were submitted to Sanger sequencing and a non-synonymous mutation yielding the S477N substitution was detected, in agreement with the size of the cleaved products.
This mutation had already been reported in viral populations circulating in Europe (https://www.gisaid.org/), and the corresponding variant has been shown to slightly increase the receptor binding domain's affinity for hACE2 [19].

We next blindly evaluated the assay on 92 SARS-CoV-2 positive samples for which Ct values were below 28 and NGS data were available. The sensitivity, defined as the ability to generate an amplicon with yields compatible with the nuclease assay, was 97.83%, as 2 samples out of 92 were not properly amplified. The specificity was defined as the ability, in the RT-PCR-positive samples, to detect a mutation compared to the BavPat1 strain or to detect the absence of mutation for sequences identical to the reference. Compared to the sequence data, 2 samples out of 90 were incorrectly identified, giving a specificity of 97.78%. Out of the 88 remaining profiles, 82 had the same profile as in Figure 2D. Previously unobserved patterns were also detected and confirmed by sequence analysis, with the E471D mutation detected in two samples, E484K in one sample, L452R in one sample and both L452R and N501Y in 2 others.

Altogether, these results demonstrate that the mismatch-specific nuclease assay coupled to capillary electrophoresis is suitable for the detection of already reported mutations in clinical samples. The identification of atypical profiles, confirmed in this study by sequencing, also demonstrates that it is relevant for the discovery of previously unreported variants. With the incremental characterization of variants, the set of reference cleavage patterns can be updated to adapt the assay to circulating strains.
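The percentages reported for the blind evaluation follow directly from the sample counts. As a minimal check, using the definitions of sensitivity and specificity given in the text, the arithmetic can be reproduced as below; the variable and function names are illustrative and not taken from the study.

```python
# Minimal arithmetic check of the blind-evaluation figures.
# Counts come from the text; names are illustrative.

def percentage(successes: int, total: int) -> float:
    """Return successes/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * successes / total, 2)


total_samples = 92   # SARS-CoV-2 positive samples with Ct < 28 and NGS data
not_amplified = 2    # samples without sufficient amplicon yield
misidentified = 2    # profiles disagreeing with the sequence data

amplified = total_samples - not_amplified    # samples entering the nuclease assay
concordant = amplified - misidentified       # profiles agreeing with sequencing

sensitivity = percentage(amplified, total_samples)  # as defined in the text
specificity = percentage(concordant, amplified)     # as defined in the text

# Breakdown of the concordant profiles reported in the text.
profile_counts = {
    "same profile as Figure 2D": 82,
    "E471D": 2,
    "E484K": 1,
    "L452R": 1,
    "L452R + N501Y": 2,
}

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity}%")
    print(f"specificity = {specificity}%")
    print("profiles accounted for:", sum(profile_counts.values()), "of", concordant)
```

The breakdown of profiles sums to the 88 concordant samples, so the reported counts are internally consistent.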
As this technique can be scaled to 96- or 384-well formats and requires less than 4 hours from extracted RNA to individual results, it can be used to filter clinical samples and identify those for which virus isolation and complete genome sequencing are justified for surveillance purposes.

We developed a molecular assay dedicated to the surveillance of SARS-CoV-2 variants, specifically targeting the RBM coding sequence known to be involved in the functional adaptation of the virus. The assay is suitable for screening biological samples and identifying the presence of new or emerging mutations. The technique, based on RT-PCR amplification of the RBM coding sequence followed by a mismatch-specific nuclease assay and detection by DNA capillary electrophoresis, is applicable to samples with RNA titers yielding Ct values below 28, which produce enough amplified DNA material. The procedure is amenable to high throughput and could meet the demand for large-scale viral population analysis, as well as for individual cases such as the surveillance of virus evolution during a chronic infection by SARS-CoV-2, a situation favourable for virus adaptation [20].

Results of the mismatch-specific assay followed by capillary electrophoresis are presented under the table in a "gel-like" format. For each sample, the first lane is the auto-control (sample/sample), the second one is sample/BavPat1 and the third sample/501Y.V1.

medRxiv preprint, doi: https://doi.org/10.1101/2021.06.16.21259035; this version posted June 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity
Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Spike mutation D614G alters SARS-CoV-2 fitness
Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science, March 2021
Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7. Nature, March 2021
Increased Resistance of SARS-CoV-2 Variants B.1.351 and B.1.1.7 to Antibody Neutralization
SARS-CoV-2 RBD in vitro evolution follows contagious mutation spread, yet generates an able infection inhibitor
Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies
Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization
Antibody Resistance of SARS-CoV-2 Variants B.1.351 and B.1.1.7. bioRxiv
Structural basis of receptor recognition by SARS-CoV-2
Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands
In vitro screening of a FDA approved chemical library reveals potential inhibitors of SARS-CoV-2 replication
Two-step strategy for the identification of SARS-CoV-2 variant of concern 202012/01 and other variants with spike deletion H69-V70, France
A Sensitive, Rapid and Affordable Method to Analyze BRCA1 and BRCA2 Mutations in Breast Cancer Families
Development and Evaluation of a duo SARS-CoV-2 RT-qPCR Assay Combining Two Assays Approved by the World Health Organization Targeting the Envelope and the RNA-Dependant RNA Polymerase (RdRp) Coding Regions. Viruses
S-variant SARS-CoV-2 lineage B1.1.7 is associated with significantly higher viral loads in samples tested by ThermoFisher TaqPath RT-qPCR
Emergence and spread of a SARS-CoV-2 variant through Europe in the summer of 2020. medRxiv
SARS-CoV-2 evolution during treatment of chronic infection