key: cord-0301403-xipa2mtc authors: Borisova, N.I.; Kotov, I.A.; Kolesnikov, A.A.; Kaptelova, V.V.; Speranskaya, A.S.; Kondrasheva, L.Yu.; Tivanova, E.V.; Khafizov, K.F.; Akimkin, V. G. title: Monitoring the spread of SARS-CoV-2 variants in Moscow and the Moscow region using targeted high-throughput sequencing date: 2021-07-15 journal: bioRxiv DOI: 10.1101/2021.07.15.452488 sha: a430095ca906799650d13c5670978981d636e553 doc_id: 301403 cord_uid: xipa2mtc Since the outbreak of the COVID-19 pandemic caused by the SARS-CoV-2 coronavirus, the international community has been concerned about the emergence of mutations that alter the biological properties of the pathogen, for example, increasing its infectivity or virulence. In particular, since the end of 2020, several variants of concern have been identified around the world, including variants “alpha” (B.1.1.7, “British”), “beta” (B.1.351, “South African”), “gamma” (P.1, “Brazilian”) and “delta” (B.1.617.2, “Indian”). However, the existing mechanism for searching for important mutations and identifying strains may not be effective enough, since only a relatively small fraction of all identified pathogen samples can be examined for genetic changes by whole genome sequencing due to its high cost. In this study, we used the method of targeted high-throughput sequencing of the most significant regions of the gene encoding the S-glycoprotein of the SARS-CoV-2 virus, for which a primer panel was developed. Using this technique, we examined 579 random samples obtained from patients in Moscow and the Moscow region with coronavirus infection from February to June 2021. The study demonstrated the dynamics of the representation in the Moscow region of a number of SARS-CoV-2 strains and its most significant individual mutations in the period from February to June 2021. 
Strain B.1.617.2 began to spread rapidly in Moscow and the Moscow region in May and became dominant in June, partially displacing other variants of the virus. The results make it possible to assign samples accurately to the above-mentioned strains and to some others. The approach can be used to standardize the search for new and known epidemiologically significant mutations in selected regions of the SARS-CoV-2 genome; it allows a large number of samples to be studied in a short time and gives a more detailed picture of the epidemiological situation in the region.
Since its first detection in December 2019 in Wuhan (Zhou et al. 2020), the SARS-CoV-2 coronavirus has spread worldwide and has caused nearly 4 million deaths (https://coronavirus.jhu.edu/). Since the beginning of the pandemic, a number of therapeutic and preventive measures against COVID-19 have been developed, including immunological drugs such as monoclonal antibodies (Weinreich et al. 2021) and vaccines (Baden et al. 2021; Polack et al. 2020; Jones and Roy 2021; Ryzhikov et al. 2021), for which the SARS-CoV-2 S-protein usually serves as the antigen. At the end of 2020, the international scientific community described several SARS-CoV-2 variants of concern that require special attention. These now include "alpha" (formerly called "British", B.1.1.7), "beta" ("South African", B.1.351), "gamma" ("Brazilian", P.1), and "delta" ("Indian", B.1.617.2). They are also of concern because they contain the E484K mutation in the spike (S-) protein, which is likely to reduce the effectiveness of some therapeutic monoclonal antibodies, making it difficult to neutralize the virus in vitro, and which could lead to escape from immunity caused by previous infection or vaccination (Garcia-Beltran et al. 2021; Liu et al. 2021; Yuan et al. 2021; Ikegame et al. 2021; Ryzhikov et al. 2021).
In addition, three variants ("alpha", "beta" and "gamma") carry the N501Y mutation in the S-protein gene, which is associated with increased affinity for the angiotensin-converting enzyme 2 (ACE2) receptor and possibly contributes to the increased transmissibility of the virus (Nelson et al. 2021; Tian et al. 2021). Earlier, specialists from the Central Research Institute of Epidemiology of Rospotrebnadzor developed a reagent kit for rapid detection of the N501Y mutation in the virus genome by loop-mediated isothermal amplification (LAMP) (Khafizov et al. 2021), which dramatically reduced the number of samples sent for whole genome sequencing to identify and monitor new strains carrying this mutation. However, the emergence of new strains, including those characterized by other mutations in the S-protein gene, showed that substitutions at the LAMP primer annealing sites can reduce the effectiveness of the technique. In addition, as new SARS-CoV-2 variants are discovered, the list of mutations to be tracked grows. For example, the "delta" strain, which had previously caused a high incidence in India, was discovered in Russia in May, and its prevalence in the country has grown rapidly since then. Moreover, local strains have been identified in Russia, including "Siberian" (B.1.1.397+) and "North-Western" (B.1.1.370.1), which have mutations in the S-protein gene (Gladkikh et al. 2021; Klink et al. 2021). Studies of the strains circulating in Russia are ongoing (Komissarov et al. 2021). For these reasons, researchers need more efficient and versatile tools that identify a range of important mutations in a single analysis. Although whole genome sequencing is by far the most detailed method for genetic analysis of a pathogen (Long et al. 2021), it is not always the best approach from a financial point of view.
Applying whole genome sequencing is difficult when the number of infected people, including cases of reinfection, is constantly increasing; moreover, it may be impractical for rapid epidemiological surveillance, given that the most significant changes occur in a small part of the pathogen's genome. In this article, we describe the identification of isolates belonging to epidemiologically
For the study, we used nasopharyngeal swabs from patients with symptoms of COVID-19 in whom the presence of SARS-CoV-2 was confirmed by real-time reverse transcription PCR using the AmpliSens Cov-Bat-FL reagent kit (AmpliSens, Russia). The study was conducted with the informed consent of the patients. The samples were placed in a transport medium produced by the Central Research Institute of Epidemiology. RNA was isolated from clinical material using the RIBO-prep kit (AmpliSens, Russia), and reverse transcription was performed using the REVERTA-L kit (AmpliSens, Russia). Only clinical samples with a PCR cycle threshold (Ct) of 20 or less were selected. Amplification was carried out using PCR mixture-2 blue (AmpliSens, Russia) containing Taq polymerase on a T100 Thermal Cycler (Bio-Rad, USA). Next, the PCR product was purified from the reaction mixture using AMPure XP beads (Beckman Coulter, USA). Amplification Analyzer instrument (Thermo Fisher Scientific, USA).
Using the blastn program (Altschul et al. 1990), the specificity of each obtained sequence was assessed against all known organisms, primarily Homo sapiens, whose genetic material is present in the sample in the greatest amount; this excluded many nonspecific interactions between the primers and regions of the DNA of humans and other organisms. A total of 5 pairs of oligonucleotides were created, which also contain additional adapter sequences needed to reduce the time and cost of sample preparation. The primer structures are shown in Table 1.
The amplicon lengths were chosen to provide complete coverage of the target regions during high-throughput sequencing on Illumina MiSeq platforms using the MiSeq reagent kits v2 (300 cycles), v2 (500 cycles) and v3 (600 cycles), and on Illumina HiSeq using the v2 Rapid SBS Kit (500 cycles). To analyze the sequencing data, the resulting reads were aligned to the reference coronavirus genome with the bwa program (Li and Durbin 2009). BBTools (Bushnell, Rood, and Singer 2017) was used to trim adapter sequences from the reads. Genetic variants were called using the GATK package (Poplin et al. 2017). The sequences obtained in this study were uploaded to the VGARus database (https://genome.crie.ru/). VMD software (Humphrey, Dalke, and Schulten 1996) was used to visualize the S-protein molecule and create figures, based on a structural model of the S-protein (PDB ID: 7CAB) obtained by cryo-electron microscopy (Lv et al. 2020).
Table 1. Sequences of oligonucleotides in the primer panel. The specific part is separated from the adapters with a "-" symbol.
Using the primer panel described in the previous section, we sequenced 579 samples collected from patients with coronavirus infection from February to June 2021 in Moscow and the Moscow region. This was a period when SARS-CoV-2 strains of concern began to spread actively in Russia and the rest of the world, which could become one of the reasons for new waves of morbidity in a number of countries. Among isolates obtained in February, only a very small proportion (~2%) belonged to the "alpha" strain; in March its frequency increased to ~20%, which is consistent with data indicating that this strain is more contagious (Davies et al. 2021). However, it did not spread further; its share gradually decreased and by mid-June had fallen to almost zero.
This was probably caused by the appearance in Russia in May of the "delta" strain, which may earlier have caused the increased morbidity and mortality in India. By mid-June, the proportion of this strain had risen sharply to 70% in Moscow, and according to data not included in this study, it has continued to increase to more than 90%. In addition, at the end of June there were cases of the "delta-plus" strain, which carries the additional substitution K417N; this substitution was previously found in the "beta" strain and is also located in one of the regions of the SARS-CoV-2 genome amplified by the developed primer panel. Notably, the "beta" variant did not spread significantly in Moscow, although in April its share quickly rose to ~13%, causing some concern, since there is evidence that this strain partially escapes neutralization by antibodies elicited by both previous coronavirus infection and vaccination. Strain B.1.1.523, whose proportion increased significantly by April, is also worth noting. The presence of the E484K mutation indicates that, like the "beta" strain, it may be more resistant to the action of antibodies. This variant has further changes in the spike protein: the S494P substitution and a deletion of three amino acids. However, its share also fell sharply in June, when the "delta" variant became prevalent. We did not find cases of infection with the "gamma" ("Brazilian") strain. Finally, Figure 3 shows the dynamics of virus variants that carry the N501Y or E484K mutation but were not assigned to any of the above strains because other required changes in the genome were absent.
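The assignment of samples to strains from a small set of spike substitutions, and the monthly shares discussed above, can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the marker sets below are simplified assumptions chosen for illustration (only the substitutions N501Y, E484K, K417N and S494P are named in the text; L452R, T478K, K417T and A570D are taken from general knowledge of these lineages), the function names `classify` and `monthly_shares` are hypothetical, and the toy samples are invented.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified marker sets (spike substitutions only).
# These are illustrative assumptions, not the authors' assignment rules.
MARKERS = {
    "delta-plus": {"L452R", "T478K", "K417N"},
    "delta": {"L452R", "T478K"},
    "beta": {"K417N", "E484K", "N501Y"},
    "gamma": {"K417T", "E484K", "N501Y"},
    "B.1.1.523": {"E484K", "S494P"},
    "alpha": {"N501Y", "A570D"},
}


def classify(mutations):
    """Return the most specific label whose markers are all present."""
    found = set(mutations)
    # Larger marker sets are checked first, so "delta-plus" wins over "delta".
    for name, markers in sorted(MARKERS.items(), key=lambda kv: -len(kv[1])):
        if markers <= found:
            return name
    # Samples with N501Y or E484K but no full marker set stay unassigned.
    if found & {"N501Y", "E484K"}:
        return "unassigned N501Y/E484K"
    return "other"


def monthly_shares(samples):
    """Map each month to the fraction of samples carrying each label."""
    counts = defaultdict(Counter)
    for month, mutations in samples:
        counts[month][classify(mutations)] += 1
    return {
        month: {label: n / sum(c.values()) for label, n in c.items()}
        for month, c in counts.items()
    }


# Invented toy data: (collection month, substitutions called in the amplicons).
toy_samples = [
    ("2021-03", ["N501Y", "A570D"]),
    ("2021-03", ["D614G"]),
    ("2021-06", ["L452R", "T478K"]),
    ("2021-06", ["L452R", "T478K", "K417N"]),
    ("2021-06", ["E484K"]),
    ("2021-06", ["L452R", "T478K"]),
]

if __name__ == "__main__":
    for month, shares in sorted(monthly_shares(toy_samples).items()):
        print(month, shares)
```

Checking larger marker sets first keeps a sample with the delta markers plus K417N from being reported as plain "delta", and the fallback label mirrors the group of N501Y or E484K carriers that were not assigned to any named strain.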
In this methodological article, we described the possibility of detecting a number of SARS-CoV-2 strains, including variants "alpha" ("British", B.1.1.7), "beta" ("South African", B.1.351), "gamma" ("Brazilian", P.1), "delta" ("Indian", B.1.617.2), using targeted high-throughput sequencing. For this purpose, we have developed a primer panel (Table 1)
Basic Local Alignment Search Tool
Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine
BBMerge - Accurate Paired Shotgun Read Merging via Overlap
SARS-CoV-2 Neutralizing Antibody LY-CoV555 in Outpatients with Covid-19
Estimated Transmissibility and Impact of SARS-CoV-2 Lineage B.1.1.7 in England
Circulating SARS-CoV-2 Variants Escape Neutralization by Vaccine-Induced Humoral Immunity
Characterization of a Novel SARS-CoV-2 Genetic Variant with Distinct Spike Protein Mutations
VMD: Visual Molecular Dynamics
Qualitatively Distinct Modes of Sputnik V Vaccine-Neutralization Escape by SARS-CoV-2 Spike Variants
Sputnik V COVID-19 Vaccine Candidate Appears Safe and Effective
Bazykin, and The CoRGI (Coronavirus Russian Genetic Initiative) Consortium. 2021. Genomic Epidemiology of the Early Stages of the SARS-CoV-2 Outbreak in Russia
Fast and Accurate Short Read Alignment with Burrows-Wheeler Transform
501Y.V2 and 501Y.V3 Variants of SARS-CoV-2 Lose Binding to Bamlanivimab
Sequence Analysis of 20,453 Severe Acute Respiratory Syndrome Coronavirus 2 Genomes from the Houston Metropolitan Area Identifies the Emergence and Widespread Distribution of Multiple Isolates of All Major Variants of Concern
Structural Basis for Neutralization of SARS-CoV-2 and SARS-CoV by a Potent Therapeutic Antibody
Molecular Dynamic Simulation Reveals E484K Mutation Enhances Spike RBD-ACE2 Affinity and the Combination of E484K, K417N and N501Y Mutations (501Y.V2 Variant) Induces Conformational Change Greater than N501Y Mutant Alone, Potentially Resulting in an Escape Mutant
Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine
Scaling Accurate Genetic Variant Discovery to Tens of Thousands of Samples
A Single Blind, Placebo-Controlled Randomized Study of the Safety, Reactogenicity and Immunogenicity of the 'EpiVacCorona' Vaccine for the Prevention of COVID-19, in Volunteers Aged 18-60 Years (phase I-II)
Mutation N501Y in RBD of Spike Protein Strengthens the Interaction between COVID-19 and Its Receptor ACE2
E484K Mutation in SARS-CoV-2 RBD Enhances Binding Affinity with hACE2 but Reduces Interactions with Neutralizing Antibodies and Nanobodies: Binding Free Energy Calculation Studies
REGN-COV2, a Neutralizing Antibody Cocktail, in Outpatients with Covid-19
Structural and Functional Ramifications of Antigenic Drift in Recent SARS-CoV-2 Variants
A Pneumonia Outbreak Associated with a New Coronavirus of Probable Bat Origin