key: cord-1005092-e5jh62ac
authors: Fares, Wasfi; Ghedira, Kais; Gdoura, Mariem; Chouikha, Anissa; Haddad-Boubaker, Sondes; Khedhiri, Marwa; Ayouni, Kaouthar; Lamari, Asma; Touzi, Henda; Hammemi, Walid; Medeb, Zina; Sadraoui, Amel; Hogga, Nahed; ben Alaya, Nissaf; Triki, Henda
title: Sequencing using a two-steps strategy reveals high genetic diversity in the S gene of SARS-CoV-2 after a high transmission period in Tunis, Tunisia
date: 2021-06-19
journal: bioRxiv
DOI: 10.1101/2021.06.18.449083
sha: 8b485906e1534339c665e9167f9d736a7e1aaf86
doc_id: 1005092
cord_uid: e5jh62ac

Recent efforts have reported numerous variants that influence SARS-CoV-2 viral characteristics, including pathogenicity, transmission rate and detectability by molecular tests. Whole genome sequencing based on NGS technologies is the method of choice to identify all viral variants; however, the resources needed to apply these techniques to a representative number of specimens remain limited in many low and middle income countries. To decrease sequencing cost, we designed a primer pair that generates a partial sequence of the viral S gene, allowing rapid detection of numerous variants of concern (VOCs) and variants of interest (VOIs); whole genome sequencing is then performed on a selection of viruses chosen from the partial sequencing results. Two hundred and one nasopharyngeal specimens collected during the decreasing phase of a high transmission COVID-19 wave in Tunisia were analyzed. The results reveal high genetic variability within the sequenced fragment and allowed detection of the first introductions into the country of already known VOCs and VOIs, as well as other variants that carry notable genomic mutations and need to be kept under surveillance.

Importance: The method of choice for detecting SARS-CoV-2 variants is whole genome sequencing using NGS technologies.
Resources for this technology remain limited in many low and middle income countries, where it is not possible to perform whole genome sequencing on a representative number of SARS-CoV-2 positive cases. In the present work, we developed a novel strategy whose first step is partial Sanger sequencing of an S gene region that includes key mutations of the already known VOCs and VOIs. This screening rapidly identifies these VOCs and VOIs and helps to better select the specimens that need to be sequenced by NGS technologies. The second step, whole genome sequencing, gave a holistic view of all variants within the selected viral strains and confirmed the initial classification of the strains based on partial S gene sequencing.
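The two-step logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker sets are simplified, illustrative subsets of well-known spike substitutions, the assumption that they all fall inside the sequenced fragment is ours, and the names `MARKERS`, `screen` and `needs_wgs` as well as the selection rule are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Set


class Substitution(NamedTuple):
    ref: str  # reference amino acid, e.g. "D"
    pos: int  # residue number in the S protein, e.g. 614
    alt: str  # observed amino acid, e.g. "G"


_SUB_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D614G' into (ref, pos, alt)."""
    m = _SUB_RE.match(label.strip().upper())
    if m is None:
        raise ValueError(f"not an amino-acid substitution label: {label!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))


# ASSUMPTION: illustrative marker subsets, not the panel used in the paper.
MARKERS = {
    "B.1.1.7-like": {"N501Y", "A570D", "P681H"},
    "B.1.351-like": {"K417N", "E484K", "N501Y"},
    "P.1-like": {"K417T", "E484K", "N501Y"},
}


def _normalize(labels: Iterable[str]) -> Set[str]:
    """Validate every label and return it in canonical upper-case form."""
    out = set()
    for label in labels:
        s = parse_substitution(label)
        out.add(f"{s.ref}{s.pos}{s.alt}")
    return out


def screen(observed: Iterable[str]) -> List[str]:
    """Step 1: name every marker set fully contained in the observed substitutions."""
    obs = _normalize(observed)
    return sorted(name for name, marks in MARKERS.items() if marks <= obs)


def needs_wgs(observed: Iterable[str], background: Iterable[str] = ("D614G",)) -> bool:
    """Step 2 selection (hypothetical rule): send a specimen to whole genome
    sequencing if it matches a marker set or carries any substitution that is
    not part of the expected background."""
    obs = _normalize(observed)
    return bool(screen(obs)) or bool(obs - _normalize(background))


if __name__ == "__main__":
    sample = ["D614G", "N501Y", "A570D", "P681H"]
    print(screen(sample), needs_wgs(sample))
```

In practice the substitutions would come from aligning each Sanger read to the reference S gene; that alignment step is deliberately left out of the sketch.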
The S protein is composed of two subunits: S1, which contains the receptor-binding domain (RBD), and S2, which mediates membrane fusion (7). The S protein determines SARS-CoV-2 infectivity and transmissibility and is also the major antigen inducing a protective immune response (8). Since the beginning of the COVID-19 pandemic, the S protein has accumulated several mutations, and it is highly important to follow the emergence of these variants and their biological, epidemiological and clinical significance. Early in the pandemic, variants of SARS-CoV-2 containing a D to G substitution at amino-acid residue 614 of the S protein (D614G) were reported. This substitution increased receptor binding avidity, and D614G mutants became dominant in many geographic regions (9) (10) (11).

We have also detected one sequence (SP062 - Sub-Cluster 1a) with a mutational

References:
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
Natural deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape. bioRxiv
S gene dropout patterns in SARS-CoV-2 tests suggest spread of the H69del/V70del mutation in the US. medRxiv
Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants. Elife
Bloom JD. Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies. Cell Host Microbe
Evolutionary and structural analyses of SARS-CoV-2 D614G spike protein mutation now documented worldwide. Sci Rep
Two-step strategy for the identification of SARS-CoV-2 variant of concern 202012/01 and other variants with spike deletion
Early introductions and transmission of SARS-CoV-2 variant B.1.1.7 in the United States
Investigation of an outbreak of symptomatic SARS-CoV-2 VOC 202012/01-lineage B.1.1.7 infection in healthcare workers. Epub ahead of print
Variant Derived from Clade 19B, France. Emerg Infect Dis