key: cord-0301675-q5qsdkea authors: Stadtmueller, M.; Laubner, A.; Rost, F.; Winkler, S.; Patrasova, E.; Simunkova, L.; Reinhardt, S.; Beil, J.; Dalpke, A. H.; Yi, B. title: Emergence and spread of a sub-lineage of SARS-CoV-2 Alpha variant B.1.1.7 in Europe, and with further evolution of spike mutation accumulations shared with the Beta and Gamma variants date: 2021-11-02 journal: nan DOI: 10.1101/2021.11.01.21265749 sha: 38d0eff5b6381245c2b06105a1b822dd4f87271c doc_id: 301675 cord_uid: q5qsdkea
SARS-CoV-2 evolution plays a significant role in shaping the dynamics of the COVID-19 pandemic. To monitor the evolution of SARS-CoV-2 variants, we performed, through international collaborations, weekly genomic epidemiology analyses of SARS-CoV-2 samples collected from a border region between Germany, Poland and the Czech Republic, set against a global background. For the virus variants identified, live viruses were isolated and functionally evaluated to test their replication fitness and their sensitivity to neutralization by vaccine-elicited serum antibodies. We thereby identified a new B.1.1.7 sub-lineage carrying the additional mutations nucleoprotein G204P and open-reading-frame-8 K68stop. Of note, this B.1.1.7 sub-lineage is the predominant B.1.1.7 variant in several European countries, such as the Czech Republic, Austria and Slovakia. The earliest samples belonging to this sub-lineage were detected in November 2020 in a few countries on the European continent, but not in the UK. We also detected its further evolution through the extra spike mutations D138Y and A701V, which are signature mutations of the Gamma and Beta variants, respectively.
Antibody neutralization assays of the virus isolates revealed that the variant with the extra spike mutations is 3.2-fold less sensitive to vaccine-elicited antibodies than the other B.1.1.7 variants tested, indicating potential for immune evasion, but it also exhibited reduced replication fitness. The wide spread of this B.1.1.7 sub-lineage was related to the pandemic waves in early 2021 in various European countries. These findings on the emergence, spread, evolution, infection and transmission abilities of this B.1.1.7 sub-lineage add to our understanding of how the pandemic developed in Europe, and could possibly help to prevent similar scenarios in the future.
As one of the SARS-CoV-2 variants of concern (VOC) (1), the Alpha variant B.1.1.7 was first detected in the UK in September 2020. This variant was shown to be more transmissible (2) (3) (4) than previously detected variants. In Europe, B.1.1.7
medRxiv preprint, this version posted November 2, 2021; https://doi.org/10.1101/2021.11.01.21265749. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint.
relevant information may help to understand why and how the B.1.1.7 waves could take place across Europe in the spring of 2021, thereby possibly promoting suitable strategies for preventing the spread of other variants of concern that evolve quickly.
We combined SARS-CoV-2 sequences generated from samples collected in a border region between Germany, Poland and the Czech Republic with full-length SARS-CoV-2 sequences periodically downloaded from GISAID (9) to build a genome sequence data set for monitoring emerging variants (locally generated sequences were shared on GISAID as well).
We first performed a quality check and filtered out low-quality sequences that met any of the following criteria: 1) sequences with less than 90% genome coverage; 2) genomes with too many mutations (defined as having >20 nucleotide mutations relative to the Wuhan reference), which would violate the SARS-CoV-2 molecular clock at the time of study; 3) genomes with more than ten ambiguous bases; and 4) genomes with clustered mutations, defined as mutations in close proximity to one another. These are the standard quality assessment parameters used in NextClade (https://clades.nextstrain.org). The current study was based on the 2.17 million global viral genomes available as of 30 June 2021. We used the dynamic lineage classification method in this study through the Phylogenetic Assignment of Named Global Outbreak Lineages (PANGOLIN) software suite (https://github.com/hCoV-2019/pangolin) (10). This is intended for identifying the most epidemiologically important lineages of SARS-CoV-2 at the time of analysis (11).
It is made available under a CC-BY-NC-ND 4.0 International license.
Phylogenetic analysis was carried out to infer the transmission routes of B.1.1.7 in Europe (12) with a custom build of the SARS-CoV-2 NextStrain build (https://github.com/nextstrain/ncov) (13). The pipeline includes several Python scripts that manage the analysis workflow. Briefly, it allows for the filtering of genomes, the alignment of genomes in MAFFT (14), phylogenetic tree inference in IQ-Tree (15), tree dating (16), and ancestral state reconstruction and annotation. The phylogeny is rooted on Wuhan-Hu-1/2019.
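The four exclusion criteria listed above can be illustrated with a minimal Python sketch. It is not the authors' code or the NextClade implementation: the `GenomeQC` record and its field names are hypothetical, and because the text only defines clustered mutations as "mutations in close proximity", the window size and count threshold used here are placeholder assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GenomeQC:
    """Per-genome summary statistics (hypothetical field names)."""
    name: str
    coverage: float                 # fraction of the reference covered, 0..1
    mutation_positions: List[int]   # positions of nucleotide mutations vs. reference
    ambiguous_bases: int            # number of ambiguous base calls


# Thresholds stated in the text.
MIN_COVERAGE = 0.90
MAX_MUTATIONS = 20
MAX_AMBIGUOUS = 10

# ASSUMPTION: the text does not quantify "close proximity"; these two
# values are placeholders chosen only for illustration.
CLUSTER_WINDOW = 100   # window length in nucleotides
CLUSTER_MAX = 6        # more than this many mutations in one window is a cluster


def has_mutation_cluster(positions: List[int],
                         window: int = CLUSTER_WINDOW,
                         max_in_window: int = CLUSTER_MAX) -> bool:
    """Return True if any window of `window` nt holds more than `max_in_window` mutations."""
    pos = sorted(positions)
    start = 0
    for end in range(len(pos)):
        # Shrink the window from the left until it spans fewer than `window` nt.
        while pos[end] - pos[start] >= window:
            start += 1
        if end - start + 1 > max_in_window:
            return True
    return False


def passes_qc(genome: GenomeQC) -> bool:
    """Apply the four exclusion criteria; a genome failing any one is dropped."""
    if genome.coverage < MIN_COVERAGE:
        return False
    if len(genome.mutation_positions) > MAX_MUTATIONS:
        return False
    if genome.ambiguous_bases > MAX_AMBIGUOUS:
        return False
    if has_mutation_cluster(genome.mutation_positions):
        return False
    return True
```

A data set would then be filtered with something like `kept = [g for g in genomes if passes_qc(g)]`, where `genomes` is a list of such records.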
We analyzed daily cases of SARS-CoV-2 in the Czech Republic from publicly released data provided by the Ministry of Health of the Czech Republic (https://onemocneniaktualne.mzcr.cz/covid-19), and 7-day incidence rates per 100K inhabitants were calculated based on the local population. Daily cases of SARS-CoV-2 in Poland were obtained from publicly released data provided by the Service of the Republic of Poland (https://www.gov.pl/web/koronawirus/wykaz-zarazen-koronawirusem-sars-cov-2), and 7-day incidence rates per 100K inhabitants were calculated in the same way.
All viruses used were patient isolates cultured from nasopharyngeal swabs. Virus stocks were
All sera were derived from healthy individuals fully vaccinated with BNT162b2. A 2-fold dilution series of each serum was prepared in PBS+ (supplemented with 0.3% bovine albumin,
CO2 with occasional shaking. Afterwards, the inoculum was aspirated, the cells were washed with PBS, and fresh medium (DMEM GlutaMAX supplemented with 10% FBS, 1% nonessential amino acids, 1% sodium pyruvate and 1% penicillin/streptomycin) was added. Supernatants were removed at 8, 16, 24, 48, 72 and 96 hours post infection (hpi). Infectious virus particles in the supernatant were quantified by plaque assay, which was performed analogously to the neutralization assay from the infection step onwards. Results are given as plaque forming units (PFU) per ml. Graphs were generated using GraphPad Prism 9.
Identification of one specific B.1.1.7 sub-lineage with extra mutations in Europe. We
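The 7-day incidence rate per 100K inhabitants mentioned in the epidemiological data description can be sketched as a rolling sum of daily cases scaled by population. This is a minimal illustration under the usual definition (cases over the last seven days per 100,000 inhabitants), which the text does not spell out; the function name and the handling of the first six days are assumptions.

```python
from typing import List, Optional


def seven_day_incidence(daily_cases: List[int],
                        population: int) -> List[Optional[float]]:
    """Return the 7-day incidence per 100,000 inhabitants for each day.

    The value for day i is the sum of cases on days i-6..i, scaled to
    100,000 inhabitants. Days without a full 7-day window yield None
    (an assumption; other conventions exist).
    """
    if population <= 0:
        raise ValueError("population must be positive")

    result: List[Optional[float]] = []
    window_sum = 0
    for i, cases in enumerate(daily_cases):
        window_sum += cases
        if i >= 7:
            # Drop the day that just left the 7-day window.
            window_sum -= daily_cases[i - 7]
        if i < 6:
            result.append(None)
        else:
            result.append(window_sum * 100_000 / population)
    return result
```

The same function would be applied separately to the Czech and Polish daily case series, each with its own population figure.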
(17) (18) (19), so this analysis focused on the cross-country transmission taking place in January. Phylogeny analysis was performed with B.1.1.7 samples collected in January that are available at GISAID from 10 European countries (Austria: 202; Poland: 51; France: 1524; Denmark: 1412). In a few countries, especially in the UK, the sampling density was much higher than in other countries, so we downsampled to 100 randomly selected samples collected in January from each country. This condition was chosen because similar sample sizes across countries largely prevent statistical errors in transmission route estimation. Fig. 2A shows the phylogeny-inferred cross-country transmission routes, which revealed two centers in the transmission network: the UK and the Czech Republic, indicating that the frequencies of export from these two countries were much higher than
wave occurred between December 2020 and January 2021 (Fig. 4A). For SARS-CoV-2 lineage development, between December 2020 and January 2021, the prevalence of B.1.1.7 (B.1.1.7 samples/all sequenced samples) increased from 0% in late December 2020 to ~60% in late January 2021, replacing the majority of other lineages (Fig. 4B) (Note: as mentioned in results section 3, the early growth of the B.1.1.7 sub-lineage might have been missed by genome surveillance owing to the low sampling density in December), suggesting that the major driving force for the sharp wave was the quick expansion of the B.1.1.7 variant, which was shown to be more transmissible than previously existing lineages (2) (3) (4). Although the January wave was curbed temporarily by some countermeasures, after the January peak the 7-day incidence rate remained at a high level (above 400), and reached another peak in early
5. Signature mutations. mutations of the B.1.1.7-N:G204P-ORF8:68stop sub-
Within this sub-lineage, the most common signature mutations were the same as the B.1.1.7 signature mutations, with the additional N:G204P and ORF8:68stop mutations. As described below, other novel common spike mutations were detected in a small portion of samples from this sub-lineage.
mutation accumulation in the B.1.1.7 sub-
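The per-country downsampling used for the January transmission analysis (100 randomly selected samples from each country) could be sketched as follows. This is an illustration only, not the authors' pipeline: the record layout with a `country` key, the fixed random seed, and the choice to keep all samples from countries with fewer than 100 (such as Poland, with 51) are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List


def downsample_per_country(samples: List[Dict[str, str]],
                           max_per_country: int = 100,
                           seed: int = 0) -> List[Dict[str, str]]:
    """Randomly keep at most `max_per_country` samples from each country.

    Each sample is a dict with at least a "country" key (hypothetical
    layout). Countries with fewer samples than the cap are kept whole.
    A seeded generator makes the selection reproducible.
    """
    rng = random.Random(seed)

    # Group samples by country of collection.
    by_country: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for sample in samples:
        by_country[sample["country"]].append(sample)

    selected: List[Dict[str, str]] = []
    for country in sorted(by_country):   # sorted for a stable order
        group = by_country[country]
        if len(group) > max_per_country:
            group = rng.sample(group, max_per_country)   # without replacement
        selected.extend(group)
    return selected
```

Capping every country at the same number keeps densely sequenced countries from dominating the inferred transmission network, which is the rationale the text gives for equal sample sizes.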
susceptibilities to vaccine-elicited serum neutralizing antibodies in individuals following vaccination with two doses of BNT162b2. These experiments showed an approximately 3.2-fold decrease in neutralization sensitivity for B.1.1.7_S+, which carries the two extra spike mutations D138Y and A701V, compared to the other two B.1.1.7 variants, B.1.1.7_O and B.1.1.7_S (Fig. 5B). To evaluate replication abilities of these three B.
from the German Ministry of Health to A.D. (project LüSeMut). We thank Dr. R. Weidemann and A. Zabzinski for help with this project. We thank all the collaboration partners who contributed to the project LüSeMut. The authors declare no competing interests.
All the SARS-CoV-2 genomes generated and presented in this study are publicly accessible through the GISAID platform (https://www.gisaid.org/). generated as a part of this study will be made available but may require execution of a materials transfer agreement. Data processing and visualization were performed using publicly available software, primarily RStudio v1.3.1093. Phylogenetic maximum likelihood (ML) and time trees were constructed using the SARS-CoV-2-specific procedures taken from github.com/nextstrain/ncov.
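A fold change in neutralization sensitivity, such as the approximately 3.2-fold decrease reported for B.1.1.7_S+, is commonly expressed as a ratio of geometric mean neutralization titers. The text does not state how the fold change was derived, so the sketch below is an assumption about one conventional approach, not a description of the authors' calculation; the function names are hypothetical.

```python
import math
from typing import Sequence


def geometric_mean(titers: Sequence[float]) -> float:
    """Geometric mean of positive neutralization titers."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive")
    # Average in log space, then transform back.
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def fold_reduction(reference_titers: Sequence[float],
                   variant_titers: Sequence[float]) -> float:
    """Ratio of geometric mean titers, reference over variant.

    A value above 1 means sera neutralize the variant less efficiently
    than the reference virus.
    """
    return geometric_mean(reference_titers) / geometric_mean(variant_titers)
```

Titers are averaged on a log scale because serum dilution series are multiplicative (here 2-fold steps), so an arithmetic mean would overweight the highest titers.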
(1) clusterProfiler: an R package for comparing biological themes among gene clusters
(2) Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England
(3) Changes in symptomatology, reinfection, and transmissibility associated with the SARS-CoV-2 variant B.1.1.7: an ecological study
(4) Emergence and rapid transmission of SARS-CoV-2 B.1.1.7 in the United States
(5) Genomic characteristics and clinical effect of the emergent SARS-CoV-2 B.1.1.7 lineage in London, UK: a whole-genome sequencing and hospital-based cohort study
(6) Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies
(7) Sensitivity of infectious SARS-CoV-2 B.1.1.7 and B.1.351 variants to neutralizing antibodies
(8) Spatiotemporal invasion dynamics of SARS-CoV-2 lineage B.1.1.7 emergence
(9) Global initiative on sharing all influenza data - from vision to reality
(10) A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
(11) Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool
(12) Explaining the geographic spread of emerging epidemics: a framework for comparing viral phylogenies and environmental landscape data
(13) Nextstrain: real-time tracking of pathogen evolution
(14) MAFFT: iterative refinement and additional methods
(15) IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
(16) TreeTime: Maximum-likelihood phylodynamic analysis
(17) Emergence and Spread of SARS-CoV-2 Lineages B.1.1.7 and P.1 in Italy
(18) curfew measures on SARS-CoV-2 B.1.1.7 circulation in France
(19) Tracking the international spread of SARS-CoV-2 lineages B.1.1.7 and B.1.351/501Y-V2
(20) Phylogenetic analysis of SARS-CoV-2 lineage development across the first and second waves in Eastern Germany in 2020: insights into the cause of the second wave
(21) Spread of a SARS-CoV-2 variant through Europe in the summer of 2020
(22) Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus
(23) Detection of a SARS-CoV-2 variant of concern in South Africa
(24) Increased resistance of SARS-CoV-2 variant P.1 to antibody neutralization
(25) Evidence of escape of SARS-CoV-2 variant B.1.351 from natural and vaccine-induced sera
(26) SARS-CoV-2 variants, spike mutations and immune escape
(27) Rapidly emerging SARS-CoV-2 B.1.1.7 sub-lineage in the United States of America with spike protein D178H and membrane protein V70L mutations
(28) Temporal and spatial analysis of the 2014-2015 Ebola virus outbreak in West Africa
(29) Toward a quantitative understanding of viral phylogeography
We thank all researchers who are working around the clock to generate and share genome data on GISAID (http://www.gisaid.org). We specifically thank colleagues at the Institute of Medical Microbiology and Virology, University Hospital Carl Gustav Carus, for their work in SARS-CoV-2 sample testing and in preparing samples for sequencing, and we thank the Dresden concept Genome Center for their sequencing efforts. We thank the Robert Koch Institute for data management and sharing. Parts of this study were supported by a grant