key: cord-1054204-k5t5rjhe
authors: Wawina-Bokalanga, Tony; Martí-Carreras, Joan; Vanmechelen, Bert; Bloemen, Mandy; Wollants, Elke; Laenen, Lies; Cuypers, Lize; Beuselinck, Kurt; Lagrou, Katrien; André, Emmanuel; Van Ranst, Marc; Maes, Piet
title: Genetic diversity and evolution of SARS-CoV-2 in Belgium during the first wave outbreak
date: 2021-06-29
journal: bioRxiv
DOI: 10.1101/2021.06.29.450330
sha: e66048a73271d3938a5e9b79d70f84eaa98d857e
doc_id: 1054204
cord_uid: k5t5rjhe

SARS-CoV-2, the causative agent of COVID-19, was first detected in Belgium on 3rd February 2020, although the first epidemiological wave started in March and ended in June 2020. One year after the first epidemiological wave hit the country, data analyses revealed the temporal and variant distribution of SARS-CoV-2 and its relationship with Belgian epidemiological measures. In this study, 766 complete SARS-CoV-2 genomes of samples originating from the first epidemiological wave were sequenced to characterize the temporal and geographic distribution of the COVID-19 pandemic in Belgium through phylogenetic and variant analysis. Our analysis reveals the major SARS-CoV-2 clades (G, GH and GR) and lineages circulating in Belgium at that time. Moreover, it contextualizes the density of SARS-CoV-2 cases over time with non-intervention measures taken to prevent the spread of SARS-CoV-2 in Belgium, specific international case imports and the functional implications of the most representative non-synonymous mutations present in Belgium between February and June 2020.

(https://www.worldometers.info/coronavirus/), lower than SARS-CoV-1 and MERS-CoV, with 9.6% and 34.3%, respectively [5], although the true infection fatality rate is likely lower, as many asymptomatic or mildly symptomatic cases remain undiagnosed [6].
Nonetheless, there is an urgent need for effective treatments, as the global death toll has already surpassed 3.7 million confirmed deaths (WHO coronavirus dashboard, 12/06/2021), and measures taken to control the spread of the virus have significantly impacted social and economic activity worldwide.

SARS-CoV-2 is a non-segmented, positive-sense single-stranded RNA (+ssRNA) virus with a genome length of around 30,000 nucleotides. The genome is organized into 12 genes that code for structural and non-structural proteins [5]. Structural proteins are the Spike (S), Envelope (E), Membrane (M) and Nucleocapsid (N) proteins. Non-structural proteins are all generated from the ORF1ab polyprotein, including the core proteins Nsp12 (RNA synthesis), Nsp7 and Nsp8 [7]. Additionally, at least 5 accessory proteins are encoded alongside the structural proteins: ORF3a (putative apoptotic factor), ORF6 (putative IFN-1 antagonist), ORF7a (putative leukocyte modulator), ORF8 (putative immunomodulator), and ORF10 (unknown function). Recent estimates place the mutation rate of SARS-CoV-2 between 0.8 × 10⁻³ and 1.12 × 10⁻³ substitutions/site/year [8,9], equating to 2 to 2.8 substitutions per genome per month. At the time of writing, SARS-CoV-2 mutations have been organized dynamically in three large clades, based on non-synonymous substitutions [10]: (i) clade G (D614G in the Spike), (ii) clade V (G251V in ORF3a) and (iii) clade S (L84S in ORF8) [11]. The Spike mutation D614G has recently been linked to increased virus production in host cells and appears to have been the predominant mutation in European clades since March 2020 [12]. Genetic diversity analyses currently play an important role in improving our understanding of SARS-CoV-2. Complete genome sequences shared through the Global Initiative on Sharing All Influenza Data (GISAID) have been, and still are, valuable to monitor and contain the pandemic.
Additionally, this unprecedented international cooperation has allowed rapid evaluation of the viral origin and genomic diversity of SARS-CoV-2 [13] based on sequence similarity. Likewise, the availability of sufficient and diverse genome sequences to capture the variability of SARS-CoV-2 may allow estimations of which sets of antiviral drugs are most likely to be repurposed for this virus [13]. Changes in infection and mortality rates are influenced by SARS-CoV-2 genetic variation [14] and host genetics [15].

In Belgium, the first confirmed case of SARS-CoV-2 infection was reported on the 3rd of February 2020 (2020 week 5), from an asymptomatic individual who was part of a quarantined group of 10 travelers repatriated from Wuhan to Brussels [16]. During week 8 of 2020, the Belgian government began regulating travel to and from China, trying to minimize possible SARS-CoV-2 introduction events into the country. Still, borders were not closed until week 11, so several entry events occurred prior to strict border regulations. Later, the Belgian government attributed part of these introduction events to travelers returning from the Carnival break (Figure 1). In March 2020, the rapidly growing number of confirmed cases alarmed the government, which decided to take restrictive measures to reduce the spread of SARS-CoV-2 into and inside the country. By week 10, the federal government limited indoor activity nationwide by closing bars and restaurants and prohibiting sporting and cultural activities (lockdown). From week 11 to week 18, strict social distancing measures were applied, and all non-essential travel and gatherings were suspended. The strict isolation policy during this period, which included the Easter break (weeks 15-16, Figure 1), was key to a substantial reduction of cases.
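The conversion of the substitution rate quoted in the introduction (0.8 × 10⁻³ to 1.12 × 10⁻³ substitutions/site/year) into substitutions per month can be checked with a short calculation. The sketch below is illustrative only: it assumes the rounded genome length of 30,000 nucleotides given in the text and a 12-month year, and the function and constant names are hypothetical rather than part of any pipeline used in this study.

```python
# Illustrative check of the rate conversion quoted in the text.
# Assumption: genome length rounded to 30,000 nt, as stated in the introduction.
GENOME_LENGTH_NT = 30_000
MONTHS_PER_YEAR = 12


def substitutions_per_month(rate_per_site_per_year: float,
                            genome_length: int = GENOME_LENGTH_NT) -> float:
    """Convert a per-site yearly rate into whole-genome substitutions per month."""
    return rate_per_site_per_year * genome_length / MONTHS_PER_YEAR


low = substitutions_per_month(0.8e-3)    # lower published estimate
high = substitutions_per_month(1.12e-3)  # upper published estimate
print(f"{low:.1f} to {high:.1f} substitutions per genome per month")
```

Under these assumptions the result agrees with the range of 2 to 2.8 substitutions per month stated above.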
In this study, we performed phylogenetic and mutational analysis of SARS-CoV-2 genome sequences obtained during the first wave outbreak across different provinces in Belgium before the establishment of a Belgian consortium for genomic surveillance of SARS-CoV-2.

Sequencing runs were processed using the ARTIC analysis pipeline and custom scripts. Sequence metadata were collected and used to complement their respective GISAID records. SARS-CoV-2 lineages were derived using the Pangolin tool (https://cov-lineages.org/pangolin.html, github.com/cov-lineages/pangolin) [10]. Epidemiological data were obtained from Sciensano (the Belgian federal institute for health, https://epistat.wiv-isp.be/covid/). Sequences were aligned with MAFFT [18], and SNPs were collected with snp-sites [19] and annotated with VCF-annotator (https://github.com/rpetit3/vcf-annotator).

(Table 1 and Suppl. Figure 1, respectively). Overall, more than

Border control measures introduced in late March contributed greatly to reducing international imports (i.e. travelers returning from holidays). Domestic clade and lineage distributions were comparable to the general European trend, with clades G, GR and GH being the most prominent. Although the sequencing rate was relatively low during the peak of transmission of the first wave in Belgium (weeks 13-16, April 2020), it was sufficient to identify the presence of minor clades L, S and V, with S being mostly linked to direct SARS-CoV-2 import due to (i) its very low prevalence in Belgium (4 cases), (ii) its placement in a separate clade in the phylogenetic tree, as seen in Figure 3, and (iii) its direct link to travel history. Identification of 2 S clade sequences

Preparedness needs research: How fundamental science and international collaboration accelerated the response to COVID-19
MERS coronavirus: diagnostics, epidemiology and transmission
COVID-19, SARS and MERS: are they closely related?
Epidemiology, transmission dynamics and control of SARS: the 2002-2003 epidemic
A new coronavirus associated with human respiratory disease in China
Excess Deaths From COVID-19 and Other Causes
One severe acute respiratory syndrome coronavirus protein complex integrates processive RNA polymerase and exonuclease activities
Temporal signal and the phylodynamic threshold of SARS-CoV-2
The emergence of SARS-CoV-2 in Europe and North America
SARS-CoV-2 genomic surveillance in Taiwan revealed novel ORF8-deletion mutant and clade possibly associated with
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
The proximal origin of SARS-CoV-2
Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity
Initial whole-genome sequencing and analysis of the host genetic contribution to COVID-19 severity and susceptibility
A phylodynamic workflow to rapidly gain insights into the dispersal history and dynamics of SARS-CoV-2 lineages
Symptomatic SARS-CoV-2 reinfection by a phylogenetically distinct strain
MAFFT multiple sequence alignment software version 7: Improvements in performance and usability
SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
R: A Language and Environment for Statistical Computing
Welcome to the Tidyverse
IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
Maximum-likelihood phylodynamic analysis
jModelTest 2: more models, new heuristics and parallel computing
CoVariants: SARS-CoV-2 Mutations and Variants of Interest
Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region
Spike mutation D614G alters SARS-CoV-2 fitness
SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo. Science

center for respiratory pathogens, is supported by Sciensano, which is gratefully acknowledged.
The authors acknowledge all research teams that have deposited SARS-CoV-2 genome data on GISAID (www.gisaid.org).