key: cord-1004708-9d7ft33p authors: Viana, R.; Moyo, S.; Amoako, D. G.; Tegally, H.; Scheepers, C.; Lessells, R. J.; Giandhari, J.; Wolter, N.; Everatt, J.; Rambaut, A.; Althaus, C.; Wilkinson, E.; Mendes, A.; Strydom, A.; Davids, M.; Mayaphi, S.; Gaseitsiwe, S.; Choga, W. T.; Maruapula, D.; Zuze, B.; Radibe, B.; Koopile, L.; Shapiro, R.; Lockman, S.; Mbulawa, M. B.; Mphoyakgosi, T.; Smith-Lawrence, P.; Mosepele, M.; Matshaba, M.; Masupu, K.; Chand, M.; Joseph, C.; Kuate-Lere, L.; Lesetedi-Mafoko, O.; Moruisi, K.; Scott, L.; Stevens, W.; Wibmer, C. K.; Mnguni, A.; Ismail, A.; Mahlangu, B.; Martin, D. P.; Hill, V.; Colquhoun, R. title: Rapid epidemic expansion of the SARS-CoV-2 Omicron variant in southern Africa date: 2021-12-21 journal: nan DOI: 10.1101/2021.12.19.21268028 sha: 620080b3e88fcaf05619f54a765e6d73dcab1bce doc_id: 1004708 cord_uid: 9d7ft33p The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic in southern Africa has been characterised by three distinct waves. The first was associated with a mix of SARS-CoV-2 lineages, whilst the second and third waves were driven by the Beta and Delta variants respectively. In November 2021, genomic surveillance teams in South Africa and Botswana detected a new SARS-CoV-2 variant associated with a rapid resurgence of infections in Gauteng Province, South Africa. Within three days of the first genome being uploaded, it was designated a variant of concern (Omicron) by the World Health Organization and, within three weeks, had been identified in 87 countries. The Omicron variant is exceptional for carrying over 30 mutations in the spike glycoprotein, predicted to influence antibody neutralization and spike function4. Here, we describe the genomic profile and early transmission dynamics of Omicron, highlighting the rapid spread in regions with high levels of population immunity. Since the onset of the COVID-19 pandemic in December 2019, variants of SARS-CoV-2 have emerged repeatedly. 
Some variants have spread worldwide and made major contributions to the cyclical infection waves that occur asynchronously in different regions. Between October and December 2020, the world witnessed the emergence of the first variants of concern (VOC). These variants exhibited increased transmissibility and/or immune evasion properties that threatened global efforts to control the pandemic. Although the Alpha, Beta and Gamma VOCs 2,5 that emerged during this time disseminated globally and drove epidemic resurgences in many different countries, it was the highly transmissible Delta variant that subsequently displaced all other VOC in most regions of the world 6 . During its spread, the Delta variant evolved into multiple sub-lineages 7 , some of which demonstrated signs of having a growth advantage in certain locations 8 , prompting speculation that the next VOC to drive a resurgence of infections would be likely derived from Delta. However, in October 2021, while Delta was continuing to exhibit high levels of transmission in the Northern hemisphere, a large Delta wave was subsiding in southern Africa. The culmination of this wave coincided with the emergence of a novel SARS-CoV-2 variant that, within days of its near-simultaneous discovery in four individuals in Botswana, a traveler from South Africa in Hong Kong, and 54 individuals in South Africa, was designated by the World Health Organization as Omicron: the fifth VOC of SARS-CoV-2. The three distinct epidemic waves of SARS-CoV-2 experienced by southern African countries were each driven by different variants: the first by descendants of the B.1 lineage 1 , the second by the Beta VOC 2,9 , and the third by the Delta VOC 3 , with an estimated 2-5% of third wave cases in South Africa attributed to the C.1.2 lineage 10 (Fig. 1A) . Serosurveys conducted before . 
CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted December 21, 2021. ; https://doi.org/10.1101/2021.12.19.21268028 doi: medRxiv preprint the Delta wave suggested high levels of exposure to SARS-CoV-2 (40-60%) in South Africa 11,12 , Malawi 13 , and Zimbabwe 14 , and modelled estimates suggested seroprevalence of 70-80% across South Africa by October 2021 15 . Accordingly, the weeks following the third wave in South Africa, between 10 October and 15 November 2021, were marked by a period of lower-level transmission as indicated by a low incidence of reported COVID-19 cases (100-200 new cases per day) and low (<2%) test positivity rates (Fig. 1A-1C) . A rapid increase in COVID-19 cases was observed in mid-November 2021 in Gauteng province, the economic hub of South Africa containing the cities of Tshwane (Pretoria) and Johannesburg. Specifically, rising case numbers and test positivity rates were first noticed in Tshwane, initially associated with outbreaks in higher education settings. 16 . Given the low prevalence of Alpha in South Africa (Fig. 1A) , targeted wholegenome sequencing of these specimens was prioritized. On 19 November 2021, sequencing results of an initial batch of 8 SGTF samples collected between 14-16 November 2021 indicated that all were a new and genetically-distinct lineage of SARS-CoV-2. Further rapid sequencing identified the same variant in 29 of 32 routine diagnostic samples from multiple locations in Gauteng Province, indicating widespread circulation of this new variant by the second week of November. Crucially, this rise immediately preceded a sharp increase in reported case numbers (Fig. 1C, Extended Data Fig. 1) . In the . 
following four days this lineage was confirmed by sequencing in another two provinces: KwaZulu-Natal (KZN) and the Western Cape (Fig. 1B). Omicron was causing a rapid and sustained increase in cases in South Africa and Botswana (Fig. 1C; Extended Data Fig. 2 for Botswana). In Gauteng, weekly test positivity rates increased from <1% in the week beginning 31 October, to 16% in the week beginning 21 November 2021, and to 35% in the week beginning 28 November, concurrently with an exponential rise in COVID-19 incidence (Fig. 1C, Extended Data Fig. 1). Nationally, daily case numbers exceeded 22 000 (84% of the peak of the previous wave of infections) by 9 December 2021. At the same time, the proportion of TaqPath PCR tests with SGTF increased rapidly in all provinces of South Africa, reaching ~90% nationally by the week beginning 21 November 2021, strongly indicating that the fourth wave was being driven by Omicron: an indication that has now been confirmed by virus genome sequencing in all provinces (Fig. 1C). Similarly, Botswana experienced a sharp increase in cases, doubling every 2-3 days from late November to early December 2021, transitioning from a 7-day moving average of <10 cases/100 000 to above 25 cases/100 000 in less than 10 days (Extended Data Fig. 2).
By 16 December 2021, Omicron had been detected in 87 countries, both in samples from travelers returning from southern Africa and in samples from routine community testing (Extended Data Fig. 3). Fig. 2A). Importantly, the BA.1/Omicron cluster is highly phylogenetically distinct from any known VOC or variants of interest (VOI) and from any other lineages known to be circulating in southern Africa (e.g. C.1.2) (Fig. 2A). More recently, two related lineages have emerged (BA.2 and BA.3), both sharing many, but not all, of the characteristic mutations of BA.1/Omicron and both having many unique mutations of their own. We primarily focus here on the BA.1 lineage, which is rapidly spreading in multiple countries around the world and is the lineage first officially designated as the Omicron VOC. Time-calibrated Bayesian phylogenetic analysis of all BA.1 assigned genomes from southern Africa (as of 11 December 2021, n=553) estimated the time of the most recent common ancestor of the analysed BA.1 lineage sequences to be 9 October 2021 (95% credible interval 30 September to 20 October), with a per-day exponential growth rate of 0.136 (95% confidence interval (CI) 0.100-0.173), reflecting a doubling time of 5.1 days (95% CI 4.0-6.9) (Fig. 2B). These estimates are robust to whether the evolutionary rate is estimated from the data or fixed to previously estimated values (Extended Data Table S1). Limiting the analysis to a
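The doubling time quoted above follows from the exponential growth rate r as ln(2)/r. The minimal Python sketch below reproduces that conversion for the reported point estimate and interval bounds; the function name `doubling_time` is ours and is not part of the original analysis. Converting interval endpoints directly is an approximation to transforming the full posterior, and because the doubling time decreases as r increases, the upper bound of the rate gives the lower bound of the doubling time.

```python
import math


def doubling_time(growth_rate_per_day: float) -> float:
    """Doubling time in days for exponential growth at a per-day rate."""
    if growth_rate_per_day <= 0:
        raise ValueError("a doubling time only exists for a positive growth rate")
    return math.log(2) / growth_rate_per_day


# Per-day growth rate and interval bounds as reported in the text.
reported_rates = [("estimate", 0.136), ("lower bound", 0.100), ("upper bound", 0.173)]

for label, rate in reported_rates:
    print(f"{label}: r = {rate:.3f}/day, doubling time = {doubling_time(rate):.1f} days")
```

To one decimal place, the three rates give the 5.1, 6.9 and 4.0 days quoted in the text.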
with an inset of Gauteng province. Circles represent nodes of the maximum clade credibility phylogeny, coloured according to their inferred time of occurrence (scale in top panel). Shaded areas represent the 80% highest posterior density interval and depict the uncertainty of the phylogeographic estimates for each node. Solid curved lines denote the links between nodes and the directionality of movement is anticlockwise along the curve. Compared to Wuhan-Hu-1, Omicron carries 15 mutations in the spike receptor-binding domain (RBD) (Fig. 3). Omicron also has a cluster of three mutations (H655Y, N679K and P681H) adjacent to the S1/S2 furin cleavage site (FCS) which are likely to enhance spike protein cleavage and fusion with host cells 32,33 and which could also contribute to enhanced transmissibility 34 (Extended Data Fig. 4).
Outside of the spike protein, a deletion in nsp6 (105-107del), in the same region as deletions seen in Alpha, Beta, Gamma and Lambda, may have a role in evasion of innate immunity 35; and the double mutation in nucleocapsid (R203K, G204R), also present in Alpha, Gamma and C.1.2, has been associated with enhanced infectivity in human lung cells 36. Botswana/R43B66 sequence was so low that we were unable to exclude the possibility that the apparent recombination signal was attributable to a combination of miscalled/uncalled nucleotides and alignment uncertainty. Although we found no convincing phylogenetic or statistical evidence of either the most recent common ancestor of BA.1 and BA.2 being recombinant, or of the most recent common ancestors of either the BA.1 or BA.2 lineages having been derived through recombination, it should be noted that recombination tests in general will not have sufficient statistical power to reliably identify evidence of individual recombination events that result in transfers of less than ~5 contiguous polymorphic nucleotide sites between genomes 37,40,41.
Further, if BA.1 and/or BA.2 are the products of a series of multiple partially overlapping recombination events occurring across multiple temporally clustered replication cycles, the complex patterns of nucleotide variation that might result could be extremely difficult to interpret as recombination using the methods applied here 42. We applied a selection analysis pipeline to all available sequences designated as BA.1 in GISAID as of 8 December 2021. The analysis followed the procedure described previously 35. In all six genes, this selection was strong (dN/dS > 10), and occurred in bursts (≤ 6% of branch/site combinations selected). The mutation, the precise impact of which is currently unknown, is one of 19 in a proposed "501Y lineage Spike meta-signature" comprising the set of mutations that were most adaptive during the evolution of the Alpha, Beta and Gamma VOC lineages 35. Further, both R346K and L452R are known to impact antibody binding 23 and both of the codon sites where these mutations occur display evidence for directional selection (using the FADE method 47). These selective patterns suggest that, during its current explosive spread, Omicron may be undergoing additional evolution to modify its neutralization profile. We estimated that Omicron had a growth advantage of 0.24 (95% CI: 0.16-0.33) per day over Delta in Gauteng, South Africa (Fig. 4A). This corresponds to a 5.4-fold (95% CI: 3.1-10.1) weekly increase in cases compared to Delta.
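The weekly fold increase can be read as exp(7ρ), where ρ is the per-day growth advantage; this reading reproduces the reported 5.4-fold figure and its interval, although the text does not state the formula. The minimal Python sketch below shows the conversion together with the logistic curve that a constant growth advantage implies for the Omicron share when only two variants compete. The function names and the two-variant simplification are our assumptions and are not the fitted multinomial model.

```python
import math


def weekly_fold_increase(growth_advantage_per_day: float, days: int = 7) -> float:
    """Relative increase of variant over wild-type cases after `days` days."""
    return math.exp(growth_advantage_per_day * days)


def variant_share(day: float, growth_advantage_per_day: float, day_of_parity: float) -> float:
    """Share of the new variant under two-variant logistic growth.

    `day_of_parity` is the (hypothetical) day on which both variants
    account for 50% of sequenced samples.
    """
    exponent = -growth_advantage_per_day * (day - day_of_parity)
    return 1.0 / (1.0 + math.exp(exponent))


# Growth advantage and interval bounds as reported in the text (per day).
for rho in (0.24, 0.16, 0.33):
    print(f"rho = {rho:.2f}/day, weekly fold increase = {weekly_fold_increase(rho):.1f}")

# One week after parity, the odds of Omicron versus Delta equal the weekly fold increase.
share = variant_share(day=7, growth_advantage_per_day=0.24, day_of_parity=0)
print(f"share one week after parity = {share:.2f}")
```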
The growth advantage of Omicron is likely to be mediated by (i) an increase in its intrinsic transmissibility relative to other variants, (ii) an increase relative to other variants in its capacity to infect, and be transmitted from, previously infected and vaccinated individuals, or (iii) both. The predicted combination of transmissibility and immune evasion for Omicron strongly depends on the assumed level of current population immunity against infection by, and transmission of, the competing variant Delta that is afforded by prior infections with wild-type Wuhan, Beta, Delta, and other strains, and/or vaccination (Fig. 4B). For moderate levels of population immunity against Delta (Ω = 0.4), immune evasion alone cannot explain the observed growth advantage of Omicron (Fig. 4C). For medium levels of immunity against Delta (Ω = 0.6), very high levels of immune evasion could explain the observed growth advantage without an additional increase in transmissibility (Fig. 4D). For high levels of population immunity against Delta (Ω = 0.8), even moderate levels of immune evasion (~25-50%) can explain the observed growth advantage without an additional increase in transmissibility (Fig. 4E). [...] that the proportion of the population with potential immunity against Delta and earlier variants is likely to be above 60% 11,12. We thus argue that the population level of protective immunity against Delta is high, and that partial immune evasion is a major driver for the observed dynamics of Omicron in South Africa. This notion is supported by recent findings that show an increased risk of SARS-CoV-2 reinfection associated with the emergence of Omicron in South Africa 48 and the initial results from neutralization assays 26,27. In addition to immune evasion, an increase, or decrease, in the transmissibility of Omicron compared to Delta cannot, however, be ruled out.
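The dependence on Ω described above can be illustrated with a small calculation. The sketch below is ours and rests on several assumptions: the growth advantage is taken to be ρ = β[(1 + τ)(S + ε(1 − S)) − S], the susceptible fraction is S = 1 − Ω, and β = R_w/(S·D), where R_w is the effective reproduction number of Delta and D is the generation time. The inputs are the ρ = 0.24 per day reported above, the mean generation time D = 5.2 days given in the Methods, and R_w = 0.8, an illustrative value inside the 0.78-0.85 range given there. The function name is hypothetical.

```python
def required_immune_evasion(rho: float, omega: float, r_w: float,
                            generation_time: float, tau: float = 0.0) -> float:
    """Immune evasion (epsilon) needed to produce growth advantage `rho`.

    Assumes rho = beta * ((1 + tau) * (S + eps * (1 - S)) - S) with
    S = 1 - omega and beta = r_w / (S * generation_time).
    A result above 1 means immune evasion alone cannot explain `rho`.
    """
    susceptible = 1.0 - omega
    if not 0.0 < susceptible < 1.0:
        raise ValueError("omega must lie strictly between 0 and 1")
    beta = r_w / (susceptible * generation_time)
    # Solve the assumed relation for epsilon.
    pool = (rho / beta + susceptible) / (1.0 + tau)
    return (pool - susceptible) / (1.0 - susceptible)


# Illustrative inputs (see the lead-in for where each value comes from).
RHO, R_W, GENERATION_TIME = 0.24, 0.8, 5.2

for omega in (0.4, 0.6, 0.8):
    eps = required_immune_evasion(RHO, omega, R_W, GENERATION_TIME)
    verdict = "feasible" if eps <= 1.0 else "not feasible without higher transmissibility"
    print(f"omega = {omega:.1f}: required immune evasion = {eps:.2f} ({verdict})")
```

Under these assumptions, the required evasion is above 1 for Ω = 0.4, close to 1 for Ω = 0.6 and about 0.39 for Ω = 0.8, which matches the qualitative pattern described in the text.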
There are a number of limitations to this analysis. First, we estimated the growth advantage of Omicron based on early sequence data only. These data could be biased due to targeted sequencing of SGTF samples and stochastic effects (e.g., superspreading) in a low incidence setting, which can lead to overestimates of the growth advantage, and consequently of the increased transmissibility and immune evasion. Second, without reliable estimates of the level of protective immunity against Delta in South Africa, we cannot obtain precise estimates of transmissibility or immune evasion of Omicron. December 2021. All data visualization was generated through the ggplot package in R 53. As part of the NGS-SA, seven sequencing hubs in South Africa receive randomly selected samples for sequencing every week according to approved protocols at each site 54. For Oxford Nanopore sequencing, the Midnight primer kit was used as described by Freed and Silander 55. cDNA synthesis was performed on the extracted RNA using LunaScript RT mastermix (New England BioLabs) followed by gene-specific multiplex PCR using the Midnight Primer pools, which produce 1200-bp amplicons that overlap to cover the 30-kb SARS-CoV-2 genome. Amplicons from each pool were pooled and used neat for barcoding with the Oxford Nanopore Rapid Barcoding kit as per the manufacturer's protocol. Barcoded samples were pooled and bead-purified.
After the bead clean-up, the library was loaded on a prepared R9.4.1 flow-cell. A GridION X5 or MinION sequencing run was initiated using MinKNOW software with the base-call setting switched off. We assembled paired-end and nanopore .fastq reads using Genome Detective 1.132 (https://www.genomedetective.com), which was updated for the accurate assembly and variant calling of tiled primer amplicon Illumina or Oxford Nanopore reads, and the Coronavirus Typing Tool 56. For Illumina assembly, the GATK HaplotypeCaller --min-pruning 0 argument was added to increase mutation calling sensitivity near sequencing gaps. For Nanopore, low coverage regions with poor alignment quality (<85% variant homogeneity) near sequencing/amplicon ends were masked to be robust against primer drop-out experienced in the Spike gene, and the sensitivity for detecting short inserts using a region-local global alignment of reads was increased. In addition, we also used the wf_artic (ARTIC SARS-CoV-2) pipeline as built using the nextflow workflow framework 57. In some instances, mutations were confirmed visually with .bam files using Geneious software V2020.1.2 (Biomatters). The reference genome used throughout the assembly process was NC_045512.2 (numbering equivalent to MN908947.3). Raw reads from the Illumina COVIDSeq protocol were assembled using the Exatype NGS SARS-CoV-2 pipeline v1.6.1 (https://sars-cov-2.exatype.com/). This pipeline performs quality control on reads and then maps the reads to a reference using Examap. The reference genome used throughout the assembly process was NC_045512.2 (Accession number: MN908947.3).
Several of the initial Ion Torrent genomes contained a number of frameshifts, which caused unknown variant calls. Manual inspection revealed that these were likely to be sequencing errors resulting in mis-assembled regions (likely due to the known error profile of Ion Torrent sequencers) 58. To resolve this, the raw reads from the Ion Torrent platform were assembled using the SARSCoV2 RECoVERY (REconstruction of COronaVirus gEnomes & Rapid analYsis) pipeline implemented in the Galaxy instance ARIES (https://aries.iss.it). This pipeline fixed the observed frameshifts, confirming that they were artefacts of mis-assembly; this subsequently

To test for the possibility that the Omicron lineage is a recombinant of other SARS-CoV-2 lineages, we used a global subsample of sequences spanning January 2021 to August 2021. Using the NCBI SARS-CoV-2 Data hub 60,61, we constructed a dataset containing 221 sequences by randomly sampling five sequences from each month for each continent. No Oceania samples were available from July or August, and no South American sequences were available from July 2021 62. These sequences were aligned together with a set of five high quality BA.1 and seven BA.2 sequences (representing the known diversity of these clades on 5 December 2021) using MAFFT 63 with default settings. Whereas 3SEQ 38 and RDP5 39 were used to analyse this dataset, a subsample of the 39 most divergent sequences from the dataset was analysed using the GARD recombination detection method 37. Default program settings were used throughout for recombination analyses, with the exception of RDP5 analysis, in which sequences were treated as linear and the window sizes for the SiScan and BootScan methods (two of the seven recombination detection methods applied in RDP5) were changed to 2000 nucleotides.
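The subsampling scheme described above (five sequences per month per continent) is a stratified random sample. The following Python sketch shows one way such a draw could be implemented; the record fields (`continent`, `month`), the function name and the toy data are hypothetical and do not come from the original workflow. Strata with no records contribute nothing, which mirrors the missing Oceania and South America months noted in the text.

```python
import random
from collections import defaultdict


def stratified_subsample(records, per_stratum=5, seed=0):
    """Draw up to `per_stratum` records at random from each (continent, month)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[(record["continent"], record["month"])].append(record)

    sample = []
    for key in sorted(strata):  # sorted order keeps the result deterministic
        members = strata[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Toy metadata: eight records per stratum, with one stratum absent entirely.
toy_records = [
    {"id": f"{continent}-{month}-{i}", "continent": continent, "month": month}
    for continent in ("Africa", "Oceania")
    for month in ("2021-06", "2021-07")
    for i in range(8)
    if not (continent == "Oceania" and month == "2021-07")
]

toy_sample = stratified_subsample(toy_records, per_stratum=5, seed=1)
print(len(toy_sample), "records sampled from", len(toy_records))
```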
We investigated the nature and extent of selective forces acting on BA.1 genes encoding individual protein products (a median of 25 unique BA.1 sequences per protein product encoding genome region). A subset of publicly available sequences (from the Virus Pathogen Database and Analysis Resource (ViPR), https://www.viprbrc.org/) were included as

All sequences on GISAID 17,18 designated Omicron (n=686; date of access: 7 December 2021) were analyzed against a globally representative reference set of SARS-CoV-2 genotypes (n=12 609) spanning the entire genetic diversity observed since the start of the pandemic. In short, the reference set included: 1. All genomes from Africa assigned to PANGO lineage B. 69. The resulting tree was then visualized and annotated in ggtree in R 70.

To estimate a time-scale and growth rate from the genome sequence data, BEAST v1.10.4 71,72 was used to sample phylogenetic trees under an exponential growth coalescent model using a strict molecular clock. All BA.1 assigned genomes from South Africa and Botswana (as of 11 December 2021) were included, with some lower coverage genomes removed, leaving a total of 553 genomes. The single South African BA.2 (CERI-KRISP-K032307, EPI_ISL_6795834) was included to help stabilize the root of the BA.1 clade, but the exponential growth coalescent model was only applied to BA.1 (a constant population size coalescent was used for the rest of the tree). The rate of molecular evolution was estimated from the data. Two runs of 100 million iterations were compared to assess convergence and then post-burn-in samples were pooled to summarize parameter estimates.
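Pooling post-burn-in samples from independent runs and summarizing a parameter can be sketched as follows. This is a generic illustration in Python, not the authors' workflow. The 10% burn-in fraction is an assumption for this example, as is the choice of a highest posterior density (HPD) interval as the summary, and both function names are ours.

```python
import math


def pool_post_burnin(chains, burnin_fraction=0.1):
    """Drop the first `burnin_fraction` of each chain, then concatenate."""
    pooled = []
    for chain in chains:
        start = int(len(chain) * burnin_fraction)
        pooled.extend(chain[start:])
    return pooled


def hpd_interval(samples, mass=0.95):
    """Shortest interval that contains at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    window = max(1, math.ceil(mass * n))
    # Slide a window of fixed sample count and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Toy chains standing in for sampled growth rates from two runs.
run_a = [0.30, 0.20, 0.14, 0.13, 0.15, 0.12, 0.14, 0.13, 0.16, 0.14]
run_b = [0.05, 0.10, 0.13, 0.14, 0.15, 0.13, 0.12, 0.14, 0.15, 0.13]

pooled = pool_post_burnin([run_a, run_b], burnin_fraction=0.1)
low, high = hpd_interval(pooled, mass=0.95)
print(f"{len(pooled)} pooled samples, 95% HPD interval = ({low}, {high})")
```

In practice, convergence would be checked before pooling, for example by comparing the marginal distributions and effective sample sizes of the two runs.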
We analysed the full South Africa & Botswana dataset (n = 552) and the reduced dataset containing only Gauteng Province genomes (n = 277) using the serially sampled birth-death skyline (BDSKY) model 73 Table 3. For each analysis, we used a strict clock model with a fixed clock rate of 7.5 × 10⁻⁴ substitutions/site/year and an HKY substitution model. The mean duration of infectiousness was fixed at 10 days 75,76. The effective reproductive number, Re, was assumed to be constant with time. The sampling proportion, s, was assumed to be 0 before the collection time of the first sample (2021-11-04) and allowed to change at fixed times that were approximately equidistantly spaced between the first sample and the most recent sample (2021-12-05). The maximum clade credibility (MCC) tree generated from the analysis of the full South Africa and Botswana dataset with a Skygrid coalescent tree prior was used as the starting tree. We kept the subsequent tree topologies fixed such that the resulting MCMC chain only sampled internal node heights. 77 based on high effective sample sizes (>200) and good mixing in the chains. Maximum clade credibility trees for each run were summarized in TreeAnnotator after discarding the first 10% of the chain as burn-in. Finally, the spatiotemporal dispersal of Omicron was mapped using the R package "seraphim" 78. We analyzed 805 SARS-CoV-2 sequences from Gauteng, South Africa, that were uploaded to GISAID with sample collection dates from 1 September to 1 December 2021 17.
We used a multinomial logistic regression model to estimate the growth advantage of Omicron compared to Delta at the time point where the proportion of Omicron reached 50% 79,80. We fitted the model using the multinom function of the nnet package and estimated the growth advantage using the package emmeans in R. The difference in the net growth rates (i.e., the growth advantage) between a variant (Omicron) and the wild-type (Delta) can be expressed as follows 81: ρ = β(τ(S + ε(1 − S)) + ε(1 − S)), where τ is the increase of the intrinsic transmissibility, ε is the level of immune evasion, β is the transmission rate of the wild-type, and S is the proportion of the population that is susceptible to the wild-type. This relation can be algebraically solved for τ and ε. We further define R_w = βSD as the effective reproduction number of the wild-type, with D being the generation time. We assumed D to be normally distributed with a mean of 5.2 days and a standard deviation of 0.8 days 82. We sampled from publicly available estimates of the daily R_w based on confirmed cases during the early growth phase of Omicron in South Africa (1 October to 31 October 2021; range: 0.78-0.85) (https://github.com/covid-19-Re) 51.

All SARS-CoV-2 whole genome sequences produced by NGS-SA are deposited in the GISAID sequence database and are publicly available subject to the terms and conditions of the GISAID database. The GISAID accession numbers of sequences used in the phylogenetic analysis, including Omicron and global references, are provided in Supplementary Table S1. All input files (e.g. alignments or XML files), along with all resulting output files and scripts used in the present study, will be made available upon request and publicly shared on GitHub at final publication.

62.
covid19-omicron-origins-recombination/aligned_234.shortnames.afa at main · bonilab/covid19-omicron-origins-recombination · GitHub. https://github.com/bonilab/covid19-omicron-originsrecombination/blob/main/4%20GS5%20plus%20Canada%20Outlier%20Lineage/4.2%20ali
63. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
64. HyPhy 2.5 - A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies
65. Receptor binding and priming of the spike protein of SARS-CoV-2 for membrane fusion
66. Nextclade: clade assignment, mutation calling and quality control for viral genomes
67. FastTree 2 - approximately maximum-likelihood trees for large alignments
68. Confidence limits on phylogenies: an approach using the bootstrap
69. Maximum-likelihood phylodynamic analysis
70. Using ggtree to Visualize Data on Tree-Like Structures
71. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol
72. Bayesian phylogenetics with BEAUti and the BEAST 1.7
73. Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV)
74. BEAST 2.5: An advanced software platform for Bayesian evolutionary
75. The global spread of 2019-nCoV: a molecular evolutionary analysis
76. Inferred duration of infectious period of SARS-CoV-2: rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases
77. Posterior summarization in Bayesian phylogenetics using Tracer 1.7
78. SERAPHIM: studying environmental rasters and phylogenetically informed movements
79. Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England
80. Increased transmissibility and global spread of SARS-CoV-2 variants of concern as at
81. A tale of two variants: Spread of SARS-CoV-2 variants Alpha
82. Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data
83. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
84. Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic

the short length of this region. Two other candidate BFRs, one in S2 (positions 26159-27269) and one at the 5' end of the genome (positions 1-1060), showed little genetic diversity and n

Extended Data Table 1. Parameter estimates from BEAST for the full South Africa & Botswana data set and the reduced data set of only Gauteng Province genomes

The authors declare no competing interests. Supplementary Information is available for this paper. Reprints and permissions information is available at www.nature.com/reprints.