key: cord-0843438-xphoen8r authors: Butera, Y.; Mukantwari, E.; Artesi, M.; Umuringa, J. D.; O'Toole, A. N.; Hill, V.; Rooke, S.; Hong, S. L.; Dellicour, S.; Majyambere, O.; Bontems, S.; Boujemla, B.; Quick, J.; Resende, P. C.; Loman, N. J.; Umumararungu, E.; Kabanda, A.; Murindahabi, M. M.; Tuyisenge, P.; Gashegu, M.; Rwabihama, J. P.; Sindayiheba, R.; Gikic, D.; Souopgui, J.; Ndifon, W.; Rutayisire, R.; Gatare, S.; Mpunga, T.; Ngamije, D.; Bours, V.; Rambaut, A.; Nsanzimana, S.; Baele, G.; Durkin, K.; Mutesa, L.; Rujeni, N. title: Genomic Sequencing of SARS-COV-2 in Rwanda: evolution and regional dynamics date: 2021-04-07 journal: nan DOI: 10.1101/2021.04.02.21254839 sha: c5e94d818c45ef7369cce9f4164e9f73a813c020 doc_id: 843438 cord_uid: xphoen8r The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), responsible for coronavirus disease 19 (COVID-19), is a single-stranded positive-sense ribonucleic acid (RNA) virus that typically undergoes one to two single nucleotide mutations per month. COVID-19 continues to spread globally, with case fatality and test positivity rates often linked to locally circulating strains of SARS-CoV-2. Furthermore, mutations in this virus, in particular those occurring in the spike protein (involved in the virus binding to the host epithelial cells) have potential implications in current vaccination efforts. In Rwanda, more than twenty thousand cases have been confirmed as of March 14th 2021, with a case fatality rate of 1.4% and test positivity rate of 2.3% while the recovery rate is at 91.9%. Rwanda started its genomic surveillance efforts, taking advantage of pre-existing research projects and partnerships, to ensure early detection of SARS-CoV-2 variants and to potentially contain the spread of variants of concern (VOC). As a result of this initiative, we here present 203 SARS-CoV-2 whole genome sequences analyzed from strains circulating in the country from May 2020 to February 2021. In particular, we report a shift in variant distribution towards the newly emerging sub-lineage A.23.1 that is currently dominating. Furthermore, we report the detection of the first Rwandan cases of the VOCs, B.1.1.7 and B.1.351, among incoming travelers tested at Kigali International Airport. We also discuss the potential impact of COVID-19 control measures established in the country to control the spread of the virus. To assess the importance of viral introductions from neighboring countries and local transmission, we exploit available individual travel history metadata to inform spatio-temporal phylogeographic inference, enabling us to take into account infections from unsampled locations during the time frame of interest. We uncover an important role of neighboring countries in seeding introductions into Rwanda, including those from which no genomic sequences are currently available or that no longer report positive cases. Our results point to the importance of systematically screening all incoming travelers, regardless of the origin of their travels as well as regional considerations for durable response to COVID-19. The coronavirus disease 2019 (COVID- 19) due to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continues to impose a heavy death toll globally and represents a major global health challenge. Real-time whole-genome sequencing provides invaluable insights on the pandemic's transmission dynamics and enables effective surveillance. Moreover, genomic data provides useful information required for the ongoing development of vaccines, therapeutics and diagnostic tools. Analysis of SARS-CoV-2 mutations is particularly crucial when these affect epitopes involved in the induction of host immune responses as they may lead to immune evasion, with potential implications for vaccine (and immunotherapy) efficacy. The global SARS-CoV-2 lineage nomenclature has already been proposed with A and B as the initial phylogenetic lineages (1), followed by a number of sub-lineages. Emerging SARS-CoV-2 variants are circulating globally and a number of variants of concern (VOC) have been reported such as the B.1.1.7 VOC (also known as 20I/501Y.V1 or VOC 202012/01), which is characterized by twenty-three mutations (thirteen non-synonymous mutations, four deletions and six synonymous mutations), is associated with higher transmissibility (2) and increased mortality (3, 4) ; and the B.1.351 VOC (known as 20H/501Y.V2), which emerged independently of B.1.1.7, shares some mutations with the B.1.1.7 VOC and has recently also been associated with low vaccine efficacy in South Africa (5) . Another VOC known as P.1 was first identified in Brazil and is characterized by seventeen unique mutations, including three in the receptor binding domain of the spike protein (6) . In Rwanda, the first case of SARS-CoV-2 was confirmed in the capital city, Kigali, on March 14 th , 2020 following a series of testing at the borders and the Kigali International Airport, (KIA), and was linked to incoming travelers from Mumbai, India. Subsequently, a countrywide total lockdown, coupled with strict prevention measures including contact tracing, was enforced for nearly two months aiming to contain the spread of the virus. From May 2020, lockdown restrictions were lifted progressively, a number of commercial activities resumed and the KIA reopened on the 1 st of August 2020. However, despite continued massive testing (7) , contact tracing, hotspot mapping and preventive measures (8) , the number of cases continued to increase, mainly associated with cross-border land travels through truck drivers (9) and imported cases. This culminated in a 'first wave' of local transmission between July and September 2020. Additional containment measures led to the decline of cases until November 2020 when schools and most activities resumed. Since December 2020, another 'wave' of infections hit the country, peaking in January-February 2021. As a result, new movement restrictions were enforced, including a total lockdown in the capital city and a seven days' quarantine for international travelers in addition to two negative polymerase chain reaction (PCR) tests, one pre-departure and another one upon arrival. In this study, we describe the dynamics of transmission based on genomic analysis of isolates from the first and second waves of the epidemic in Rwanda. In particular, we highlight a shift from ancestral dominant B.1.380 lineage in the early stages of local transmission to a new lineage, A.23.1, that is currently dominating throughout the country. Combining the collected genomic sequence data with individual travel histories to perform travel history-aware phylogeographic inference, we infer introductions into Rwanda from all of its surrounding countries including those from which no genomic sequences are available. Given the importance . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint of these findings on regional surveillance of SARS COV-2, we emphasize the need for strengthening genomic surveillance at the country's points of entry following the detection of the first cases of the B.1.1.7 and B.1.351 VOCs among travelers arriving at KIA As of the 10th February 2021, a total of 16,865 cases have been confirmed in the country and the sequences analyzed represent 1.2% of the total cases. The proportion of daily confirmed cases versus the number of sequences taken is illustrated in Figure 1 . We sampled a total of 203 cases (reflecting the national screening efforts at points of entry and emerging hotspots) with an average age of 36.7 years of whom 131 were males and 70 females (and two unknowns) in this study. Of these, location data was available for 152 individuals, of whom 99 lived in Kigali while others were living in different districts of the country (Figure 1 ). Significant efforts were made to obtain associated metadata for all cases, with specific attention to individual travel history data, as these may shed light on the origins of viral variants introduced from neighboring countries. Of the 203 cases, 28 had recorded travel history (mainly sampled at the airport and other points of entry through the national monitoring and testing efforts) from Tanzania (6) Supplementary Table S1. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. analyses. Red crosses mark the collection dates of Rwandan sequences with travel history from the respective countries. Although few to no sequences are available from Burundi, Gabon, South Sudan and Tanzania, these travel history data point to SARS-CoV-2 lineages circulating in these countries, to the extent that returning travelers from these countries import those lineages into Rwanda. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint The genomes were analyzed using the Pangolin module. Overall, the majority of the SARS-CoV-2 sequences in Rwanda belong to two distinct lineages, A.23.1 and B.1.380. However, the dynamics of their distribution changed over time ( Figure 4) . Indeed, the early stages of local transmission (May through November 2020) were characterized by circulation of a dominant B.1.380 lineage, which has only been observed in Rwanda and Uganda. The (limited) diversity of the viral strains observed in the period of May to July 2020 are most likely early imports from Europe and Asia before suppressive measures (such as the countrywide lockdown and the airport closure) were enforced. Nevertheless, an increased strain diversity is observed from the period August-October 2020, most likely reflecting introductions through cross-border land travels for goods and cargo (9) . E92K and ORF9 N: S202N, Q418H. The Q613H mutation is predicted to be functionally equivalent to the D614G mutation that arose early in 2020 and is associated with increased viral transmissibility (11) . The P681R spike adds a basic amino acid adjacent to the spike furin cleavage site. This same change has been shown in vitro to enhance the fusion activity of the SARS-CoV-2 spike protein, likely due to increased cleavage by the cellular furin protease (12) . Mutations in nsp6 protein alter cellular autophagy pathways that promote replication (13) , suggesting that it may be more transmissible than its parent lineages A and A.23. is a more transmissible lineage, data inclusive of this paper does not report onward transmission of these variants. Importantly, a recent study suggests that B.1.1.7 is not only more transmissible than preexisting SARS-CoV-2 variants, but it may also cause more severe illness, as indicated by a reported higher hazard of death (4). . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint We made use of publicly available data and the newly sequenced Rwandan SARS-CoV-2 genomes -all available in GISAID (14, 15) -to infer a time-scaled phylogenetic tree using maximum-likelihood inference (see Methods). This phylogeny enabled us to identify two subtrees with predominantly Rwandan sequences. Both of these subtrees consist of genetically distinct variants, with the larger cluster belonging to lineage B.1.380 (and hence referred to as subtree B.1) and the smaller one to A.23.1 (referred to as subtree A). The considerable difference in sampling dates and genetic distance between the sequences suggests that the currently lineage. In our analysis of both subtrees, we fit a travel history-aware asymmetric discrete-state diffusion process to model the spatial spread between countries. Our phylogeographic reconstructions included a total of 17 sequences with travel history, 11 for the analysis of subtree A and 6 for subtree B.1 (Table 1) . Interestingly, some of these sequences have associated travel histories originating from Tanzania (4 in subtree A and 1 in subtree B.1), a country that has not reported any COVID-19 cases since May 8th, 2020 (16) , and also has no publicly available genomes on GISAID. While Burundi and South Sudan have been consistently reporting case numbers, no genomic sequences are available on GISAID from these countries yet. Our . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. phylogeographic reconstructions are able to include those countries as locations with SARS-CoV-2 infections, by exploiting data on infected incoming travelers from those countries. Table S2 ). We found an expected number of two introduction events from Tanzania into Rwanda, corresponding to and being derived from the two arriving traveler cases, as well as single introduction events from South Sudan and China into Rwanda. Figure 5 also shows frequent mixing between Rwanda, Uganda and Kenya, with the latter two estimated to have seeded introductions into Tanzania, from where no genomic sequences are available to date (see Discussion). However, by employing a travel history-aware inference methodology, we are able to confirm the of lineage A.23.1 among travelers from Tanzania, despite the absence of genomic data. In our analysis of subtree B.1, which includes Rwandan lineage B.1.380, we inferred a minimum number of 9 (HPD 95%: [8] [9] [10] [11] [12] ) introduction events into Rwanda, with 3 of these events originating from Kenya ( Figure 7 ; Table S2 ). We also found an expected number of 2 introduction events from both Uganda and Italy. Using Bayesian stochastic search variable selection (BSSVS), we identified seven statistically supported (Bayes Factor >3) transition routes into Rwanda for subtree A and six for subtree B.1 ( Figure 7 ; Table S2 ). Our analysis on subtree A showed that Uganda accounted for the majority . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. Consistent with previously published analyses of SARS-CoV-2, we observe that our discrete Bayesian phylogeographic reconstructions resulted in MCC trees of which the internal nodes can be poorly supported, a common phenomenon in SARS-CoV-2 phylogenies (Figures 5 and 6 ). The considerable uncertainty in phylogenetic clustering results in a variety of diverging phylogeographic histories, which end up not being captured in the MCC trees as these only represent point estimates of the posterior distribution. To this end, we explored the ancestral spatial histories of individual samples of interest using Markov jump trajectory plots (17, 18) ( Figure S3 ). In the case of subtree A, the travel-aware reconstructions showed four sequences consistently forming two clusters with posterior support >0.9. However, the first two of these four cases correspond to cross-border truck drivers of Tanzanian nationality (sampled on the same day on the same sampling location, i.e. the Rusumo border), with no such metadata available for the other two cases in subtree A. Hence, the two inferred introductions actually correspond to four introduction events from Tanzania into Rwanda, which are clustered together by location in our joint inference, likely as a result of additional samples currently lacking from the border region. Because of this, sequences in each cluster result in nearly identical spatial histories. Figure S3A and S3B show the Markov jump trajectory plots for these two introductions. Overall, we see considerable ambiguity in the ancestral locations prior to Tanzania, as seen by the density of lines landing in "Other" alternate locations. More broadly, we see that in both cases the Rwandan sequences diverged from ancestors in Tanzania, Kenya and . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint Uganda, with considerable uncertainty placed at the root, among the Democratic Republic of the Congo (DRC), Sierra Leone and Mali. The introduction in subtree B, on the other hand, presents us with a different ancestral relation with Tanzania ( Figure S3C ). Although we also generally observe considerable uncertainty in the ancestral paths, we observe a strong signal for an ancestry in Rwanda prior to the introduction from Tanzania. This would imply a transmission chain starting in Rwanda, spreading into Tanzania, and then being reintroduced into Rwanda. A similar dynamic of outflow and inflow of Rwandan lineages can be seen in the ancestral histories for the sequences with travel history to Morocco, Italy and the DRC ( Figure S4 ). This suggests a bidirectional exchange of SARS-CoV-2 genomes between each of these countries and Rwanda. However, because of the differences in sequencing efforts across the globe, it remains difficult to conclude that such is the case, and the possibility of intermediary locations cannot be discarded. Nonetheless, all spatial trajectory plots imply the presence of SARS-CoV-2 lineages circulating in Tanzania after May 2020. The difference in ancestral histories coupled with the fact that these travel history sequences are genetically distant from each other imply that multiple SARS-CoV-2 lineages have circulated in Tanzania to this day. In addition, subtree A contains a sequence with travel history to South Sudan. Although over 9,000 COVID cases have been reported up to date (16) , no genomic sequences are publicly available for South Sudan. The sample tested at arrival in Rwanda presents us with evidence of lineage A having circulated in South Sudan during the months of May and June 2020 ( Figure S3D ). As expected, the Markov jump trajectory plots for this sample also show considerable uncertainty in the reconstruction of the ancestral locations prior to South Sudan. Regardless, we . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint see some support for ancestry in Kenya, Uganda and the DRC, which provides further evidence for viral transmission between the neighboring countries in the area. Once introduced in Rwanda, our continuous phylogeographic analysis of SARS-CoV-2 lineages highlight an important inter-connection of those lineages centered around Kigali ( Figure S5 ). Most sequences sampled outside the city appeared to be evolutionarily linked to sequences sampled within this city area, and would then correspond to independent dispersal events from Kigali. However, this phylogeographic pattern, i.e. the central importance of Kigali within the dispersal history of SARS-CoV-2 lineages, might to some extent result from the higher sampling effort within the capital city. Therefore, it is likely that a higher sampling effort outside Kigali would highlight more local transmission. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Table S2 for the Bayes factor support values for these reported transitions. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. In this study, we reported on the ongoing genomic sequencing efforts in Rwanda, which are complemented with careful collection of associated travel history metadata of incoming travelers. These efforts enabled us to exploit this information by performing joint Bayesian travel history-aware phylogeographic inference on these data. By applying this recently developed approach, we demonstrated considerable contributions of neighbouring countries' sequence introductions into Rwanda (as well as possible bidirectional exchanges). Of particular interest to this study, we were able to include traveler cases from Tanzania, Burundi and South Sudan. Incorporating travel history information in phylogeographic analysis can mitigate sampling bias (from unsampled or under-sampled countries) (17) , although this cannot fully replace the lack of . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. Ongoing genomic surveillance in Rwanda indicates additional samples from these VOCs (mostly B.1.351) from travelers sampled at the airport. In an effort to curb the spread of the variants, following the upsurge of cases in November-December 2020, several measures were taken by the Rwandan government including a 7-day quarantine to all incoming passengers followed by a RT-PCR test, in addition to presenting a COVID-19 negative test upon arrival. Furthermore, the capital city of Kigali (i.e., a major COVID-19 hotspot) went through a total lockdown from mid-January to early February 2021, and travels between districts were prohibited until mid-March . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint 2021. A 7pm to 4am curfew was instituted in early February 2021; public offices were closed and employees were working from their homes. All schools in Kigali were closed as well, and classes were being held online. Cafés and restaurants were only providing take-away services. Churches, public swimming pools and gyms were closed (Office of the Prime Minister -Republic of Rwanda 2021). Such suppression mechanisms (population-wide social distancing, school closure, case isolation) have proven the greatest impact (as far as non-pharmaceutical approaches are concerned) in terms of transmission control (19) . Additionally, all public health facilities received free antigen rapid diagnostic tests for every single person presenting COVID These results suggest that neighboring countries play an important role in establishing the circulation of (different strains of) SARS-CoV-2 in Rwanda. However, due to the unevenness in sampling across countries, with several not providing any genomic sequences, additional data would be required to accurately assess the effect of short-distance (e.g. crossing border with neighboring countries) versus long-distance travel in shaping the Rwandan epidemic. Low spread of mutant virus may be partly explained by the fact that Rwanda is relatively less affected by the spread of SARS COV-2 in the region consecutive to several public health measures implemented in light with regular bi-weekly surveillance surveys across the country conducted by RBC to monitor the trend of the pandemic. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint This is an in-depth study of SARS-CoV-2 strains that circulate in Rwanda from May 2020 to February 2021, in which we describe the demography and epidemiology of 203 SARS-CoV-2 genomes from collected SARS-CoV-2 positive oropharyngeal swabs. These swabs were obtained from two distinct groups: from individuals residing in different provinces of Rwanda (n=189) and from returning travelers, whose samples were collected at the airport (n=14). All samples were extracted from the biorepository of the National Reference Laboratory (NRL), in Kigali, Rwanda. Samples with a cycle threshold (Ct) value below 33 were selected, ensuring a wide geographical representation as well as ports of entry, and case description variables (date and place of RT-PCR test, age, sex, occupation, residence, nationality, travel history) were reported. Ribonucleic acid (RNA) of the virus was extracted from confirmed SARS-CoV-2 positive clinical samples with Ct values ranging from 13.4 to 32.7 on a Maxwell 48 device using the Maxwell RSC viral RNA kit (Promega) following a viral inactivation step using proteinase K according to the manufacturer's instructions. Reverse transcription was carried out using SuperScript IV VILO master mix, and 3.3 μl of RNA was combined with 1.2 μl of master mix and 1.5 μl of H2O. This was incubated at 25°C for 10 min, 50°C for 10 min, and 85°C for 5 min. PCRs used the primers and conditions . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. The data generated via the Oxford Nanopore Technology (ONT) MinION was processed using the ARTIC bioinformatic protocol (https://artic.network/ncov-2019/ncov2019-bioinformaticssop.html). Briefly, the FAST5 sequence files were base called and demultiplexed using Guppy 4.2.2 in high accuracy mode, requiring barcodes at both ends of the read. FASTQ reads associated with each sample were filtered and concatenated via the guppy plex module. Consensus SARS-CoV-2 sequences were generated via the ARTIC nanopolish pipeline and assembled for each sample by aligning the respective sample reads to the Wuhan-Hu-1 reference genome (GenBank Accession: MN908947.3) with the removal of sequencing primers, followed by a polishing step using the raw Fast5 signal files. Positions with insufficient genome coverage were masked with N. We downloaded all SARS-CoV-2 genomes from the available nextstrain build (21) with Africafocused subsampling (https://nextstrain.org/ncov/africa) on February 23, 2021. These sequences were further complemented to include all 203 Rwandan sequences generated in this study and . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. We used Squarify to construct the square treemaps of lineage diversity across three time points (https://github.com/laserson/squarify). We mapped the combined data set against the canonical reference (GISAID ID: EPI_ISL_406801) using minimap2 (22) and trimmed the data to positions 265-29,674 and padded with Ns in order to mask out 3' and 5' UTRs. We used the resulting alignment to estimate an unrooted maximum-likelihood phylogeny using IQ-TREE v2.1.2 (23) using its automated model selection approach that identified the GTR+F+R8 model as best fitting the data. We subsequently calibrated this phylogeny in time using TreeTime (24) while estimating the molecular clock and skyline coalescent model parameters and using three SARS-CoV-2 genomes from Wuhan, 2019, as the outgroup. We went on to perform a discrete Bayesian phylogeographic analysis in BEAST 1.10.5 (25) using a recently developed model that is able to incorporate available individual travel history information associated with the newly sequenced Rwandan samples (17, 18) . Exploiting such information can yield more realistic reconstructions of virus spread, particularly when travelers from unsampled or under sampled locations are included to mitigate sampling bias. To this end, and given that it is not feasible to perform such an analysis on the full data set due to the large number of sequences, we selected two subtrees in the overall phylogeny (see Results section) that predominantly consisted of Rwandan sequences, consisting of 172 (subtree A) and 218 sequences (subtree B.1), of which respectively 11 and 6 infected individuals have associated travel history information (see Table S1 ). We incorporated the collection dates for those sequences into our analyses, and treated the time when the traveler started the return journey to . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint Rwanda as a random variable given that the time of traveling to the sampling location (in Rwanda) was not known (with sufficient precision). We specify normal prior distributions over these 17 random variables informed by an estimate of time of infection and truncated to be positive (back-in-time) relative to sampling date. As in the work of Lemey et al. (2020) , we use a mean of 10 days before sampling based on a mean incubation time of 5 days (26) , and a constant ascertainment period of 5 days between symptom onset and testing (19) , and a standard deviation of 3 days to incorporate the uncertainty on the incubation time. We retrieved the 172 and 218 sequences from the full alignment and performed joint discrete phylogeographic inference on each resulting data set using BEAST 1.10.5, employing the BEAGLE 3.2.0 highperformance computational library (27) to improve performance. For each of these phylogeographic analyses, we make use of Bayesian stochastic search variable selection (BSSVS) to simultaneously determine which migration rates are zero depending on the evidence in the data and efficiently infer the ancestral locations, in addition to providing a Bayes factor test to identify significant non-zero migration rates (28) . We also estimated the expected number of transitions (known as Markov jumps) (29, 30) into Rwanda from all other countries in the data set. These analyses ran for a total of 200 and 250 million iterations, respectively, with the Markov chains being sampled every 100,000th iteration, in order to reach an effective sample size (ESS) for all relevant parameters of at least 200, as determined by Tracer 1.7 (31) . We used TreeAnnotator to construct maximum clade credibility (MCC) trees for each subtree. To explore the spread of SARS-CoV-2 lineages introduced in Rwanda, we also performed a continuous phylogeographic analysis following a procedure similar to one defined by Dellicour et al. (32) . Specifically, we used the relaxed random walk (RRW) diffusion model (33) available in BEAST 1. 10.5 (25) to infer the dispersal history of Rwandan lineages along Rwandan clades . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. identified within the two subtree-specific MCC trees that resulted from the discrete Bayesian phylogeographic inference described above. To achieve a sufficient level of spatial precision, the continuous phylogeographic analysis was only based on those sampled genomes for which the Rwandan sector of origin was known, which is the maximal level of spatial precision available for these samples. For each sampled genome associated with this level of sampling precision, which corresponds to 57% of available Rwandan genomes, we retrieved geographic coordinates from a point randomly sampled within its sector of origin. The MCMC chain was run in BEAST 1.10.5 for 30 million iterations and sampled every 10,000 th iteration, its convergence/mixing properties were again assessed with Tracer (31) , and an appropriate number of sampled trees was discarded as burn-in (10%). The resulting sets of plausible trees were used to obtain subtreespecific MCC summary trees using TreeAnnotator, and we then used functions available in the R package "seraphim" (34) to extract spatio-temporal information embedded within posterior trees and visualize the continuous phylogeographic reconstructions. Finally, we used the baltic Python library to visualize the phylogenies (35) . Table S1 . GISAID accession identifiers for the Rwandan sequences in this study for which individual travel history metadata is available. We list the collection date for each GISAID entry, along with the country from which infected travelers returned to Rwanda, and the subtree assignment based on Figure S1 . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. ; . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 7, 2021. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. Vertical lines represent a Markov jump between two locations. The seven most prominent locations across all ancestral paths in the posterior are displayed along the Y-axis, with "Other" representing the remaining locations. Trajectory plots A, B and D correspond to the isolates in subtree A, i.e. EPI_ISL_1064164, EPI_ISL_707772, and EPI_ISL_707712 respectively. Trajectory plot C corresponds to isolate EPI_ISL_960250 in subtree B.1. In all cases, considerable uncertainty in the ancestral reconstructions can be seen from the pattern of overlapping horizontal lines and the diffuse density of vertical lines, which indicate considerable support for different ancestral locations (i.e. uncertainty in the spatial reconstruction), and variance in the reconstructed timing of the introductions. For trajectory plots A, B and D, we observe similar patterns in the spatial paths reconstructed, where the isolates find ancestries in Mali / Sierra Leone / Democractic Republic of Congo, Uganda and Kenya prior to each corresponding travel location. In contrast, trajectory plot C shows support for an ancestry in Rwanda prior to the virus circulating in Tanzania and being reintroduced into Rwanda. This indicates a bidirectional exchange of viral lineages between the two countries, although the . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint possibility of an intermediary country being involved cannot be discarded due to unevenness in sampling efforts between countries. Figure S3 C, the ancestral histories inferred for these three isolates show support for a bidirectional flow of viral lineages between each corresponding travel location and Rwanda. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2021. ; https://doi.org/10.1101/2021.04.02.21254839 doi: medRxiv preprint A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data Increased mortality in community-tested cases of SARS-CoV-2 lineage B.1.1.7 Escape of SARS-CoV-2 501Y.V2 variants from neutralization by convalescent plasma Genomics and epidemiology of a novel SARS-CoV-2 lineage in Manaus A pooled testing strategy for identifying SARS-CoV-2 at low prevalence Use of Technologies in COVID-19 Containment in Rwanda Easing Lockdown Restrictions during COVID-19 Outbreak in Rwanda A SARS-CoV-2 lineage A variant (A.23.1) with altered spike has emerged and is dominating the current Uganda epidemic mRNA-1273 vaccine induces neutralizing antibodies against spike mutants from global SARS-CoV-2 variants. bioRxiv : the preprint server for biology Article A Multibasic Cleavage Site in the Spike Protein of SARS-CoV-2 Is Essential for Infection of Human Lung Cells Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19 . The COVID-19 resource centre is hosted on Elsevier Connect , the company ' s public news and information disease and diplomacy: GISAID's innovative contribution to global health Global initiative on sharing all influenza data -from vision to reality COVID-19) Dashboard. WHO Web Accommodating individual travel history and unsampled diversity in Bayesian phylogeographic inference of SARS-CoV-2 Bayesian phylogeographic analysis incorporating predictors Current Protocols Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand Rapid and inexpensive whole-genome sequencing of SARS-CoV-2 using 1200 bp tiled amplicons and Oxford Nanopore Rapid Barcoding Nextstrain : real-time tracking of pathogen evolution Sequence analysis Minimap2 : pairwise alignment for nucleotide sequences IQ-TREE 2 : New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era TreeTime : Maximum-likelihood phylodynamic analysis Bayesian phylogenetic and phylodynamic data integration using BEAST 1 . 10 The incubation period of coronavirus disease 2019 (CoVID-19) from publicly reported confirmed cases: Estimation and application Improved Performance , Scaling , and Usability for a High-Performance Computing Library for Statistical Phylogenetics. Software for Systematics and Evolution Bayesian phylogeography finds its roots Counting labeled transitions in continuous-time Markov models of evolution Fast, accurate and simulation-free stochastic mapping Posterior Summarization in Bayesian Phylogenetics Using Tracer 1 . 7 A Phylodynamic Workflow to Rapidly Gain Insights into the Dispersal History and Dynamics of SARS-CoV-2 Lineages Phylogeography takes a relaxed random walk in continuous space and time SERAPHIM: Studying environmental rasters and phylogenetically informed movements MERS-CoV recombination: Implications about the reservoir and potential for adaptation