key: cord-0769065-wq5xcg78
authors: Tayoun, Ahmad Abou; Loney, Tom; Khansaheb, Hamda; Ramaswamy, Sathishkumar; Harilal, Divinlal; Deesi, Zulfa Omar; Varghese, Rupa Murthy; Al Suwaidi, Hanan; Alkhajeh, Abdulmajeed; AlDabal, Laila Mohamed; Uddin, Mohammed; Hamoudi, Rifat; Halwani, Rabih; Senok, Abiola; Hamid, Qutayba; Nowotny, Norbert; Alsheikh-Ali, Alawi
title: Multiple early introductions of SARS-CoV-2 into a global travel hub in the Middle East
date: 2020-10-20
journal: Sci Rep
DOI: 10.1038/s41598-020-74666-w
sha: 309592c1524498d82adff6209abfe4df92981810
doc_id: 769065
cord_uid: wq5xcg78

International travel played a significant role in the early global spread of SARS-CoV-2. Understanding transmission patterns from different regions of the world will further inform global dynamics of the pandemic. Using data from Dubai in the United Arab Emirates (UAE), a major international travel hub in the Middle East, we establish SARS-CoV-2 full genome sequences from the index and early COVID-19 patients in the UAE. The genome sequences are analysed in the context of virus introductions, chain of transmissions, and possible links to earlier strains from other regions of the world. Phylogenetic analysis showed multiple spatiotemporal introductions of SARS-CoV-2 into the UAE from Asia, Europe, and the Middle East during the early phase of the pandemic. We also provide evidence for early community-based transmission and catalogue new mutations in SARS-CoV-2 strains in the UAE. Our findings contribute to the understanding of the global transmission network of SARS-CoV-2.

(2020) 10:17720 | https://doi.org/10.1038/s41598-020-74666-w | www.nature.com/scientificreports/

destinations from 18 March 2020 and Dubai airport was closed to passenger flights on 25 March 2020; hence, patients after 18 March 2020 were expected to be more likely a result of community transmission as opposed to imported infections.
The index patient in the UAE was a female Chinese tourist (aged 63 years) travelling from Wuhan with other family members to visit her son in Dubai. The Chinese family arrived in Dubai on 16 January 2020 and tested positive on 29 January 2020 (Table 1). Over the next seven weeks, there were multiple new cases among tourists and residents with travel history (44.9% had travel history from Europe) (Table 1). Nearly two-thirds (63.3%) of patients were male and 61.2% were aged between 20 and 44 years, reflecting the young age structure of the UAE population 5. The majority of patients (88%) were asymptomatic or had mild symptoms, and only four required intensive care with invasive ventilation (one death; Table 1).

SARS-CoV-2 whole genome sequencing was performed on all 49 COVID-19 patient samples. Only genomes with almost complete coverage (n = 25, "Methods" section) were used for phylogenetic analysis. The 25 genomes were obtained from cases with disease onset in late January (n = 1), early February (n = 1), late February (n = 6), early March (n = 8), and late March (n = 9). Of those, approximately two-thirds were male and aged between 10 and 40 years (Table 1).

To understand early viral transmission in Dubai in the global context, we performed phylogenetic analysis on the 25 new viral genomes sequenced in this study from early patients in the UAE (Table 1; "Methods" section) along with 157 largely complete SARS-CoV-2 genomes deposited in GISAID from different countries between December 2019 and early March 2020 6, 7 (Supplementary Table S1). Consistent with multiple independent introductions, the UAE SARS-CoV-2 isolates were distributed across the phylogenetic tree (Fig. 1). The majority (76%) clustered with clades A2a (48%) and A3 (28%), which are largely composed of isolates from COVID-19 patients in Europe and Iran, respectively.
This suggests that the major introductions into the UAE during the early phase of the pandemic originated from Europe and the Middle East/Iran. Supporting a European origin, individuals with A2a clade isolates were mostly European and/or had recent travel history to a European country, mainly to Italy (n = 4), Germany (n = 3), United Kingdom (n = 2), Spain (n = 1), and Norway (n = 1) (Table 1 and Fig. 2). Onset of symptoms reported in this group was within or after the second week of March (Table 1), suggesting that the viral infections in this group could have occurred during late February to early March. Of note, a SARS-CoV-2 isolate submitted from Mexico (GISAID ID: EPI_ISL_412972) was 100% identical to that from an Italian expatriate working in the UAE (L0881), while another submitted in Germany (GISAID ID: EPI_ISL_412912) differed by a single mutation (Fig. 1). All three individuals had a recent travel history to Italy and overlapping infection time frames (late February to early March). Within this group, isolates from patients L1758, L0484, and L2185 were identical (Fig. 2), suggesting a possible common direct source of transmission.

Isolates in the A3 clade were obtained from five individuals with travel history to Iran (L2409, L6627, L0904, L0184, and L4682), one Indian resident (L0231), and one Indian tourist (L0068) (Fig. 2). Onset of symptoms for the five individuals with travel history in this group was reported to be around 21-24 February (Table 1). Patient L0231 had no travel history and reported symptom onset on 7 March, suggesting a possible community-based transmission event. Interestingly, all isolates in this group except the one obtained from patient L4682 (the only patient in this group with severe clinical presentation) shared a common ancestral strain identical to that obtained from patient L2409.
The SARS-CoV-2 isolate from L4682 had two unique missense variants in the ORF1ab gene (Supplementary Table S3), which might be worth investigating for any possible biological effect(s). Consistent with an Iranian origin of this clade, a SARS-CoV-2 sequence submitted by the University of Sydney (GISAID ID: EPI_ISL_412975) on 28 February 2020 differed by only two mutations from that of L2409, and both L2409, an Iranian male tourist, and the Australian male patient had a recent travel history to Iran. We speculate that individuals with travel history to Iran around this time frame (L8386, L6867, and L3280), for whom a full viral genome sequence could not be obtained, were also very likely to cluster within the A3 clade.

Only one viral strain, obtained from L5630, a family member of the early Chinese index patient, belonged to the B2 clade. Although we did not obtain full viral genome sequences from the other members of that Chinese family, we expect that all had a similar strain to L5630. Interestingly, our data do not suggest any transmission of this clade, at least among the earliest patients (Fig. 2) included in this study, which is consistent with the reported early detection and isolation of this family. This finding also supports the notion of secondary source(s) for the ongoing local transmission.

The remaining five isolates did not belong to A2a, A3, B2, or any of the clades on nextstrain.org as of 12 May 2020, suggesting earlier introduction(s). Those isolates were obtained from four Asians, two residents (L4280, L6599) and two tourists (L4184, L9766), and one Czech resident (L1014) working as an airline cabin crew member with travel history to Austria (Table 1). Consistent with the Asian predominance among this patient group and the fewer (1 or 2) mutations for most of their isolates (4 out of 5) relative to the Wuhan reference genome (Fig. 2), several early viral strains submitted in Asia clustered very closely to this group (Fig. 1).
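Statements such as "100% identical" or "differed by only two mutations" rest on a position-by-position comparison of aligned genomes. The following is a minimal Python sketch of that idea, not the procedure used in this study (the comparisons reported here come from the Nextstrain workflow described in the Methods). The function name `count_differences` and the toy sequences are hypothetical, and the sketch assumes the two sequences are already aligned to the same length and that gaps and ambiguous bases are skipped rather than counted.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry a different unambiguous base.

    Positions where either sequence has a gap ('-') or an ambiguity code
    such as 'N' are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in unambiguous and b in unambiguous and a != b
    )


# Toy aligned fragments (hypothetical, not real SARS-CoV-2 sequence).
reference = "ATGCCGTTAGC"
isolate_1 = "ATGCTGTTAGC"  # one substitution relative to the reference
isolate_2 = "ATGCTGTNAG-"  # same substitution, plus an N and a gap that are skipped

print(count_differences(reference, isolate_1))  # differences from the reference
print(count_differences(isolate_1, isolate_2))  # differences between the two isolates
```

Skipping gaps and ambiguous bases keeps low-coverage positions from being counted as mutations, which matters when genomes are only "largely complete".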
L4280 was the first sequenced patient without travel history and became infected after transporting a work colleague, L0826, to hospital. Patient L0826 reported symptom onset on 22 January, suggesting that community-based transmission started in the UAE in early-to-mid January. L6599 was an Indian expatriate living with three other Filipino and Sri Lankan expatriates (L3715, L2771, L8480) (Table 1). All four individuals had no documented recent travel history, suggesting local transmission, and although a full viral genome sequence could only be obtained from one patient (L6599), it is very likely that all have related isolates.

In aggregate, we identified 70 variants relative to the reference GenBank SARS-CoV-2 sequence NC_045512.2. The majority of these variants were missense (n = 41), with the most frequent nucleotide change being C > T (n = 33), and more than half (38/70) were localized in the ORF1ab gene (Supplementary Table S3). [...] out of the 70 variants were novel as they were not identified in the Chinese National Center for Bioinformation Database (https://bigd.big.ac.cn/ncov/variation/annotation; last accessed August 13, 2020). The novel variants were a coding missense variant and a synonymous variant in the N and ORF1ab genes, respectively. In addition, 9 variants were very rare (i.e. seen fewer than 4 times out of 81,625 genomes), including one missense variant (F850I) in the S gene (Supplementary Table S3).

Our findings suggest multiple independent spatiotemporal introductions of SARS-CoV-2 into the UAE, where the majority of introductions (76%) were from Iran and Europe during two different time frames (mid-late February and early March, respectively).
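The proportions quoted in this section can be re-derived from the stated counts. The short Python sketch below does so as a consistency check. The helper `share` and the dictionary names are hypothetical; the A3, B2 and unassigned counts are those given in the text, whereas the A2a count of 12 is an assumption inferred from the reported 48% of 25 genomes.

```python
def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


TOTAL_GENOMES = 25
# A3 (7), B2 (1) and unassigned (5) are stated in the text;
# A2a = 12 is an assumption inferred from the reported 48% of 25 genomes.
clade_counts = {"A2a": 12, "A3": 7, "B2": 1, "unassigned": 5}
assert sum(clade_counts.values()) == TOTAL_GENOMES

clade_shares = {clade: share(n, TOTAL_GENOMES) for clade, n in clade_counts.items()}
europe_iran = share(clade_counts["A2a"] + clade_counts["A3"], TOTAL_GENOMES)

TOTAL_VARIANTS = 70
variant_shares = {
    "missense": share(41, TOTAL_VARIANTS),  # missense variants
    "C>T": share(33, TOTAL_VARIANTS),       # most frequent nucleotide change
    "ORF1ab": share(38, TOTAL_VARIANTS),    # variants located in ORF1ab
}

print(clade_shares)
print(europe_iran)
print(variant_shares)
```

Under these counts the A2a and A3 clades together account for 19 of 25 genomes, which matches the 76% reported for introductions from Europe and Iran.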
Although we show evidence for possible local transmission within the Middle Eastern/Iranian isolates, it will be important to sequence further isolates at subsequent dates to determine whether these introductions succeeded in seeding more clustering and whether such clustering was affected by proactive and vigilant public health measures, such as transitioning to online learning for schools and universities, implementing work-from-home protocols across all sectors, and nationwide disinfection campaigns. Six isolates (24%) did not cluster with the European or Iranian groups and represented earlier introductions which did not appear to seed larger clusters in our sampled cohort. However, additional sequencing is needed to determine the extent of community transmission, especially given that our data strongly suggest that the earliest patient (early to mid-January) in the UAE could have been a secondary infection from one of those introductions. The new SARS-CoV-2 mutations identified in the UAE warrant further investigation to explore whether they influence viral characteristics, especially pathogenicity, or provide important information for vaccine development.

One of the major strengths of the study was the non-biased representative sample of early cases, including the index family cluster, in Dubai from the only central testing lab, along with detailed demographic and clinical information. Limitations included the inability to conduct full whole genome sequencing on more samples, most likely due to low viral load, although we were able to deduce the origin of transmission in most of those individuals based on travel history. Regardless, this study contributes important molecular epidemiological data that can be used to further understand the global transmission network of SARS-CoV-2 8.

Human subjects and ethics approval.
Sociodemographic and clinical data were extracted from the electronic medical records of the earliest 49 patients with laboratory-confirmed SARS-CoV-2 from 29 January to 18 March 2020 using the WHO case report form. Cases were categorized into three groups based on disease severity: asymptomatic and mild cases with either no symptoms or mild non-life-threatening symptoms (e.g. dry cough, mild fever); moderate cases with symptoms (e.g. breathlessness, persistent fever) requiring hospitalization and medical attention (e.g. supplementary oxygen therapy, intravenous fluids); and severe/critical cases with advanced disease and pneumonia requiring admission to intensive care units and specialized life-support treatment (e.g. mechanical ventilation). This study was approved by the Dubai Scientific Research Ethics Committee-Dubai Health Authority (approval number #DSREC-04/2020_02). The requirement for informed consent was waived as this study was part of a public health surveillance and outbreak investigation in the UAE. Nonetheless, all patients treated at a healthcare facility in the UAE provide written consent for their deidentified data to be used for research, and this study was performed in accordance with the relevant laws and regulations that govern research in the UAE.

Germany) was amplified using 26 overlapping primer sets covering most of the SARS-CoV-2 genome as recently described by our group 9. PCR products were then sheared by ultra-sonication (Covaris LE220-plus series, MA, USA) and prepared for sequencing using the SureSelectXT Library Preparation kit (Agilent, CA, USA). This library was sequenced using the Illumina MiSeq Micro Reagent Kit, V2 (2 × 150 cycles).

Fig. S1). Assembled genomes with at least 20X average coverage across most nucleotide positions (56-29,797) were used for subsequent phylogenetic analysis (Supplementary Table S1).
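The inclusion criterion above can be illustrated with a small Python sketch. It is one possible reading of the criterion, not the authors' implementation: it assumes per-position read depths are already available as a list, and it interprets "at least 20X average coverage" as the mean depth over positions 56-29,797. The function name `passes_coverage` and the example depth profiles are hypothetical.

```python
from statistics import mean

REGION_START = 56     # first position of the assessed window (1-based, from the text)
REGION_END = 29_797   # last position of the assessed window (1-based, inclusive)
MIN_MEAN_DEPTH = 20   # the 20X threshold from the text


def passes_coverage(depths, start=REGION_START, end=REGION_END, min_mean=MIN_MEAN_DEPTH):
    """Return True if the mean depth over [start, end] is at least min_mean.

    depths[i] is the read depth at genome position i + 1.
    """
    if len(depths) < end:
        return False  # the depth profile does not span the assessed window
    window = depths[start - 1:end]
    return mean(window) >= min_mean


# Hypothetical depth profiles for illustration only.
GENOME_LENGTH = 29_903  # length of the NC_045512.2 reference
well_covered = [150] * GENOME_LENGTH
poorly_covered = [5] * GENOME_LENGTH
truncated = [150] * 20_000

print(passes_coverage(well_covered))
print(passes_coverage(poorly_covered))
print(passes_coverage(truncated))
```

A mean over the window tolerates short low-depth stretches; a stricter reading would require a minimum depth at every position, which the text does not specify.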
A total of 25 viral genomes (24 by shotgun and 1 by target enrichment) met this inclusion criterion and were submitted to the Global Initiative on Sharing All Influenza Data (GISAID) database under accession IDs EPI_ISL_435119-435142 (Supplementary Table S2).

Phylogenetic analysis. We downloaded 157 global non-UAE sequences (Supplementary Table S2) with largely complete genomes (nucleotide positions 56-29,797) submitted to GISAID EpiCoV (https://www.epicov.org/) between December 2019 and 04 March 2020 7. All 182 sequences, including the 25 UAE sequences generated in this study, were analysed using Nextstrain 10, which consists of the Augur v6.4.3 pipeline for multiple sequence alignment (MAFFT v7.455 11) and phylogenetic tree construction (IQtree v1.6.12 12). Tree topology was assessed using the fast bootstrapping function with 1000 replicates. Tree visualization and annotations were performed in FigTree v1.4.4 13 for Fig. 1 and in the auspice v2.13.0 tool 10 for Fig. 2. SARS-CoV-2 clade annotations were performed in auspice v2.13.0 and cross-checked with nextstrain.org as of 12 May 2020.

All data generated or analysed during this study are included in this published article (and its Supplementary Information files) and the sequences are available on the GISAID database under the corresponding accession numbers.

Received: 25 June 2020; Accepted: 30 September 2020

1. Insights into the recent 2019 novel coronavirus (SARS-CoV-2) in light of past human coronavirus outbreaks. Pathogens 9
2. SARS-CoV-2/COVID-19: viral genomics, epidemiology, vaccines, and therapeutic interventions
3. Coronavirus disease 2019 (COVID-19) Situation Report-52
4. COVID-19 Dashboard by the Center for Systems Science and Engineering
5. An analysis of the health status of the United Arab Emirates: the "Big
6. A new coronavirus associated with human respiratory disease in China
7. Phylogenetic network analysis of SARS-CoV-2 genomes
8. Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region
9. SARS-CoV-2 whole genome amplification and sequencing for effective population-based surveillance and control of viral transmission
10. Nextstrain: real-time tracking of pathogen evolution
11. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform
12. Terrace aware data structure for phylogenomic inference from supermatrices

The authors declare no competing interests.

Supplementary information is available for this paper at https://doi.org/10.1038/s41598-020-74666-w.

Correspondence and requests for materials should be addressed to N.N. or A.A.-A.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.