key: cord-0870599-ok4ah1r7 authors: Batty, E. M.; Kochakarn, T.; Panthan, B.; Kumpornsin, K.; Jiaranai, P.; Wangwiwatsin, A.; Kotanan, N.; Jaruampornpan, P.; Watthanachockchai, T.; Rakmanee, K.; Sensorn, I.; Sungkanuparph, S.; Pasomsub, E.; Chookajorn, T.; Chantratita, W. title: Genomic surveillance of SARS-CoV-2 in Thailand reveals mixed imported populations, a local lineage expansion and a virus with truncated ORF7a date: 2020-05-25 journal: nan DOI: 10.1101/2020.05.22.20108498 sha: 96c7e6a485fa5c013cb74488b3bde281b5874a7b doc_id: 870599 cord_uid: ok4ah1r7
Coronavirus Disease 2019 (COVID-19) is a global public health threat. Genomic surveillance of SARS-CoV-2 was implemented during March 2020 at a major diagnostic hub in Bangkok, Thailand. Several virus lineages that presumably originated in different countries were found, and a Thai-specific lineage, designated A/Thai-1, has expanded to become predominant in Thailand. A virus sample in the SARS-CoV-2 A/Thai-1 lineage contains a frame-shift deletion in ORF7a, which encodes a putative host-antagonizing factor of the virus.
Coronavirus Disease 2019 has reached the status of a global pandemic. Genomic surveillance of its etiological virus, SARS-CoV-2, plays an important role in epidemiological investigations and transmission control strategies [1]. Genetic variation data of the virus could reveal transmission chains between infected individuals and could even map the connection between outbreak cohorts. Thailand has suffered from the spread of COVID-19, with more than 3,000 confirmed cases and more than 120,000 individuals screened as of May 2020. Since January 2020, when both imported and locally transmitted COVID-19 cases were reported in Thailand, the country has implemented several measures to combat COVID-19 at a national scale [2] [3] [4]. Genomic surveillance could be a powerful tool in the implementation of the national COVID-19 control strategy in Thailand.
ARTIC multiplex tiling PCR enables whole-genome sequencing from minuscule amounts of material by generating overlapping amplicons that span the genome, an approach that proved successful during the Zika virus outbreak investigation in Brazil [5, 6]. Using leftover RNA samples from standard RT-PCR diagnosis, the genomic information of SARS-CoV-2 can be decoded in less than a week. The data presented here provide insight into the genetic repertoire, origins and viral lineages of SARS-CoV-2 in Thailand. The information is particularly important given the multiple introduction events into the country and the local expansion of the Thai-specific SARS-CoV-2 lineages.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted May 25, 2020.
We sequenced 27 anonymized RT-qPCR-positive samples from Ramathibodi Hospital in Bangkok during March 13-28, 2020 (Supplementary Table 1) [EC approval number: MURA2020/676]. The hospital acted as one of the major diagnostic hubs for COVID-19 in Bangkok during the study period. Enrichment and amplification steps were done according to the ARTIC Network protocol with ARTIC primer version 2 [7]. The libraries were prepared using KAPA HyperPrep and KAPA Library Amplification kits and subsequently sequenced with a MiSeq Reagent Kit v2 according to the manufacturers' protocols. Variant calling was performed using the ncov2019-artic-nf pipeline (https://github.com/connorlab/ncov2019-artic-nf). Consensus sequences were used to construct maximum-likelihood and Bayesian phylogenetic trees with recommended representatives from various lineages worldwide, using IQ-TREE 2.0 and BEAST v1.10.4, respectively (Supplementary Table 2) [8] [9] [10].
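The overlapping-amplicon idea behind tiling PCR can be illustrated with a short sketch. This is not the ARTIC primer scheme itself: the genome length (that of the Wuhan-Hu-1 reference), the nominal amplicon size, the overlap and the function names `tile_amplicons` and `split_into_pools` are all assumptions chosen for illustration. The sketch lays out fixed-size windows so that each window overlaps its neighbour, then alternates them between two pools so that no two amplicons in the same pool overlap.

```python
"""Toy layout of overlapping amplicons (illustrative only, not the ARTIC scheme)."""

GENOME_LENGTH = 29_903  # assumption: length of the Wuhan-Hu-1 reference genome
AMPLICON_LENGTH = 400   # assumption: nominal amplicon size in nucleotides
OVERLAP = 75            # assumption: illustrative overlap between neighbours


def tile_amplicons(genome_length, amplicon_length=AMPLICON_LENGTH, overlap=OVERLAP):
    """Return 0-based, half-open (start, end) windows that tile the genome."""
    if genome_length <= 0 or amplicon_length <= 0:
        raise ValueError("genome_length and amplicon_length must be positive")
    if not 0 <= 2 * overlap <= amplicon_length:
        raise ValueError("overlap must be between 0 and half the amplicon length")
    step = amplicon_length - overlap
    amplicons = []
    start = 0
    while True:
        # The final window is truncated at the end of the genome.
        end = min(start + amplicon_length, genome_length)
        amplicons.append((start, end))
        if end == genome_length:
            break
        start += step
    return amplicons


def split_into_pools(amplicons):
    """Alternate amplicons between two pools so neighbours never share a pool."""
    return amplicons[0::2], amplicons[1::2]


if __name__ == "__main__":
    amplicons = tile_amplicons(GENOME_LENGTH)
    pool_1, pool_2 = split_into_pools(amplicons)
    print(len(amplicons), "amplicons;", len(pool_1), "in pool 1,", len(pool_2), "in pool 2")
```

Capping the overlap at half the amplicon length is what guarantees that amplicons two positions apart, which end up in the same pool, do not overlap each other. A real scheme places primers according to sequence constraints, so actual amplicon boundaries are irregular.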
Interestingly, Thailand appears to have had multiple introduction events of SARS-CoV-2 into the country, as evidenced by at least six separate clusters in the maximum- [9]. This lineage, designated A/Thai-1 (Figure 1 and 2), descended from the original A lineage (based on the maximum-likelihood-based classification system), which was first reported in China before expanding into various countries in Asia, Europe, North America, South America and Australia [9]. This A/Thai-1 branch is separated from the rest of the original A lineage and its subgroups. Upon visual inspection in Nextstrain, a single Malaysian sample (MKAK-CL-2020-5096) is the closest to A/Thai-1, but with only a 63% bootstrap value and one shared lineage-specific nucleotide substitution (4,390G→U) (Supplementary
When the analysis was extended to 22 additional genomes, independently deposited in GISAID by the Thai National Influenza Center and the Thai National Institute of Health, the samples collected in January 2020 grouped closely with the B lineage from China, including the Wuhan-Hu-1 reference (Supplementary Table 3). The genetic repertoires from this additional collection also support the notion of multiple virus lineages introduced into Thailand. Nine of them also fall into A/Thai-1, making it the largest lineage in Thailand during March 2020 (22/49 genomes).
Genomic surveillance is likely to be pivotal in the identification and elimination of transmission cohorts and chains [17, 18].
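The A/Thai-1 share of 22/49 genomes can be retraced from the sample counts given in the text. The sketch below assumes that the 49 genomes are the 27 sequenced in this study plus the 22 additional GISAID genomes; under that assumption, the number of A/Thai-1 genomes among the 27 study samples is implied by subtraction rather than stated directly. The helper name `lineage_share` is hypothetical.

```python
from fractions import Fraction


def lineage_share(in_lineage, total):
    """Return the fraction of genomes assigned to a lineage."""
    if total <= 0 or not 0 <= in_lineage <= total:
        raise ValueError("counts must satisfy 0 <= in_lineage <= total and total > 0")
    return Fraction(in_lineage, total)


# Counts stated in the text.
STUDY_GENOMES = 27           # samples sequenced in this study
ADDITIONAL_GENOMES = 22      # genomes deposited independently in GISAID
ADDITIONAL_IN_A_THAI_1 = 9   # additional genomes that fall into A/Thai-1
TOTAL_IN_A_THAI_1 = 22       # A/Thai-1 genomes overall

# Assumption: the denominator 49 is the sum of the two collections.
total_genomes = STUDY_GENOMES + ADDITIONAL_GENOMES
# Implied by subtraction, not stated in the text.
study_in_a_thai_1 = TOTAL_IN_A_THAI_1 - ADDITIONAL_IN_A_THAI_1

share = lineage_share(TOTAL_IN_A_THAI_1, total_genomes)
print(f"A/Thai-1: {share.numerator}/{share.denominator} genomes = {float(share):.1%}")
# prints "A/Thai-1: 22/49 genomes = 44.9%"
```

Under these assumptions, 13 of the 27 genomes sequenced in this study would belong to A/Thai-1, and the lineage accounts for just under half of all genomes considered.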
The genetic composition presented here suggests the necessity of screening and monitoring international travelers during the COVID-19 pandemic. The local expansion of A/Thai-1 has created a new evolutionary branch unique to Thailand, so this lineage must be investigated for its compatibility with the diagnostic and vaccine tools under development.
The work here was supported by Ramathibodi Foundation, TCELS, NRCT and Mahidol University. The authors acknowledge NSTDA Supercomputer Center (ThaiSC) for providing computing resources for this work. We are grateful for the comments and suggestions from P.
Hadfield, J., et al.
Supplementary Figure 1. Maximum-likelihood tree used for the cladogram depiction in Figure 1. The labels and colors are similar to those in Figure 1.
Figure 2. Nextstrain Timetree presenting the grouping of the Thai SARS-CoV-2 B1 samples with the virus genomes from other countries during the same period. The data was retrieved on 6 May 2020.
Supplementary Figure 3. Bayesian phylogenetic tree of SARS-CoV-2. Tree plotting was done with the same sample sets used for the maximum-likelihood tree.
Supplementary Figure 4. Nextstrain Timetree representing the A/Thai-1 lineage using the data from GISAID. The Malaysian virus genome MKAK-CL-2020-5096 is the closest one in the tree.
The data was retrieved on 6 May 2020.
Figure 5. Sanger sequencing result confirming the ORF7a deletion in BKK-0018. A sequencing chromatogram from the DNA regions (upper panel) corresponding to the deletion site (red arrow) is shown. The 4-nt deletion site marked in the red box causes a premature stop codon.
[1] Spread of SARS-CoV-2 in the Icelandic Population
[2] Early transmission patterns of coronavirus disease 2019 (COVID-19) in travellers from Wuhan to Thailand
[3] Journey of a Thai Taxi Driver and Novel Coronavirus
[4] A self-assessment of the Thai Department of Disease Control's communication for international response at early phase to the COVID-19
[5] Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples
[6] Genomic and Epidemiological Surveillance of Zika Virus in the Amazon Region
[7] nCoV-2019 sequencing protocol
[8] Bayesian phylogenetics with BEAUti and the BEAST 1.7
[9] A dynamic nomenclature for SARS-CoV-2 to assist genomic epidemiology
[10] IQ-TREE 2: New Models and Efficient Methods for Phylogenetic