key: cord-0818265-0tndaux2
authors: Joonlasak, Khajohn; Batty, Elizabeth M; Kochakarn, Theerarat; Panthan, Bhakbhoom; Kümpornsin, Krittikorn; Jiaranai, Poramate; Wangwiwatsin, Arporn; Huang, Angkana; Kotanan, Namfon; Jaru-Ampornpan, Peera; Manasatienkij, Wudtichai; Watthanachockchai, Treewat; Rakmanee, Kingkan; Jones, Anthony R.; Fernandez, Stefan; Sensorn, Insee; Sungkanuparph, Somnuek; Pasomsub, Ekawat; Klungthong, Chonticha; Chookajorn, Thanat; Chantratita, Wasun
title: Genomic surveillance of SARS-CoV-2 in Thailand reveals mixed imported populations, a local lineage expansion and a virus with truncated ORF7a
date: 2020-11-21
journal: Virus Res
DOI: 10.1016/j.virusres.2020.198233
sha: 513e320aefb46bdeb3eeee4b6007656d51176b7e
doc_id: 818265
cord_uid: 0tndaux2

Coronavirus Disease 2019 (COVID-19) is a global public health threat. Genomic surveillance of SARS-CoV-2 was implemented in March 2020 at a major diagnostic hub in Bangkok, Thailand. Several virus lineages, presumed to have originated in many different countries, were found, and a Thai-specific lineage, designated A/Thai-1, has expanded to become predominant in Thailand. A virus sample in the SARS-CoV-2 A/Thai-1 lineage contains a frame-shift deletion in ORF7a, which encodes a putative host-antagonizing factor of the virus.
Coronavirus Disease 2019 (COVID-19) has reached the status of a global pandemic. Genomic surveillance of its etiological virus, SARS-CoV-2, plays an important role in epidemiological investigations and transmission control strategies [1]. Genetic variation data from the virus could reveal transmission chains between infected individuals and could even map the connections between outbreak cohorts.

Thailand has suffered from the spread of COVID-19, with more than 3,000 confirmed cases and more than 120,000 individuals screened as of May 2020. Since January 2020, when both imported and locally transmitted COVID-19 cases were reported in Thailand, the country has implemented several measures to combat COVID-19 on a national scale [2, 3, 4]. Genomic surveillance could be a powerful tool in the implementation of the national COVID-19 control strategy in Thailand.

ARTIC multiplex tiling PCR allows whole-genome sequencing from minuscule amounts of material by generating overlapping amplicons that span the genome, an approach that proved successful during the Zika virus outbreak investigation in Brazil [5, 6]. Using leftover RNA samples from a standard RT-PCR diagnosis, the genomic information of SARS-CoV-2 can be decoded in less than a week. The data presented here provide an insight into the genetic repertoire, origins and viral lineages of SARS-CoV-2 in Thailand. The information is particularly important given the multiple introduction events into the country and the local expansion of the Thai-specific SARS-CoV-2 lineages.

We sequenced 27 anonymized RT-qPCR-positive virus transport media samples containing nasopharyngeal/oropharyngeal swabs from Ramathibodi Hospital in Bangkok from March 13, 2020 Table 1) [13, 14]. The relationship and the origin of these lineages were described in [11].
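The tiling idea behind the multiplex PCR approach mentioned above can be illustrated with a minimal sketch. It is not the ARTIC primer scheme itself: the amplicon size (400 nt), the overlap (75 nt) and the function name `tile_amplicons` are illustrative assumptions, and real schemes place amplicons according to primer design constraints rather than at fixed steps. The sketch only shows how overlapping windows cover a genome and how neighbouring amplicons can be assigned to two alternating pools so that overlapping products are not amplified in the same reaction.

```python
def tile_amplicons(genome_length, amplicon_size=400, overlap=75):
    """Return (start, end, pool) tuples (1-based, inclusive) tiling a genome.

    amplicon_size and overlap are hypothetical values chosen for illustration;
    they are not taken from the article or from any published primer scheme.
    """
    if genome_length < 1:
        raise ValueError("genome_length must be positive")
    if not 0 <= overlap < amplicon_size:
        raise ValueError("overlap must be non-negative and smaller than amplicon_size")

    step = amplicon_size - overlap  # distance between consecutive amplicon starts
    amplicons = []
    start = 1
    index = 0
    while True:
        end = min(start + amplicon_size - 1, genome_length)
        pool = 1 if index % 2 == 0 else 2  # alternate pools for neighbouring amplicons
        amplicons.append((start, end, pool))
        if end == genome_length:
            break
        start += step
        index += 1
    return amplicons


if __name__ == "__main__":
    # 29,903 nt is the length of the Wuhan-Hu-1 reference genome (NC_045512.2).
    scheme = tile_amplicons(29903)
    print(len(scheme), "amplicons; first:", scheme[0], "last:", scheme[-1])
```

Because every window overlaps its neighbour, a dropout of a single amplicon leaves a gap shorter than one full amplicon, which is one reason tiling schemes tolerate degraded or low-input RNA.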
Considering the origins and lineage branches, these [13]. This lineage, designated A/Thai-1 (Figure 1 and 2 Table 1). Among the changes, the 20,134G→U mutation has been independently found in two samples in lineage B.1, from the Netherlands and the USA. It remains to be determined with a larger sample size whether this is the result of convergent evolution or genetic recombination. This pattern of homoplasy has also been hypothetically linked to putative RNA editing [15].

Thailand/Bangkok-0018, a sample in the A/Thai-1 lineage, contains a 4-nt frame-shift deletion at positions 27,694-27,697, causing a premature truncation of ORF7a, which now contains five altered amino acid residues and loses the 16 original C-terminal residues (Figure 3). The deletion was confirmed by Sanger sequencing twice, using two independent RT-PCR reactions (Supplementary Figure 5). The frame-shift mutation alters approximately one-sixth (21/121 residues) of the ORF7a protein. Based on protein homology to SARS-CoV, the missing region corresponds to a transmembrane helix and an ER retrieval motif, required for antagonizing a host antiviral factor [16, 17]. One sample from Arizona, USA also contains an 81-nt in-frame deletion in the ORF7a gene [18]. So far, only one sample in A/Thai-1 appears to have this frame-shift deletion. It is tempting to speculate on the relationship between ORF7a Table 3).

The genetic repertoires from this additional collection also support the notion of multiple virus lineages introduced into Thailand. A/Thai-1 was the largest lineage in Thailand during March 2020, with a total of 22 virus samples assigned to A/Thai-1 (12 from our sequencing work and 10 from other independent genomic sequencing projects) out of the 49 genomes available. Genomic surveillance is likely to be pivotal in the identification and the elimination of transmission cohorts and chains [19, 20].
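The arithmetic behind the ORF7a frame-shift described above can be checked with a short sketch. The ORF7a coordinates used here (27,394 to 27,759, including the stop codon) are an assumption taken from the annotation of the Wuhan-Hu-1 reference genome (NC_045512.2) and are not stated in this article; the function names are hypothetical. The sketch only counts residues at or downstream of the deletion; it does not translate sequence, so it cannot say which of those residues are altered and which are lost.

```python
# Assumption: ORF7a coordinates from the NC_045512.2 annotation (1-based,
# inclusive, stop codon included). These values are not given in the article.
ORF7A_START = 27394
ORF7A_END = 27759


def is_frameshift(deleted_nt):
    """A deletion shifts the reading frame unless its length is a multiple of 3."""
    return deleted_nt % 3 != 0


def deletion_effect(del_start, del_end, orf_start=ORF7A_START, orf_end=ORF7A_END):
    """Summarise where a deletion falls within an ORF (1-based, inclusive)."""
    if not (orf_start <= del_start <= del_end <= orf_end):
        raise ValueError("deletion must lie within the ORF")
    deleted_nt = del_end - del_start + 1
    protein_length = (orf_end - orf_start + 1) // 3 - 1  # residues, stop codon excluded
    first_affected = (del_start - orf_start) // 3 + 1    # 1-based codon hit first
    return {
        "deleted_nt": deleted_nt,
        "frameshift": is_frameshift(deleted_nt),
        "protein_length": protein_length,
        "first_affected_residue": first_affected,
        "residues_affected": protein_length - first_affected + 1,
    }


if __name__ == "__main__":
    print(deletion_effect(27694, 27697))  # the 4-nt deletion in Thailand/Bangkok-0018
    print(is_frameshift(81))              # the 81-nt deletion reported from Arizona
```

Under these assumed coordinates the 4-nt deletion begins at codon 101 of a 121-residue protein, so 21 residues lie at or beyond the deletion, consistent with the 21/121 figure (five altered plus 16 lost) quoted above. The 81-nt deletion, by contrast, is a multiple of three and removes 27 codons without shifting the frame.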
The genetic composition presented here suggests the necessity of screening and monitoring international travelers during the COVID-19 pandemic. The local expansion of A/Thai-1 strongly indicates a series of local transmission events, allowing an evolutionary branch unique to Thailand to emerge. This lineage needs to be investigated further for its compatibility with the diagnostic and vaccine tools under development.

Writing - Review & Editing
Thanat Chookajorn: Formal analysis, Supervision, Project administration, Writing - Original Draft, Funding acquisition, Writing - Review & Editing
Wasun Chantratita: Formal analysis, Supervision

[1] Spread of SARS-CoV-2 in the Icelandic Population
[2] Early transmission patterns of coronavirus disease 2019 (COVID-19) in travellers from Wuhan to Thailand
[3] Journey of a Thai Taxi Driver and Novel Coronavirus
[4] A self-assessment of the Thai Department of Disease Control's communication for international response at early phase to the COVID-19
[5] Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples
[6] Genomic and Epidemiological Surveillance of Zika Virus in the Amazon Region
[7] nCoV-2019 sequencing protocol
[8] An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar
[9] Bayesian phylogenetics with BEAUti and the BEAST 1.7
[10] IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era
[11] A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
[12] Nextstrain: real-time tracking of pathogen evolution
[13] A dynamic nomenclature for SARS-CoV-2 to assist genomic epidemiology
[14] Phylogenetic Assignment of Named Global Outbreak LINeages
[15] Rampant C→U Hypermutation in the Genomes of SARS-CoV-2 and Other Coronaviruses: Causes and Consequences for Their Short- and Long-Term Evolutionary Trajectories. mSphere
[16] Severe Acute Respiratory Syndrome Coronavirus ORF7a Inhibits Bone Marrow Stromal Antigen 2 Virion Tethering through a Novel Mechanism of Glycosylation Interference
[17] Characterization of a unique group-specific protein (U122) of the severe acute respiratory syndrome coronavirus
[18] An 81 nucleotide deletion in SARS-CoV-2 ORF7a identified from sentinel surveillance in Arizona
[19] The resistome and genomic reconnaissance in the age of malaria elimination
[20] Towards a genomics-informed, real-time, global pathogen surveillance system

None.