key: cord-0811522-83q9u6nk
authors: Ishikawa, Fumihiro; Udaka, Yuko; Oyamada, Hideto; Ishino, Keiko; Tokimatsu, Issei; Sagara, Hironori; Kiuchi, Yuji
title: Genetic epidemiology using whole genome sequencing and haplotype networks revealed the linkage of SARS-CoV-2 infection in nosocomial outbreak
date: 2021-11-24
journal: Infect Prev Pract
DOI: 10.1016/j.infpip.2021.100190
sha: 5202cb1f37e4a9ff130fae8cf1bdd548ac054e96
doc_id: 811522
cord_uid: 83q9u6nk

BACKGROUND: A characteristic feature of SARS-CoV-2 is its ability to transmit from pre- or asymptomatic patients, complicating the tracing of infection pathways and causing outbreaks. Despite several reports that whole genome sequencing (WGS) and haplotype networks are useful for epidemiologic analysis, little is known about their use in nosocomial infections.
AIM: We aimed to demonstrate the advantages of genetic epidemiology in identifying the link in nosocomial infection by comparing single nucleotide variations (SNVs) of isolates from patients associated with an outbreak in Showa University Hospital.
METHODS: We used specimens from 32 patients in whom COVID-19 had been diagnosed using clinical reverse transcription-polymerase chain reaction tests. RNA of SARS-CoV-2 from specimens was reverse-transcribed and analysed using WGS. SNVs were extracted and used for lineage determination, phylogenetic tree analysis, and median-joining analysis.
FINDINGS: The lineage of SARS-CoV-2 that was associated with the outbreak in Showa University Hospital was B.1.1.214, which was consistent with that found in the Kanto metropolitan area during the same period. Consistent with canonical epidemiological observations, haplotype network analysis was successful for the classification of patients. Additionally, phylogenetic tree analysis revealed three independent introductions of the virus into the hospital during the outbreak.
Further, median-joining analysis indicated that four patients had been directly infected by one of the others in the same cluster.
CONCLUSION: Genetic epidemiology with WGS and haplotype networks is useful for tracing transmission and optimizing prevention strategies in nosocomial outbreaks.

Coronavirus disease 2019 (COVID-19) originated in Wuhan, China, in late December 2019, rapidly spread worldwide, and was officially declared a pandemic by the World Health Organization on 11th March 2020 [1]. Patients can infect other individuals before symptom onset or even without developing any apparent symptoms [2, 3]. This pre- and asymptomatic disease transmission makes it difficult to identify the origin of infection and prevent further spread, leading to disease outbreaks. COVID-19 is caused by SARS-CoV-2, a single-stranded RNA virus that rapidly accumulates mutations in its genome during replication [4]. Taking advantage of the high frequency of mutations, haplotypes with single nucleotide variations (SNVs) have been utilized for epidemiological analysis, supported by whole genome sequencing (WGS). To date, many studies have reported the advantage of haplotype network analysis with WGS-SNVs in understanding the

This study was conducted using specimens from 32 patients in whom COVID-19 had been diagnosed by clinical reverse transcription-polymerase chain reaction (RT-PCR) test at Showa University Hospital (SUH) between 1st July 2020 and 31st January 2021. The participants comprised 11 healthcare workers and 21 patients, designated SUH001 to SUH032. This study did not include personal information that could identify individuals. The study protocol was approved by the ethics committee of Showa University School of Medicine (approval number: 3302).
Nasopharyngeal swabs were collected from suspected COVID-19 patients and tested by quantitative reverse transcription polymerase chain reaction (RT-qPCR) using a SARS-CoV-2 detection kit (TOYOBO, Osaka, Japan), according to the manufacturer's instructions.

RNA extraction and whole genome sequencing

Wuhan-Hu-1 were defined as the core region and used for further analysis. The mixed bases in the core region were resolved by manually counting the sequence reads. A phylogenetic tree analysis with SNVs was performed using the neighbour-joining (NJ) method and MAFFT version 7 [11], followed by visualization with iTOL version 6 [12]. The median-joining network analysis with SNVs was performed using PopART software [13]. PANGO lineages of isolates were determined using the Phylogenetic Assignment of Named Global Outbreak Lineages (Pangolin) web application (https://pangolin.cog-uk.io/) [14]. A bubble chart of the PANGO lineages was created using JMP Pro version 15 (SAS Institute Inc., Cary, NC, USA).

To clarify the linkage of SARS-CoV-2 infection in the SUH outbreak, we used haplotype network analysis. The SUH nosocomial outbreak consisted of nearly 40 patients in four wards between mid-December 2020 and late January 2021, and 17 of the outbreak isolates (SUH013 to SUH022, SUH024 to SUH026, SUH028, and SUH030 to SUH032) were included in this study. To assess the haplotype network analysis, we included the following isolates presumed to be unrelated to the nosocomial outbreak that we had experienced between mid-

To investigate the evolutionary connection among isolates, we performed a phylogenetic tree analysis, which found four clades that contained multiple isolates ( Figure

These data suggest that there were at least three introductions of SARS-CoV-2 into SUH during the period of the outbreak.

In nosocomial infections, identification of the infection link in a cluster is extremely important to design preventive strategies against subsequent spread.
Although phylogenetic tree analysis is useful for identifying groups of isolates with similar genomic sequences as clades, it cannot easily reveal the parent-child relationship between isolates. Median-joining network analysis has shown excellent results in this regard because it can graphically express differences of a single nucleotide in the genome [6, 15]. Therefore, to gain a deeper insight, we performed a median-joining network analysis using SNVs of the SUH isolates (Figure 3). In cluster 1 (n = 10), six isolates had identical genome sequences (Figure 3 showed an additional three SNVs in cluster 2 (n = 6) as phylogenetic tree analysis. As expected, all isolates from patients with a close contact history (n = 3) were identical (Figure 3, vertex C).

In this study, we evaluated the advantages of using WGS and haplotype network analysis to identify the infection link in nosocomial outbreaks. First, we examined the SARS-CoV-2 lineage related to the SUH outbreak and determined it to be B.1.1.214. To date, multiple variants of concern (VOCs) that have a higher effective reproduction number and an increased risk of mortality have been spreading worldwide as pandemic strains [16, 17]. We did not find any VOCs, such as the alpha variant B.1.1.7. Currently, the delta variant B.1.617.2 and its subvariants have been expanding in many countries, threatening healthcare systems by rapidly increasing the number of symptomatic patients. Because the introduction of VOCs into a hospital could easily cause an outbreak, identifying COVID-19-positive patients and their viral strains is necessary for quick and decisive actions, such as single-patient room management, to save lives. WGS by next-generation sequencing (NGS) is essential to determine the lineage, although individual mutations of the SARS-CoV-2 genome can be detected by single nucleotide polymorphism genotyping [18].
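The comparison of isolates by SNVs that underlies the haplotype analysis above can be illustrated with a minimal Python sketch. It assumes the genomes are already aligned to the same length, groups isolates with identical sequences into haplotypes, and counts SNV differences between haplotypes. It is not the median-joining algorithm implemented in PopART, only the distance calculation such networks build on; the isolate names and toy sequences are hypothetical and are not study data.

```python
from collections import defaultdict
from itertools import combinations


def group_haplotypes(aligned):
    """Group isolate IDs by identical aligned sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for isolate, seq in aligned.items():
        groups[seq].append(isolate)
    return dict(groups)


def snv_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Assumption: ambiguous bases (e.g. N) are not treated specially here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def one_step_links(haplotypes):
    """Return pairs of haplotype sequences that differ by exactly one SNV."""
    return [
        (a, b)
        for a, b in combinations(sorted(haplotypes), 2)
        if snv_distance(a, b) == 1
    ]


# Hypothetical toy alignment (not from the study).
toy = {
    "iso_A": "ACGTACGT",
    "iso_B": "ACGTACGT",  # identical to iso_A, so same haplotype
    "iso_C": "ACGTACGA",  # one SNV away from iso_A
    "iso_D": "TCGTACGA",  # one SNV away from iso_C, two from iso_A
}

haplotypes = group_haplotypes(toy)
links = one_step_links(haplotypes)

for seq, isolates in haplotypes.items():
    print(seq, sorted(isolates))
print("haplotype pairs one SNV apart:", len(links))
```

In this toy case, isolates sharing a sequence collapse into a single vertex, and pairs one SNV apart are the candidate edges that a network method would then evaluate.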
Our results showed that phylogenetic tree analysis could clearly distinguish between patients infected in the hospital and those infected in the community as well as classify patients with a close contact history as a single clade. This agrees with findings of a previous study [8], which also indicated that phylogenetic tree analysis was successful in discriminating between COVID-19 patients on the basis of the genomic sequence of isolates during the outbreak. Surprisingly, our data showed that three separate introductions of SARS-CoV-2 into the hospital during the same period had each contributed to the outbreak. As only patients who had tested negative for SARS-CoV-2 by RT-qPCR several days before admission were in the hospital, these introductions may have been due to healthcare workers or patients infected with a viral load below the detection limit. Additionally, one patient in cluster 1 was hospitalized on a different floor and had no direct contact with other patients in cluster 1, suggesting transmission either through healthcare workers or in shared spaces, such as an elevator. This result raises the possibility that even adherence to standard precautions by healthcare workers when providing patient care, together with disinfection of shared spaces, was insufficient to prevent transmission. Therefore, healthcare workers must pay more attention to adequate precautions. Notably, we showed that median-joining network analysis with SNVs could indicate the direction of transmission. It should be noted that while we could identify the individuals involved in the dissemination of this virus using this method, it is difficult to estimate the direction when the SNVs are identical. To resolve this issue, applying canonical epidemiological observations such as the date of symptom onset may be helpful.
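The reasoning above, in which accumulated SNVs suggest a direction and symptom onset dates serve as a fallback when SNVs are identical, can be sketched as a simple heuristic. This is an illustrative assumption rather than the procedure used in the study: an isolate whose SNV set is a strict subset of another's is treated as the possible source, and identical SNV sets are ordered by onset date when both dates are known. The function name `infer_direction` and the SNV labels are hypothetical.

```python
from datetime import date


def infer_direction(snvs_a, snvs_b, onset_a=None, onset_b=None):
    """Return 'a->b', 'b->a', or 'unresolved' for two isolates.

    Heuristic (an assumption for illustration):
    - if one SNV set is a strict subset of the other, the isolate with
      fewer SNVs is the tentative source;
    - if the sets are identical, fall back on symptom onset dates;
    - otherwise the direction is left unresolved.
    """
    snvs_a, snvs_b = frozenset(snvs_a), frozenset(snvs_b)
    if snvs_a < snvs_b:
        return "a->b"
    if snvs_b < snvs_a:
        return "b->a"
    if snvs_a == snvs_b and onset_a and onset_b and onset_a != onset_b:
        return "a->b" if onset_a < onset_b else "b->a"
    return "unresolved"


# Hypothetical SNV labels and dates (not from the study).
print(infer_direction({"A100G"}, {"A100G", "C200T"}))
print(infer_direction({"A100G"}, {"A100G"}, date(2021, 1, 5), date(2021, 1, 3)))
print(infer_direction({"A100G"}, {"A100G"}))
```

Any direction suggested this way is only a hypothesis. Onset dates are an imperfect proxy for the time of infection, particularly given pre- and asymptomatic transmission, so the output would need to be checked against other epidemiological information.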
Identification of individuals responsible for the transmission would clarify how the transmission occurred by intensively investigating their recent activities, leading to improvement of infection control. Meanwhile, as Johnson and Parker pointed out, information on transmission during the outbreak could have potentially harmful consequences when the source of the outbreak is identified [19]. These consequences could include psychological distress and the attribution of responsibility for transmission to individuals. Therefore, the sharing and use of data should also be based on ethical considerations.

The main limitations of this study include its retrospective nature and the lack of evaluation of haplotype network analysis using WGS-SNVs for infection control. Rapid onsite WGS during an outbreak and adequate intervention to prevent further spread are necessary for evaluation of this strategy. However, we could not accomplish this due to our lack of NGS equipment and consignment of WGS operation to an external institution. Because it takes several months at the earliest to receive sequences from the external institution, intervention during an outbreak would be difficult. Whether the interventions implemented to block the infection pathways identified by haplotype networks are successful for infection control should be addressed by future studies performed in hospitals that have NGS equipment.

In the past two decades, three types of coronavirus have emerged and caused outbreaks in many countries [20]. This indicates the possibility that another coronavirus associated with outbreaks could emerge in the near future. In such an event, it might be important to perform in-hospital WGS to promptly end a nosocomial outbreak.

Tables

Table I SNVs

Nine lineages were detected in 596 isolates from the Kanto metropolitan area (KMA), including the SUH (n = 32) samples, during the evaluation period, as visualized with a bubble plot.
The red and blue bubbles represent the isolates from KMA and SUH, respectively. The size of the bubbles is proportional to the number of detected isolates.

Figure 2. Phylogenetic tree analysis of SARS-CoV-2 genome from SUH. A phylogenetic tree was created using the genomes of the 32 isolates from SUH with the NJ method, as described in the Methods section.

A new coronavirus associated with human respiratory disease in China
Presumed asymptomatic carrier transmission of COVID-19
Estimating the extent of asymptomatic COVID-19 and its potential for community transmission: systematic review and meta-analysis
The architecture of SARS-CoV-2
Phylogenetic network analysis of SARS-CoV-2 genomes
Haplotype networks of SARS-CoV-2 infections in the Diamond Princess cruise ship outbreak
A genome epidemiological study of SARS-CoV-2 introduction into Japan. mSphere 2020
Clinical utility of SARS-CoV-2 whole genome sequencing in deciphering source of infection
Interactive Tree Of Life (iTOL) v4: recent updates and new developments
full-feature software for haplotype network construction
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Median-joining networks for inferring intraspecific phylogenies
Risk of mortality in patients infected with SARS-CoV-2 variant of concern 202012/1: matched cohort study
Public Health Scotland and the EAVE II Collaborators. SARS-CoV-2 Delta VOC in Scotland: demographics, risk of hospital admission, and vaccine effectiveness
Detecting SARS-CoV-2 variants with SNP genotyping
The ethics of sequencing infectious disease pathogens for clinical and public health
Three emerging coronaviruses in two decades

We thank all the patients and medical staff who have participated in this study, the supporting staff of the PCR Centre for their contribution to the clinical RT-PCR testing, TAKARA Bio (Shiga, Japan) for NGS operation, Editage (www.editage.com) for English language editing, and all authors who have kindly deposited genome data used in this study on the GISAID database.

The authors do not have any conflicts of interest to declare.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.