key: cord-0863473-4n6xdoo7 authors: Kawano-Sugaya, Tetsuro; Yatsu, Koji; Sekizuka, Tsuyoshi; Itokawa, Kentaro; Hashino, Masanori; Tanaka, Rina; Kuroda, Makoto title: Haplotype Explorer: an infection cluster visualization tool for spatiotemporal dissection of the COVID-19 pandemic date: 2020-07-20 journal: bioRxiv DOI: 10.1101/2020.07.19.179101 sha: e486724cc0bab986c5da89cf6e01f7798cfdead6 doc_id: 863473 cord_uid: 4n6xdoo7
The worldwide outbreak of COVID-19, which began in Wuhan, China in late 2019, reached 10 million cases by late June 2020. To understand the epidemiological landscape of the COVID-19 pandemic, many studies have attempted to elucidate phylogenetic relationships between collected viral genome sequences using haplotype networks. However, currently available applications for network visualization are not suited to understanding the COVID-19 epidemic spatiotemporally because of functional limitations. This motivated us to develop Haplotype Explorer, an intuitive tool for visualizing and exploring haplotype networks. Haplotype Explorer lets users dissect epidemiological consequences through interactive node filters, which provide spatiotemporal perspectives on the multimodal spectra of infectious diseases, including introduction, outbreak, expansion, and containment, for given regions and time spans. Here, we demonstrate the effectiveness of Haplotype Explorer by showing an example of its visualization and features. The demo using SARS-CoV-2 genome sequences is available at https://github.com/TKSjp/HaplotypeExplorer
Summary
Many software packages for network visualization are available, but existing software has not been optimized for visualizing infection clusters in the worldwide spread of COVID-19 that began in 2019. To gain a spatiotemporal understanding of the epidemic, we developed Haplotype Explorer.
It is superior to other applications in that it generates distributable HTML files with metadata searches that interactively reflect GISAID IDs, locations, and collection dates. Here, we introduce the features and output of Haplotype Explorer, demonstrating time-dependent snapshots of haplotype networks inferred from a total of 4,282 SARS-CoV-2 genomes.
To eliminate infectious diseases, it is essential to quickly identify and control emerging infection clusters before they become uncontrollable. For this purpose, many applications that assist researchers in understanding the latest epidemiology have been developed. In fact, the recent intensification of the
and Network (Bandelt et al. 1999), have supported these studies using haplotype networks of SARS-CoV-2. Although these applications also work as network viewers, many alternatives are also available for additional annotation
Nextstrain: real-time tracking of pathogen evolution
CoV Genome Tracker: tracing genomic footprints of Covid-19 pandemic
TCS: a computer program to estimate gene genealogies
PopART: Full-feature software for haplotype network construction
Median-joining networks for inferring intraspecific phylogenies
Biological network exploration with Cytoscape 3
Gephi: an open source software for exploring and manipulating networks
tcsBU: a tool to extend TCS network layout and visualization
D3 Data-Driven Documents
GISAID: Global initiative on sharing all influenza data - from vision to reality
SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation
MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
CD-HIT: accelerated for clustering the next generation sequencing data
SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
All dependencies other than the network analysis software (TCS) can be installed via Anaconda.
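As a quick sanity check, the following Python sketch reports which of the command-line dependencies are not yet available on PATH. It is only an illustration: the helper name `find_missing_tools` is hypothetical and not part of Haplotype Explorer, and the executable names are assumed to match the bioconda package names used by this workflow (seqkit, mafft, cd-hit, snp-sites). TCS is installed separately and is not checked here.

```python
import shutil

# Assumption: each executable is named after its bioconda package.
REQUIRED_TOOLS = ["seqkit", "mafft", "cd-hit", "snp-sites"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be located on PATH.

    `which` is injectable so the lookup can be replaced in tests;
    by default it is shutil.which from the standard library.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required command-line tools were found on PATH.")
```

If any tool is reported as missing, installing it through conda as described in this section should resolve it, provided the conda environment is active in the same shell.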
If you do not have conda, download and install Anaconda. Open a terminal and execute the following command:
    conda install -c bioconda seqkit mafft cd-hit snp-sites
GISAID and open EpiCoV/Browse
Host" as "Human" and check on for "complete
Select the sequences of interest and download them as "input0.fasta". (Note: currently GISAID restricts downloading over 10
Result_example.html on a web browser. This viewer can also visualize in-house data using three-step commands. Users can start from Step1.py, which processes in-house multi-FASTA files (any multi-FASTA file can be used, whereas SARS-