key: cord-0002260-e7xwb03g authors: Yamashita, Akifumi; Sakamoto, Tetsuya; Sekizuka, Tsuyoshi; Kato, Kengo; Takasaki, Tomohiko; Kuroda, Makoto title: DGV: Dengue Genographic Viewer date: 2016-06-07 journal: Front Microbiol DOI: 10.3389/fmicb.2016.00875 sha: 4997e92b014bb26f8184eae26662a09ffcf4f733 doc_id: 2260 cord_uid: e7xwb03g

Dengue viruses (DENVs) and their vectors are widely distributed throughout the tropical and subtropical regions of the world. An autochthonous case of DENV was reported in Tokyo, Japan, in 2014, for the first time in 70 years. A comprehensive database of DENV sequences containing both serotype and genotype data and epidemiological data is crucial to trace DENV outbreak isolates and promptly respond to outbreaks. We constructed a DENV database containing the serotype, genotype, year and country/region of collection by collecting all publicly available DENV sequence information from the National Center for Biotechnology Information (NCBI) and assigning genotype information. We also implemented the web service Dengue Genographic Viewer (DGV), which shows the geographical distribution of each DENV genotype in a user-specified time span. DGV also assigns the serotype and genotype to a user-specified sequence by performing a homology search against the curated DENV database, and shows its homologous sequences with the geographical position and year of collection. DGV also shows the distribution of DENV-infected entrants to Japan by plotting epidemiological data from the Infectious Agents Surveillance Report (IASR), Japan. This overview of the DENV genotype distribution may aid in planning for the control of DENV infections. DGV is freely available online at: (https://gph.niid.go.jp/geograph/dengue/content/genomemap).

Dengue viruses (DENVs) are members of the genus Flavivirus in the family Flaviviridae and consist of four serotypes (DENV-1 to -4) (Lanciotti et al., 1992; Kuhn et al., 2002).
Each serotype can be divided into five to six genotypes. However, there is no standard genotyping classification [DENV-1 (Goncalvez et al., 2002); DENV-2 (Anez et al., 2011; Khan et al., 2013); DENV-3 (Lanciotti et al., 1994; Wittke et al., 2002; Klungthong et al., 2008); DENV-4 (Abubakar et al., 2002)]. DENV has a positive-sense, single-stranded RNA genome that is ∼11 kb in length and encodes a capsid protein (C), premembrane protein (prM), and envelope glycoprotein (E) in addition to seven non-structural proteins (NSs: NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). DENV infection causes dengue illness, which may range from dengue fever (a mild illness) to dengue hemorrhagic fever and dengue shock syndrome (the severe forms of the illness), in addition to asymptomatic cases (Centers for Disease Control and Prevention; http://www.cdc.gov/dengue/clinicalLab/clinical.html). Infection results in lifetime immunity against the same serotype, but successive exposure to different DENVs increases the likelihood of contracting a severe form of dengue illness, such as dengue hemorrhagic fever or dengue shock syndrome (Chiappelli et al., 2014). DENV and its vectors have become widely distributed throughout the tropical and subtropical regions of the world (Murray et al., 2013). An autochthonous case of DENV infection was reported in Tokyo, Japan, in 2014 for the first time in 70 years (Kutsuna et al., 2015).

Abbreviations: DENV, Dengue virus; C, capsid protein; prM, premembrane protein; E, envelope glycoprotein; NS, nonstructural protein; NCBI, National Center for Biotechnology Information; IASR, Infectious Agents Surveillance Report.

To ensure prompt action in response to a DENV outbreak, a comprehensive DENV database based on the genotypes would be essential for tracing the outbreak source. To date, only two DENV databases provide genotype information.
The web service ViPR (Pickett et al., 2012) supports genetic analysis based on the viral genome for a tested input sequence, including DENV sequences. The second database is the Dengue virus genotyping database (Yamashita et al., 2013), which provides a summary table containing the DENV serotype/genotype, year and country of collection, and accession number. There are many other DENV databases; however, no other sites provide summarized genotype information. The Dengue Virus Resource facilitates the retrieval of DENV sequences deposited in GenBank according to serotype, disease symptom, host, region/country, genome region, and collection and/or release date (Resch et al., 2009). DENVirDB provides sequence information and computationally curated information on dengue viral proteins (Asnet et al., 2014). DENVDB is a dengue virus sequence database oriented toward keyword searches (no publication: http://proline.bic.nus.edu.sg/denvdb/). Finally, the Dengue Virus Portal is a sequence collection with metadata (no publication: https://www.broadinstitute.org/annotation/viral/Dengue/Home.html).

Here, we constructed the website Dengue Genographic Viewer (DGV), which presents DENV information based on genotype and epidemiological data by using the geographic tool Google Maps© to give an up-to-date, global view of the dissemination of DENV genotypes.

The DENV genotype database was constructed as follows: (1) all accessible DENV nucleotide sequences were collected; (2) the complete sequences of each protein region (C, prM, E, NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) were extracted from the sequences; (3) a blastn homology search was performed against the genotype database and the genotype of the most homologous sequence was assigned; and (4) the genotype data were stored in a database using SQLite (https://www.sqlite.org/).

1. The DENV nucleotide sequences were downloaded from the NCBI database using the keywords ("Dengue virus"[porgn:__txid12637]).
2. A blastx homology search was performed to detect nucleotide regions that corresponded to each mature DENV protein; nucleotide regions that covered more than 85% of the protein sequence were used for the subsequent analysis.
3. To reduce the time required to obtain the most homologous sequences in the genotype database, we reduced the number of sequences used in the blast search by clustering highly homologous sequences. We performed a uclust search (Edgar, 2010) against the nucleotide sequences of each protein region and selected one representative sequence for each homologous sequence group with a clustering threshold of 99% identity; the representative sequences were then subjected to a homology search against the genotype database. The original genotype database was constructed according to the method proposed in a previous report (Yamashita et al., 2013). Briefly, the representative sequences were aligned by using the mafft program (Katoh and Standley, 2014), and neighbor-joining (NJ) phylogenetic trees were constructed using the MEGA5 program (Tamura et al., 2011). The genotype of each gene was assigned manually according to the previous genotype database (Yamashita et al., 2013).
4. Sequence ID, country/region, and year of collection were extracted from the deposited GenBank data and integrated into the SQL database by using an in-house Perl script.

The above processes, except for the original database construction, are performed automatically every night to keep the DENV database up to date.

We implemented a set of viewer applications on DGV by using Google Maps©; these show the data in a temporal and spatial manner. One application presents the geographical distribution of each DENV genotype on the map in a user-specified time span. Another is a homology search program that searches for the most homologous DENV sequence in the DGV database and shows the geographical positions of closely related sequences on the map.
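As a rough illustration of step 4 and of the time-span view, the sketch below stores genotyped records in SQLite and counts sequences per country and genotype within a year range. The table layout, column names, function names, and sample accessions are assumptions made for this example; the actual DGV schema and its in-house Perl script are not described in the text, and Python is used here only for brevity.

```python
import sqlite3

# Hypothetical schema: one row per genotyped protein region of a sequence.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sequences (
    accession TEXT NOT NULL,
    region    TEXT NOT NULL,   -- protein region, e.g. 'E' or 'NS1'
    serotype  TEXT NOT NULL,   -- e.g. 'DENV-2'
    genotype  TEXT NOT NULL,
    country   TEXT,
    year      INTEGER,
    PRIMARY KEY (accession, region)
)
"""


def store_records(conn, records):
    """Insert or refresh (accession, region, serotype, genotype, country, year) rows."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?, ?, ?, ?, ?)", records
    )
    conn.commit()


def genotype_distribution(conn, serotype, start_year, end_year):
    """Count distinct accessions per (country, genotype) in an inclusive year span."""
    cursor = conn.execute(
        """
        SELECT country, genotype, COUNT(DISTINCT accession)
        FROM sequences
        WHERE serotype = ?
          AND year BETWEEN ? AND ?
          AND country IS NOT NULL
        GROUP BY country, genotype
        ORDER BY country, genotype
        """,
        (serotype, start_year, end_year),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Invented placeholder records, not real GenBank entries.
    store_records(conn, [
        ("EXAMPLE1", "E", "DENV-2", "Asian I", "Thailand", 2001),
        ("EXAMPLE2", "E", "DENV-2", "Asian American", "Vietnam", 1999),
        ("EXAMPLE3", "E", "DENV-2", "Asian I", "Vietnam", 2007),
    ])
    for row in genotype_distribution(conn, "DENV-2", 2000, 2014):
        print(row)
```

Each (country, genotype, count) row could then be drawn as one map marker; how DGV itself passes data to Google Maps is not specified in the text.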
A third interface shows the sources of imported dengue cases on the map, according to the Infectious Agents Surveillance Report (IASR), Japan (http://www.nih.go.jp/niid/en/iasr-e.html). This set of applications is available at the DGV web site (https://gph.niid.go.jp/geograph/dengue/content/genomemap).

As of March 4, 2016, DGV included a total of 10,514 DENV sequences, which consisted of 3872, 3414, 2309, and 919 DENV serotype-1, -2, -3, and -4 sequences, respectively (Table 1). Some genotypes have been abundantly sequenced and deposited in the public database, whereas other genotypes have rarely been sequenced (i.e., DENV-1 genotype II was reported in only seven records from 1960 to 2012, DENV-4 genotype III was also reported in only seven records from 1997 to 2001, and DENV-3 genotype IV has not been reported since 1977). These rare genotypes may have become minor populations or may be undergoing a silent transmission cycle (Lanciotti et al., 1994; Chen and Vasilakis, 2011; Santiago et al., 2012).

Fifteen years' worth of data from 2000 to 2014 for all serotypes showed that DENV sequences were primarily reported from South to Southeast Asia, Central to South America, and the countries of Oceania (Figure 1A). Some biases in DENV serotype composition were observed in several countries. For instance, the dominant serotypes were DENV-1 and -2 in Mexico, DENV-1 and -4 in Polynesian countries with the exception of Fiji, and DENV-2 and -3 in Pakistan. In contrast, all serotypes were sampled in Brazil and Thailand. Intriguingly, when focusing on the genotype instead of the serotype, the data from 2000 to 2014 showed at least three potential geographical genotype distribution border lines in Asia (Figures 1B, 2).
The first border is between the American continents and other regions (Figure 1B), the second is located between Bangladesh and Myanmar for the genotype distributions of DENV-1 and -2 and between India and Myanmar for DENV-3, and the third is located between Indochina and the Malay Peninsula (Figure 2). There seem to be differences in the DENV-1 and -3 distributions between Malaysia, Singapore, and Indonesia; however, the border line is not clear because Malaysia and Indonesia consist of many islands and share Kalimantan Island, and the deposited sequence data do not specify the island on which each isolate was originally collected. Although some boundaries are not clear, they are roughly conserved among all serotypes except for the Bangladesh-Myanmar border line for DENV-4, suggesting potential barriers to the movement of vector mosquitoes or to human activities between the countries.

FIGURE 4 | A screenshot of the DENV sequence similarity search. An Env sequence derived from an autochthonous case in Japan (LC006123 or gi: 698162713) was used as a sample query. The query was assigned as the Env region of DENV-1 genotype I.

We also found a change over time in the predominant DENV-2 genotypes. From 1998 to 2007, the dominant genotype in Asia was Cosmopolitan, although India-Pakistan-Sri Lanka and Southeast-Oceania belonged to different lineages (Khan et al., 2013). The major genotypes in the Indochina countries were different from those of the other Asian countries; genotype Asian I was predominant in Thailand, whereas genotype Asian American was predominant in Cambodia and Vietnam (Figure 3 and Movie S1). From 2001, Asian I increased in Cambodia and Vietnam until, by 2007, it had become the predominant genotype in Indochina. The genotype Asian I viruses in Thailand seemed to be widely disseminated into Vietnam via Cambodia but did not reach Malaysia and Bangladesh (Figure 2). Thus, the Asian American genotype was replaced by Asian I in Cambodia and Vietnam between 1998 and 2011.
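The genotype replacement described above amounts to asking, for each country and time window, which genotype accounts for the largest share of deposited sequences. A minimal sketch of that tally follows; the function name, the record layout, and the sample records are assumptions made for illustration and are not taken from the DGV implementation or its data.

```python
from collections import Counter, defaultdict


def predominant_genotypes(records, windows):
    """Return {(country, window): (genotype, share)} for the most frequent genotype.

    records: iterable of (country, year, genotype) tuples
    windows: list of inclusive (start_year, end_year) tuples
    Ties are resolved in favor of the genotype encountered first.
    """
    counts = defaultdict(Counter)
    for country, year, genotype in records:
        for window in windows:
            start, end = window
            if start <= year <= end:
                counts[(country, window)][genotype] += 1

    result = {}
    for key, counter in counts.items():
        genotype, n = counter.most_common(1)[0]
        result[key] = (genotype, n / sum(counter.values()))
    return result


if __name__ == "__main__":
    # Invented records for illustration only.
    sample = [
        ("Vietnam", 1999, "Asian American"),
        ("Vietnam", 2000, "Asian American"),
        ("Vietnam", 2001, "Asian I"),
        ("Vietnam", 2006, "Asian I"),
        ("Vietnam", 2007, "Asian I"),
        ("Vietnam", 2007, "Asian American"),
    ]
    for key, value in predominant_genotypes(sample, [(1998, 2002), (2003, 2007)]).items():
        print(key, value)
```

Because the number of deposited sequences does not always reflect the actual number of infections, shares computed this way describe the database contents rather than true incidence.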
The replacement of Asian American by Asian I also suggests a genotype transition, which probably reflects the mosquito vector habitat and human activities in the Indochinese Peninsula. DGV currently does not support the prediction of dengue epidemics because the number of deposited sequences does not always reflect the actual number of events; in addition, it takes a long time for a sequence to become public through isolation, sequencing, and publication.

DGV provides a search engine for the assignment of the DENV serotype, genotype, and country of origin according to the most homologous sequence on the basis of a blastn search against the DENV database. The search results are shown as text and are also plotted through Google Maps©. Subsequently, the query sequence is divided into mature protein regions and displayed with a serotype/genotype assignment. The homology search results and the divided nucleotide sequences can be downloaded in FASTA format. Here, we present an example similarity search for an Env sequence derived from an autochthonous case in Japan (LC006123 or gi: 698162713). DGV assigned the sequence as the Env region of DENV-1 genotype I and identified homologous sequences from Japan, China, Singapore, and Indonesia. These results are consistent with those from a previous study (Figure 4; Kutsuna et al., 2015).

To aid in visualizing the source countries of dengue infection cases imported to Japan, the number of annual imported cases was also mapped on Google Maps©. The serotype (but not genotype), year, and visiting country/area are also indicated based on the Infectious Agents Surveillance Report (IASR), which releases to the public monthly data and information obtained from prefectural and municipal public health institutes and quarantine stations (Figure 5).

AY performed the experimental design, participated in the analysis and drafted the manuscript.
TS1 implemented the application, performed the data collection, constructed the original genotype database, and participated in the analysis. TS2, KK, and TT reviewed the application and participated in the discussion. MK contributed to the experimental design, performed the analysis and drafted the manuscript. All authors read and approved the final manuscript.

This work was supported by a grant for Research on Emerging and Re-emerging Infectious Diseases (H25 Shinko-Ippan-015/H26 Shinko-Gyosei-Shitei-002) from the Ministry of Health, Labor and Welfare, Japan, and was also supported by the Research Program on Emerging and Re-emerging Infectious Diseases (15fk0108011h0003 and 15fm0108022h0001) from the Japan Agency for Medical Research and Development, AMED. This work was also partially supported by JSPS KAKENHI Grant Number 15K08488. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

We are grateful to Ms. Inamine from NIID for drawing the DGV icon. We thank Prof. Ikuta and Prof. Yasunaga from BIKEN for allowing us to use the original dataset "dengue virus genotyping database."

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb.

Movie S1 | Genotype transition of DENV-2 in Asia from 1986-1990 to 2010-2014.

Emergence of dengue virus type 4 genotype IIA in Malaysia
Circulation of different lineages of dengue virus type 2 in Central America, their evolutionary time-scale and selection pressure analysis
DENVirDB: a web portal of dengue virus sequence information on Asian isolates
Dengue-quo tu et quo vadis? Viruses
Viral immune evasion in dengue: toward evidence-based revisions of clinical practice guidelines
Search and clustering orders of magnitude faster than BLAST
Diversity and evolution of the envelope gene of dengue virus type 1
MAFFT: iterative refinement and additional methods
Emergence and diversification of dengue 2 cosmopolitan genotype in Pakistan
Molecular genotyping of dengue viruses by phylogenetic analysis of the sequences of individual genes
Structure of dengue virus: implications for flavivirus organization, maturation, and fusion
Autochthonous dengue fever
Rapid detection and typing of dengue viruses from clinical samples by using reverse transcriptase-polymerase chain reaction
Molecular evolution and epidemiology of dengue-3 viruses
Epidemiology of dengue: past, present and future prospects
Virus pathogen database and analysis resource (ViPR): a comprehensive bioinformatics database and analysis resource for the coronavirus research community
Virus variation resources at the National Center for Biotechnology Information: dengue virus
Reemergence and decline of dengue virus serotype 3 in Puerto Rico
MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods
Extinction and rapid emergence of strains of dengue 3 virus during an interepidemic period
Origin and distribution of divergent dengue virus: novel database construction and phylogenetic analyses