key: cord-0796093-go5g6yxl
authors: Zhao, Mingchen; Chen, Jingyuan; Wang, Qiang; Lu, Zuhong; Jia, Zhongwei
title: A Landscape Analysis on Virus: based on NCBI Database
date: 2022-02-18
journal: China CDC Wkly
DOI: 10.46234/ccdcw2022.019
sha: 145faba5fc9986f6d704fda7cb00ef5757c70175
doc_id: 796093
cord_uid: go5g6yxl

WHAT IS ALREADY KNOWN ABOUT THIS TOPIC?
Studies indicate that viruses can spread across species, but it is difficult to know when and where such low-probability events occur because it is almost impossible to design an observational study covering the whole landscape.

WHAT IS ADDED BY THIS REPORT?
We performed a comprehensive analysis of the National Center for Biotechnology Information database to identify the times, places, and hosts in which viruses have been found over their long evolutionary history.

WHAT ARE THE IMPLICATIONS FOR PUBLIC HEALTH PRACTICE?
Public databases help in understanding the risk of virus infection in humans and offer a cost-effective method for monitoring public health and safety events.

According to the International Committee on Taxonomy of Viruses Master Species List 2020, more than 9,000 virus species have been identified on earth (1), of which more than 200 are known to be zoonotic according to the World Health Organization (WHO) (2). Previous studies have also shown that zoonoses (hantavirus, Ebola virus, highly pathogenic avian influenza, West Nile virus, Rift Valley fever virus, norovirus, severe acute respiratory syndrome coronavirus 1, Marburg virus, influenza A virus) infected more than 2.5 billion people every year, of whom 2.7 million died (3). Zoonotic viruses have aroused broad concern in recent years: people have been encouraged to avoid eating wild animals, and a series of animal protection laws and regulations, such as the Convention on International Trade in Endangered Species of Wild Fauna and Flora, were enacted (4).
Researchers from different fields and countries have tried to collaborate to explore virus associations between animals and humans worldwide (5). However, many studies have focused on a specific virus only once it received enough attention in a localized or global epidemic, such as Ebola virus, H1N1, Zika virus, and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (6). Even so, it is difficult to know exactly when, where, and how a virus jumped from an animal or the environment to humans because it is impossible to design an observational study covering the whole landscape around a virus. To understand the overall evolutionary patterns of viruses, this study aimed to describe the spatial and temporal distribution of viruses and related hosts using a public database. The National Center for Biotechnology Information (NCBI) Virus database is a database of gene sequences maintained by the National Institutes of Health that aggregates and annotates all publicly available nucleotide and protein sequences. It has been used in studies that took quantitative snapshots of viral genomic trends and that reviewed the performance of virus real-time quantitative polymerase chain reaction (RT-qPCR) methods (7-8). In this study, we performed a spatiotemporal analysis of the viruses in the NCBI database to identify the potential times, places, and hosts in which viruses appeared or persisted over their long evolutionary history. Our study identified the five most widely distributed viruses and found that no virus had been reported in all countries/regions. In addition, the areas in which viruses were reported did not completely overlap with the areas where their suspected hosts live. We also found that the 249 viruses isolated from humans were also isolated from 705 other mammals and 938 non-mammals.
We attempted to map the distribution and evolution of viruses and identify suspected hosts globally based on the NCBI database, which was helpful for understanding the risk of virus infection in humans and was also a cost-effective method for monitoring and predicting public health and safety events globally.

The data were downloaded from the NCBI Virus database (www.ncbi.nlm.nih.gov/labs/virus) and covered available data up to September 2021. The data included virus genomic sequence submission information (accession number, submitter, and release date), virus types (species, genus, and family), and biological sample descriptions (location, organism sampled, isolation, and collection date). Records that were duplicated or that had missing or unidentified collection dates, locations, or source organisms were excluded. Because the NCBI reports only that a virus was isolated from an organism, not whether the organism was the host of the virus, we referred to these organisms as suspected hosts. Based on the NCBI taxonomy, suspected hosts were classified into animals, plants, and microbes. Counts were calculated by virus species, suspected host, suspected host location, and collection time. The spatiotemporal distribution was mapped in country/region units, and paths were analyzed by linking the reported locations of the viruses or suspected hosts. Potential zoonotic diseases were explored by examining the suspected hosts from which the viruses were isolated. All statistical analyses were performed in R statistical software (version 4.1.0, The R Foundation for Statistical Computing, Vienna, Austria). Animal silhouettes representing hosts were downloaded from PhyloPic (http://www.phylopic.org). All maps were made using ArcGIS (version 10.7, Esri Inc, Redlands, CA, USA).
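The exclusion and counting steps described above can be illustrated with a minimal sketch. The authors worked in R and did not publish their code, so this Python version is only an assumption about how such a tabulation might look. The record layout, the field names (`accession`, `virus`, `host`, `country`, `year`), the toy data, and the helper names `clean`, `countries_by`, and `coverage_gap` are all hypothetical and do not reflect the NCBI Virus export schema.

```python
from collections import defaultdict

# Toy records; field names and values are hypothetical, not NCBI's schema.
RECORDS = [
    {"accession": "A1", "virus": "Virus X", "host": "Homo sapiens", "country": "CountryA", "year": 1990},
    {"accession": "A1", "virus": "Virus X", "host": "Homo sapiens", "country": "CountryA", "year": 1990},  # duplicate
    {"accession": "A2", "virus": "Virus X", "host": "Sus scrofa", "country": "CountryB", "year": 2001},
    {"accession": "A3", "virus": "Virus Y", "host": "Sus scrofa", "country": "CountryC", "year": 2010},
    {"accession": "A4", "virus": "Virus Y", "host": "", "country": "CountryC", "year": 2011},  # missing host
    {"accession": "A5", "virus": "Virus Z", "host": "Homo sapiens", "country": "CountryA", "year": None},  # missing year
]

REQUIRED = ("virus", "host", "country", "year")


def clean(records):
    """Drop records with missing required fields and duplicate accessions."""
    seen = set()
    kept = []
    for rec in records:
        if any(rec.get(field) in (None, "") for field in REQUIRED):
            continue
        if rec["accession"] in seen:
            continue
        seen.add(rec["accession"])
        kept.append(rec)
    return kept


def countries_by(records, key):
    """Map each value of `key` (e.g. 'virus' or 'host') to the set of countries reporting it."""
    out = defaultdict(set)
    for rec in records:
        out[rec[key]].add(rec["country"])
    return dict(out)


def coverage_gap(records, virus):
    """Countries where a suspected host of `virus` was sampled but the virus was not reported."""
    virus_countries = {r["country"] for r in records if r["virus"] == virus}
    hosts = {r["host"] for r in records if r["virus"] == virus}
    host_countries = {r["country"] for r in records if r["host"] in hosts}
    return host_countries - virus_countries


cleaned = clean(RECORDS)
by_virus = countries_by(cleaned, "virus")
# Viruses reported from exactly one country/region.
single_country = sorted(v for v, c in by_virus.items() if len(c) == 1)

print(len(cleaned), "records kept")
print("single-country viruses:", single_country)
print("host-only countries for Virus X:", coverage_gap(cleaned, "Virus X"))
```

In this sketch the set difference in `coverage_gap` corresponds to the kind of comparison between the countries/regions reporting a virus and those where its suspected hosts were sampled. Matching on submitted names, as done here, treats differently spelled names as different viruses or hosts.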
A total of 605,504 records covering 24,234 viruses from 240 countries/regions, with sample collection dates from 1865 to September 22, 2021, were included in the final analysis. We observed that 12,243 viruses were isolated from 4,187 animals, 2,856 from 2,074 plants, and 9,176 from 965 microbes (Figure 1). However, more than 90% of viruses (21,845 of 24,234) were reported in a single country/region; the remaining 10% (2,389 of 24,234) were from 236 countries/regions and were isolated from 4,742 suspected hosts in 238 countries/regions. (... Table S1).

Homo sapiens was reported in 225 countries/regions over the past 154 years (from the United States in 1866 to the Faroe Islands in 2020), and approximately 1,075 viruses were isolated from this host. Bos taurus was reported in 143 countries/regions over the past 138 years (from France in 1882 to Luxembourg in 2020), and 273 viruses were isolated from it; Canis lupus familiaris was reported in 123 countries/regions over the past 89 years (from China in 1931 to Ghana in 2020), and 106 viruses were isolated from it; Gallus gallus was reported in 122 countries/regions over the past 118 years (from Italy in 1902 to Timor-Leste in 2020), and 287 viruses were isolated from it; Sus scrofa was reported in 110 countries/regions over the past 99 years (from China in 1921 to Paraguay in 2020), and 276 viruses were isolated from it (Supplementary Figure S1B and Supplementary Table S1, available in http://weekly.chinacdc.cn/).

These outcomes raise several concerns. First, the top five most widely distributed viruses have been investigated and reported in most countries/regions (Supplementary Table S1, available in http://weekly.chinacdc.cn/), but no virus has been reported in all countries/regions. The surprising finding is the distribution of influenza A virus, a respiratory virus closely related to humans. We presumed it would be distributed everywhere and reported by all 240 countries/regions.
Our study indicates that influenza A virus was reported from more locations than any other virus, yet 85 countries/regions have not reported it. It is possible that the NCBI database has not collected these viruses from these countries/regions or that surveillance in these areas has not covered these viruses. In either case, these countries/regions deserve attention from the perspective of virus monitoring.

Second, the areas in which viruses were reported do not completely overlap with the areas where their suspected hosts live. For example, influenza A virus has been reported in 155 countries/regions, but its 422 suspected hosts live in 231 countries/regions. There are two possible explanations for the remaining 76 countries/regions: first, they may have influenza A virus but have not investigated and reported it; second, they may have no influenza A virus, which implies that these areas are susceptible to it. However, the records for both viruses and suspected hosts span approximately one hundred years in a region (Supplementary Figure S1, available in http://weekly.chinacdc.cn/). This long history leaves many opportunities for people to encounter these viruses and become involved in their evolution.

Third, we found that 249 viruses isolated from humans were also isolated from 705 other mammals and 938 non-mammals. It is not surprising that most of these zoonoses are from Sus scrofa, Gallus gallus, Anas platyrhynchos, Canis lupus familiaris, and Bos taurus, because these domestic animals have a close relationship with humans. Bats (152 species in the database) and Paguma larvata shared 44 and 3 zoonoses with humans, respectively, in the reported data. We also found 27 viruses from animals and plants that were also found in insects; whether they are related to humans has not been reported (Figure 2).

This study was subject to some limitations. First, NCBI is a public database, and the data quality may be uneven.
However, NCBI uses a standard form for submitted data, which ensures that the basic information in submissions is consistent at some level, and these data can meet the requirements of our spatiotemporal analysis. Second, we classified viruses and suspected hosts according to their submitted names, so entries whose names are spelled differently are regarded as different viruses or suspected hosts. For example, Enterovirus A, Enterovirus B, Enterovirus C, and Enterovirus sp. are regarded as different viruses in this study. This may slightly overestimate the number of virus species and suspected hosts submitted to the NCBI database but should not change their overall spatiotemporal distribution, especially over the long history of their evolution. Further, we identified 249 zoonotic viruses in NCBI, which was consistent with the figure published by the WHO (2). Third, data reported by countries may not be complete, which could lead to bias in the analysis. In addition, the advent of next-generation sequencing (NGS) may bias the temporal discovery of viruses because large-scale genome sequencing of viruses was hard to perform before NGS became relatively affordable. Fourth, we did not analyze the path of each virus and its suspected hosts. Analysis of the overlap between viruses and their hosts is important for predicting the risk of a new virus outbreak in local areas but is beyond the scope of this study.

In conclusion, we attempted to map the distribution and evolution of viruses and suspected hosts globally based on the NCBI database, which is helpful for understanding the risk of virus infection in humans and is also a cost-effective method for monitoring and predicting public health and safety events globally.

Conflicts of interest: No conflicts of interest.

World Health Organization. Regional Office for the Western Pacific. Zoonotic diseases: a guide to establishing collaboration between animal and human health sectors at the country level.
Geneva: WHO Regional Office for the Western Pacific. 2008. https://iris.wpro.who.int/handle/10665.1/10415.
Wilder-Smith A, Osman S. Public health emergencies of international concern: a historic overview. J Travel Med 2020;27(8)

[Supplementary Table S1: countries/regions reporting the most widely distributed viruses (rows include Hepatitis B virus, Rabies lyssavirus, and Measles morbillivirus) and suspected hosts (rows include Bos taurus, Gallus gallus, and Sus scrofa); available in http://weekly.chinacdc.cn/]