key: cord-0912354-ndt8f812
authors: Zhang, Yong-Zhen; Wu, Wei-Chen; Shi, Mang; Holmes, Edward C
title: The diversity, evolution and origins of vertebrate RNA viruses
date: 2018-08-13
journal: Curr Opin Virol
DOI: 10.1016/j.coviro.2018.07.017
sha: 5718bd92934aeeabb9cee6354a5ea32a341f1c76
doc_id: 912354
cord_uid: ndt8f812
Despite a substantial increase in our knowledge of the biodiversity and evolution of vertebrate RNA viruses, far less is known about the diversity, evolution and origin of RNA viruses across the diverse phylogenetic range of vertebrates, and particularly in healthy animals that are only rarely utilized for virological sampling. Fortunately, recent advances in virus discovery using metagenomic approaches are beginning to reveal a multitude of RNA viruses in vertebrates other than birds and mammals. In particular, fish harbor a remarkable array of RNA viruses, including the relatives of important pathogens. In addition, despite frequent cross-species transmission, the RNA viruses in vertebrates generally follow the evolutionary history of their hosts, which began in the oceans and then moved to terrestrial habitats over timescales covering hundreds of millions of years.
RNA viruses are responsible for a wide range of human diseases, from the relatively mild respiratory infections associated with some rhinoviruses and coronaviruses to life-threatening hemorrhagic fevers. The association between viruses and disease dates to the beginning of virology at the turn of the 20th century, when the first viruses were identified in plants and animals suffering from overt disease (tobacco mosaic virus and foot-and-mouth disease virus, respectively). As mammals and birds often experience infectious diseases similar to those of humans, and live in close proximity to us, they are commonly thought to be the natural reservoir hosts of the RNA viruses that subsequently emerge and cause disease in humans through cross-species transmission [1,2]. Consequently, considerable effort has been directed towards investigating the diversity, evolution and origins of RNA viruses associated with mammalian and bird hosts. To date, a multitude of related RNA viruses have been identified in these hosts, including those that have emerged or re-emerged in humans or domestic animals. Important examples include coronaviruses in bats [3], hantaviruses in rodents and shrews [4], and influenza A viruses in birds and a variety of mammals [5]. Although these results are important, mammals and birds represent only a small proportion (less than 25%) of the total number of vertebrate species, and little is known about the natural viromes of the vast number of other vertebrate species. Indeed, recent metagenomic studies [6-11] suggest that our understanding of the true biodiversity and evolution of vertebrate RNA viruses is limited, fragmentary and biased.
Viruses likely infect all cellular organisms and are ubiquitous in natural environments, such that they can be considered the most abundant source of nucleic acid on Earth [12]. Compared with birds and mammals, 'lower' vertebrates, which we define here as all vertebrates other than birds and mammals (i.e. amphibians, reptiles, the various groups of fish, and jawless vertebrates), exhibit far greater biological diversity, with approximately 33 000 documented species. In particular, fish are remarkably diverse and abundant in both freshwater and marine environments, and account for approximately 50% of the total number of described vertebrate species. It is therefore possible, if not likely, that fish also harbor an enormous diversity of as yet uncharacterized RNA viruses. Indeed, the fish viruses described to date have largely been sampled from animals experiencing overt disease, and are those with the potential to cause huge losses in aquaculture [13-15]. In addition, despite an enormous increase in the number of invertebrate RNA viruses described in recent years [16,17], we still know little about the evolutionary links between the RNA viruses carried by invertebrates and those carried by vertebrates, again reflecting a lack of sampling of key taxa. Within individual virus families or genera, the RNA viruses associated with vertebrates tend to form monophyletic groups and are often distantly related to those sampled from invertebrates [16,17]. This suggests that there are likely major evolutionary gaps between invertebrate and vertebrate RNA viruses, although whether this pattern will withstand more intensive sampling is unknown. Additionally, the timescale of RNA virus evolution remains unclear, with very different estimates obtained using different methods.
Molecular clock-based analyses of individual viruses have often suggested very recent origins, even for well-known etiologic agents with a documented historical record, such as rabies virus, which may be only a few thousand years old [9,18]. By contrast, far older evolutionary histories, on the scale of millions of years, have been revealed at the level of virus families, either by using endogenous viral elements (EVEs) integrated into host genomes to calibrate molecular clocks [19], or through patterns of host-virus association inferred from phylogenetic trees (see below). Although we know a great deal more about the diversity and evolution of RNA viruses than we did at the start of this millennium, it is also apparent that we are only just starting to scratch the surface of the true biodiversity and evolution of vertebrate RNA viruses. Fortunately, a remarkable number and diversity of RNA viruses have recently been discovered in lower vertebrates through meta-transcriptomic approaches [20]. Indeed, all types of virus family identified in mammals have now been found in lower vertebrates (Figure 1). In particular, a striking diversity of RNA viruses (>17 families) has now been documented in fish, including viruses that are related to those that cause severe human disease today. This strongly suggests that these families of RNA viruses have an evolutionary history that may span the entire history of the vertebrates. In what follows we briefly review some of the key advances in our understanding of the diversity and evolution of vertebrate RNA viruses, and discuss some of the challenges for the future.
Due to intense interest in the ongoing occurrence of emerging and re-emerging infectious diseases (including avian influenza, SARS, and Ebola virus disease) in recent decades, the number of known RNA viruses from mammals and birds has increased rapidly [21-25].
However, the situation is very different in the case of lower vertebrates. Prior to 2000, very few RNA viruses had been identified from amphibians, reptiles, and fish, and these were always associated with overt disease that could result in huge economic losses [15,26,27]. Following the extensive use of PCR and Sanger sequencing for virus identification over the past decade, the number of RNA viruses sampled from lower vertebrates has steadily increased [28], with notable examples being arenaviruses and paramyxoviruses in reptiles [29], and novirhabdoviruses and other RNA viruses in fish [30], although these numbers were still very limited compared with the viruses described in birds and mammals. Also of importance was that the viruses sampled from lower vertebrates, such as arenaviruses and coronaviruses [24,31], are usually divergent from those in birds and mammals, in turn suggesting that there is a huge diversity of unsampled viruses occupying the 'long branches' separating these taxa. The most dramatic increase in the known biodiversity of RNA viruses from lower vertebrates has been achieved using a bulk RNA-sequencing approach known as meta-transcriptomics [20]. For example, this approach has led to the discovery of more than 200 novel viruses in 186 species of vertebrates sampled from both terrestrial and aquatic environments in China [20]. These hosts span phylogenetic diversity that had not previously been screened, including lancelets (Leptocardii), jawless fish (Agnatha), cartilaginous fish (Chondrichthyes), ray-finned fish (Actinopterygii), lobe-finned fish (Sarcopterygii), amphibians (Amphibia), and reptiles (Reptilia).
Of these newly discovered viruses, the majority (196 viruses) are vertebrate-specific, in that they are thought to infect only vertebrate hosts, while the remaining 18 can be thought of as 'vector-borne' viruses that are able to infect both vertebrate and invertebrate hosts. Strikingly, these viruses cover virtually all known types of virus previously identified in avian and mammalian hosts. Additionally, some of the newly discovered viruses (such as the vertebrate-associated astro-like viruses) may represent new vertebrate-associated virus families [20], implying that far greater numbers and diversity of RNA viruses are present in lower vertebrates than have been identified to date. Similarly, fish also contain viruses that are relatively closely related to known human pathogens. For example, Lassa hemorrhagic fever, Ebola virus disease, and hantavirus pulmonary syndrome are well known for the extensive damage they cause in human patients and represent a major threat to public health. Importantly, in all these cases, the viruses identified in ray-finned fish were most closely related to members of the Arenaviridae, Filoviridae, and Hantaviridae, showing that these previously mammal-dominated, disease-causing virus families have relatives in aquatic vertebrates [20]. Additionally, influenza-like viruses were discovered in jawless fish, ray-finned fish, and amphibians, with those from ray-finned fish currently the closest known relatives of human influenza B virus. Remarkably, although they were sampled in lower vertebrates, and hence from hosts that likely diverged from mammals hundreds of millions of years ago, these viruses exhibited tissue tropisms similar to those of their mammalian counterparts, again arguing for the antiquity of these vertebrate-specific viruses in nature.
In sum, all these data imply that the viruses that infect us today once infected vertebrates that occupied aquatic environments (Figure 2), although it is possible that some even extend back to the origin of animals that may have occupied terrestrial habitats.
Although RNA viruses have universally small genomes, recent data suggest that they experience processes of genome evolution as complex as those of large DNA viruses and utilize a wide range of replication-expression strategies [32]. Recent metagenomic studies reveal that invertebrates appear to be a particularly rich source of genomic diversity for RNA viruses [17,33]. Indeed, the genomes of invertebrate RNA viruses are more complex in both size and structure than those of related viruses from vertebrates [32]. For example, the genomes of chuviruses, which likely represent at least one new family, are more diverse and have more intricate structures (including mixtures of segmented, unsegmented, and even circular genomes) than those of all other virus families [16]. Although the newly discovered RNA viruses from lower vertebrates have simpler genomes, in both size and structure, than those sampled from invertebrates [20,33], their genomes still show greater variation in architecture than those of their counterparts in mammalian and avian hosts. This includes variation in genome length, in the organization of open reading frames, in the order and number of glycoproteins, and even in the number of segments. For example, two arenaviruses discovered in marine fish have three RNA segments instead of the two seen in arenaviruses from mammalian or reptile hosts [20]. Interestingly, arenaviruses with three RNA segments have also been found in arthropods [17], and all known arenaviruses form a monophyletic cluster within the order Bunyavirales in trees of the RNA-dependent RNA polymerase (RdRp) [16].
Additional studies are needed to reveal the true history of evolutionary events: that is, whether there has been a decrease in segment number (but not gene content) from three to two in the arenaviruses, or an increase in segment number from two to three in the arenavirus-bunyavirus group.
Each cellular organism likely possesses multiple DNA and RNA viruses. Indeed, an increasing number of studies have shown that RNA viruses can have a broad spectrum of hosts [21-25]. In particular, recent metagenomic studies have shown that the host spectrum of invertebrate RNA viruses is remarkably broad, spanning different phyla and sometimes different kingdoms [16,17]. However, due to major sampling biases, the known hosts of vertebrate RNA viruses are dominated by mammals and, to a lesser extent, birds [32]. Therefore, our understanding of the host spectrum and virus-host associations of vertebrate RNA viruses is clearly far from complete. Fortunately, the discovery of diverse viruses from a wide range of lower vertebrates is helping to fill these gaps, revealing that the host spectrum of vertebrate RNA viruses is broad [20], although still markedly narrower than that seen in invertebrates [16,17]. The difference between invertebrates and vertebrates most likely reflects differences in species numbers, population size, and abundance, together with the evolution of adaptive immunity in the latter.
That RNA viruses may co-diverge with their mammalian hosts has been known for several decades, and is well documented in some groups such as the hantaviruses (Figure 3) [34]. Even in the case of invertebrate RNA viruses, viruses tend to form separate phylogenetic groups loosely based on the evolutionary relationships of their host taxa [16,17], reflecting long-term co-divergence between viruses and their invertebrate hosts (although with relatively frequent host-jumping; see below).
With the exception of some interesting cases such as influenza viruses and rotaviruses, the same general pattern also seems to hold for the RNA viruses newly described in lower vertebrates, in which related viruses generally cluster with related hosts [20]. In particular, it is striking that the RNA viruses sampled from fish tend to fall basal to those sampled from amphibians, reptiles, birds and mammals, reflecting the divergent phylogenetic position of fish within the vertebrates (Figure 1). Hence, at a broad scale, the phylogeny of vertebrate RNA viruses mirrors that of their vertebrate hosts, with a transition from ocean to land (Figure 2). This overall co-phylogenetic match between RNA viruses and their vertebrate hosts strongly suggests that the viruses that still infect us today are ancient and have evolutionary histories that date back to the first vertebrates, and perhaps even the first animals. However, despite the overall co-divergence between RNA viruses and their vertebrate hosts, it is also clear that host-switching events have occurred frequently during evolutionary history. Previous studies have revealed the important role of cross-species transmission in the evolution and emergence of RNA viruses in humans [4,35]. As in the case of many RNA viruses from birds and mammals (e.g. influenza virus [35] and hantavirus [36]), host-switching events are also commonplace among lower vertebrates [20]. For example, the fact that an influenza virus from a fish is the closest known relative of mammalian influenza B virus clearly conflicts with the host phylogeny. Similar host-switching events were also observed for the viruses sampled from lungfish (in the Picornaviridae, hepaciviruses and aquareoviruses), which exhibited a closer evolutionary relationship with viruses from ray-finned fish than with those from tetrapods, to which lungfish are more closely related.
In addition, as in the case of mammalian viruses such as Seoul (hanta) virus in rats and SARS-related virus in bats [37,38], single viruses are occasionally associated with multiple host species or even multiple host orders. Indeed, across the phylogenies as a whole it is possible that host-switching has been more common than co-divergence during the evolutionary history of vertebrate RNA viruses, particularly among hosts that share similar environments [20,36,38]. Together, these results suggest that virus evolution is a complex interaction between virus-host co-divergence over many millions of years and frequent cross-species transmission, with the evolutionary history of many virus groups reflecting an interweaving of both processes [39].
Despite the increasingly large scale of meta-transcriptomic surveys of invertebrates globally, some vertebrate RNA viruses (e.g. arenaviruses, filoviruses, hantaviruses, and paramyxoviruses) have not yet been identified in invertebrates, implying that they originated only in vertebrates (or that the wrong invertebrates have thus far been sampled). Similarly, although the phylum Echinodermata and the subphylum Tunicata of the Chordata are more closely related to vertebrates than are other invertebrates, it is notable that the viruses identified in these animals were closely related to those associated with invertebrates, suggesting that there is a major phylogenetic 'break' in virus biodiversity following their divergence from the vertebrate lineage [16]. The reasons for these disjunct patterns of virus distribution across phylogenies are currently unclear. The antiquity of vertebrate RNA viruses is also apparent from some molecular clock-based dating schemes, particularly those using endogenous viral elements (EVEs).
Comparisons of exogenous viruses and their endogenous relatives have shown that the Filoviridae and Bornaviridae emerged at least 30 million years (Myr) ago and 50 Myr ago, respectively, again indicative of an ancient evolutionary history [40,41]. Remarkably, however, the discovery of divergent filoviruses and bornaviruses in ray-finned fish indicates that both families have ancient evolutionary histories that extend far beyond the calibration dates obtained with endogenous viruses.
Our knowledge of the biodiversity and evolution of vertebrate RNA viruses has expanded dramatically since the millennium, and especially so following the metagenomic revolution of recent years. It is now clear that the virosphere is far larger and more complex than previously envisioned, and that we have only sampled a tiny fraction of this remarkable biodiversity [32]. Although vertebrates are the most studied group, we have only sampled a small subset of the total described vertebrate species, particularly when it comes to screening for viruses using unbiased meta-transcriptomic approaches. It is also likely that lower vertebrates, and other animals, harbor RNA viruses that are so divergent from known viruses that they cannot be detected using available BLAST-based approaches [32], so that our sampling is necessarily biased toward 'detectable' viruses. It is therefore clear that more extensive and better sampling worldwide, especially of uncommon vertebrate species, together with more powerful approaches for virus characterization, is needed to help us find these divergent viruses, as was the case for the discovery of the chuviruses and the jingmenviruses [16,42]; this will in turn help fill the evolutionary gaps among RNA viruses.
Although we know that vertebrate RNA viruses have an ancient evolutionary history, it is difficult at present to paint a clear picture of their origin and evolution, including their evolutionary relationship with invertebrate viruses, which is likely to be complex, particularly as our sampling of basal vertebrates remains poor. That some RNA viruses seemingly infect only vertebrates suggests that they are 'vertebrate-specific' viruses, while another, 'vector-borne' class is able to infect both vertebrate and invertebrate hosts. However, it is important to note that for well-known vector-borne viruses (such as dengue and Zika viruses), the true natural hosts are likely to be arthropods, and that these viruses secondarily evolved to infect vertebrates. Indeed, it is striking that some vertebrate viruses, including recently discovered viruses from lower vertebrates, fall basal to the classic vector-borne viruses [20]. Overall, the evolutionary history of vertebrate RNA viruses seems to reflect a complex interplay between long-term virus-host co-divergence and frequent host-switching. Greater taxonomic sampling is clearly the goal for the future, and the discovery of viruses that fill the gaps between vertebrate and invertebrate viruses will be important in helping to resolve virus origins. Although only a minuscule proportion of RNA viruses are known to cause disease in humans, some of the uncharacterized viruses will surely be able to cross the interspecies genetic barrier and emerge in humans. Identifying this subset will be difficult, particularly as the mechanisms that enhance or prevent the successful cross-species transmission of viruses remain unclear.
Although we now know that the evolutionary history of vertebrate RNA viruses is characterized by frequent cross-species transmission on a background of co-divergence, it is not clear whether RNA viruses with a history of cross-species transmission and multiple hosts emerge more frequently in humans than those with specific virus-host associations [32,36], or whether emergence depends strongly on local epidemiological and ecological features, with less impact of virus phylogenetic history. In addition, far less is known about the interactions among viruses, between viruses and other micro-organisms, and between these and their hosts, especially how such interactions act to prevent or enhance disease in humans, even though this is likely to be central to understanding disease emergence [43]. Finally, although surveying biodiversity and classifying new viruses are of fundamental scientific importance, we contend that the goal for the future of metagenomic studies should be research focused on revealing the fundamental patterns and processes of virus evolution [20], which will be greatly enhanced by a better understanding of the virus.
Papers of particular interest, published within the period of review, have been highlighted as: of special interest; of outstanding interest.
1. Epidemic dynamics at the human-animal interface
2. Origins of major human infectious diseases
3. Ecology, evolution and classification of bat coronaviruses in the aftermath of SARS
   A systematic summary of the role played by bats in harboring and spreading multiple lineages of coronaviruses in mammals.
4. The evolution and emergence of hantaviruses
5. Population diversity and collective interactions during influenza virus replication and evolution
6. Discovery of a novel hepatovirus (Phopivirus of seals) related to human hepatitis A virus
7. High diversity of picornaviruses in rats from different continents revealed by deep sequencing
8. Bats are a major natural reservoir for hepaciviruses and pegiviruses
9. Large-scale phylogenomic analysis reveals the complex evolutionary history of rabies virus in multiple carnivore hosts
10. Viral diversity of house mice in
11. Deciphering the bat virome catalog to better understand the ecological diversity of bat viruses and the bat origin of emerging infectious diseases
12. What does virus evolution tell us about virus origins?
13. Viral encephalopathy and retinopathy in aquaculture: a review
14. From fish to frogs and beyond: impact and host range of emergent ranaviruses
15. Viruses of lower vertebrates
16. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses
   Major phylogenetic analysis of RNA viruses from a diverse range of arthropods, showing that arthropods play an important role in the evolution and transmission of RNA viruses. The very important virus Chuvirus was identified.
17. Redefining the invertebrate RNA virosphere
   A transformative study in RNA virus diversity and evolution. Shows that meta-transcriptomics revolutionizes the efficiency and scope of pathogen discovery. Described at least 1445 novel RNA viruses.
18. A reassessment of the evolutionary timescale of bat rabies viruses based upon glycoprotein gene sequences
19. The evolution of endogenous viral elements
20. The evolutionary history of vertebrate RNA viruses
   This revealed the origin and evolutionary history of vertebrate viruses. It demonstrated that fish carry an enormous diversity of viruses and a long-term co-divergence between RNA viruses and their vertebrate hosts.
21. Freimer NB: Arteriviruses, pegiviruses, and lentiviruses are common among wild African monkeys
22. Bats host major mammalian paramyxoviruses
   A high diversity of paramyxoviruses was discovered. Shows that bats are important hosts for paramyxoviruses.
23. Global avian influenza surveillance in wild birds: a strategy to capture viral diversity
24. Discovery of seven novel mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus
25. Phylogeny and origins of hantaviruses harbored by bats, insectivores, and rodents
   Major phylogenetic analysis of hantaviruses from a diverse range of mammalian taxa, showing the occurrence of both co-divergence and cross-species transmission in hantavirus evolution.
26. Comparative sequence analyses of sixteen reptilian paramyxoviruses
27. Virus taxonomy: classification and nomenclature of viruses. Seventh report of the International Committee on Taxonomy of Viruses. Encyclopedia Virol
28. Virus taxonomy: classification and nomenclature of viruses: ninth report of the international
29. Paramyxoviruses in reptiles: a review
30. Host range, host specificity and hypothesized host shift events among viruses of lower vertebrates
31. Isolation, identification, and characterization of novel arenaviruses, the etiological agents of boid inclusion body disease
32. Using metagenomics to characterize an expanding virosphere
33. Redefines the diversity and genome evolution history of the Flaviviridae. A successful application of meta-transcriptomics to pathogen discovery from both arthropods and vertebrates, and documented the first evidence of hepacivirus in sharks.
34. Evolution of hantaviruses: co-speciation with reservoir hosts for more than 100 MYR
35. Interspecies transmission and emergence of novel viruses: lessons from bats and birds
36. Cross-species transmission in the speciation of the currently known murinae-associated hantaviruses
37. Migration of Norway rats resulted in the worldwide distribution of Seoul hantavirus today
38. Extensive diversity of coronaviruses in bats from China
39. Comparative analysis estimates the relative frequencies of co-divergence and cross-species transmission within viral families
   Reveals that 'host-jumping' plays an important role in shaping virus macroevolution.
40. Endogenous nonretroviral RNA virus elements in mammalian genomes
41. Filoviruses are ancient and integrated into mammalian genomes
42. A tick-borne segmented RNA virus contains genome segments derived from unsegmented viral ancestors
43. Backbone of RNA viruses uncovered
This study was supported by the National Natural Science Foundation of China (Grants 81861138003, 81672057) and the Special National Project on Research and Development of Key Biosafety Technologies (2016YFC1201900, 2016YFC1200101). Holmes EC is funded by an ARC Australian Laureate Fellowship (FL170100022).