key: cord-1002828-y8c60w4a
authors: Burrell, Christopher J.; Howard, Colin R.; Murphy, Frederick A.
title: Classification of Viruses and Phylogenetic Relationships
date: 2016-11-11
journal: Fenner and White's Medical Virology
DOI: 10.1016/b978-0-12-375156-0.00002-3
sha: 28198db549fd736761dee92488554289c7aa8a94
doc_id: 1002828
cord_uid: y8c60w4a
The taxonomy of viruses represents a unique classification system that recognizes boundaries within what at first sight appears to be a continuum of properties. Genome sequencing has brought into sharp debate the origin of viruses, with RNA viruses perhaps having a separate evolutionary lineage. The criteria adopted for assessing the causal linkage between virus and disease deviate from those normally adopted for bacterial diseases and reflect those unique properties that underlie the principles of virus classification.
Virus taxonomy brings into sharp focus the debate about the true nature of viruses. A comprehensive classification system should define boundaries within what may at first appear as a continuum of properties. This is often most challenging at the level of genome sequence analysis. The rules and processes that have been developed are unique to the science of virology, and necessary to accommodate the astonishing variety of viruses. There is now evidence that probably every organism in the biological world can be infected by at least one virus. Indeed it has been estimated that viruses represent the most abundant biological entities on the planet, existing as pathogens or silent passengers in humans and other animals, plants, invertebrates, protozoa, fungi, and bacteria. To date more than 4000 different viruses and 30,000 different strains and subtypes have been recognized, with particular strains and subtypes often having significant public health importance. Several hundred different viruses are known to cause disease in humans, although this is a small fraction of those viruses encountered in the surrounding environment.
Since all viruses, whatever the host, share the properties described in the preceding chapter, virologists have developed a single system of classification and nomenclature that covers all viruses; this system is overseen by the International Committee on Taxonomy of Viruses (ICTV). One challenge of virus classification is to define evolutionary relationships between viruses when minor changes in molecular structures may give rise to pathogens with radically different properties (Fig. 2.1). Although it is hierarchical and at most levels reflects evolutionary relationships, the taxonomy of viruses is deliberately non-systematic; that is, there is no intent to relate all viruses to an ancient evolutionary root. In fact, there is good evidence for several separate roots. The earliest efforts to classify viruses were based upon host organism species, common clinical and pathological properties, tropism for particular tissues and organs, and common ecological and transmission characteristics. For example, viruses that cause hepatitis (e.g., hepatitis A virus, family Picornaviridae; hepatitis B virus, family Hepadnaviridae; hepatitis C virus, family Flaviviridae; and Rift Valley fever virus, family Bunyaviridae) might have been brought together as "the hepatitis viruses." Such systems have now been superseded. The initial principles for identifying and distinguishing different viruses involved giving equal weight to the importance of:
1. type of nucleic acid (DNA or RNA);
2. virion size, as determined by ultrafiltration and electron microscopy;
3. virion morphology, as determined by electron microscopy;
4. virion stability, as determined by varying pH and temperature, exposure to lipid solvents and detergents, etc.; and
5. virion antigenicity, as determined by various serological methods.
This approach was practicable in the era before molecular biology, as these characteristics had already been determined for a large number of viruses, and thus these properties could be used to build a taxonomic framework. Subsequently it has been necessary in most cases to determine only a few characteristics in order to place a newly described virus into an established taxon, as a starting point for further work to define its relationship with other members. For example, an isolate from the respiratory tract of a child with croup, identified by negative contrast electron microscopy as an adenovirus, might be submitted immediately for serological identification. It would certainly turn out to be a member of the family Adenoviridae, genus Mastadenovirus (the adenoviruses of mammals), and would be serologically identified as one of the >50 human adenoviruses, or perhaps it would turn out to be a new human adenovirus! Nowadays, the primary criteria for delineation of the main viral taxa are: Sequencing, or partial sequencing, of the viral genome provides powerful taxonomic information and now is often done very early in the identification process. Reference genome sequences for all viral taxa are available in public databases (e.g., GenBank, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States: ). Such an approach in most cases allows one to immediately place a virus in a specific taxon.
The universal system of viral taxonomy recognizes five levels, namely order, family, subfamily, genus, and species. The names of orders end with the suffix -virales, families with the suffix -viridae, subfamilies with the suffix -virinae, and genera with the suffix -virus. The names of species also end with the term virus, either as a separate word or as a suffix (according to historic precedence).
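The suffix conventions just described lend themselves to a simple lookup. The following is a minimal sketch in Python, assuming only the suffixes listed in the text; the function name `infer_rank` and the label "genus or species" are illustrative choices rather than part of any ICTV tool, and a suffix match is of course no substitute for consulting the current ICTV taxonomy.

```python
# Suffix-to-rank lookup based on the naming conventions described above.
# The function name and the "genus or species" label are illustrative
# assumptions, not part of any ICTV tool.
RANK_SUFFIXES = (
    ("virales", "order"),
    ("viridae", "family"),
    ("virinae", "subfamily"),
)


def infer_rank(name: str) -> str:
    """Guess the taxonomic rank of a formal virus taxon name from its suffix."""
    cleaned = name.strip().lower()
    for suffix, rank in RANK_SUFFIXES:
        if cleaned.endswith(suffix):
            return rank
    if cleaned.endswith("virus"):
        # "virus" as a separate word is described only for species names;
        # as a suffix it may mark either a genus or a species.
        return "species" if " " in cleaned else "genus or species"
    return "unknown"


if __name__ == "__main__":
    for taxon in ("Mononegavirales", "Paramyxoviridae", "Morbillivirus", "Measles virus"):
        print(f"{taxon}: {infer_rank(taxon)}")
```

A single-word name ending in -virus is deliberately reported as ambiguous, because the text notes that species names may also carry "virus" as a suffix.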
Lower levels, such as subspecies, strains, and variants, are established for practical purposes such as diagnostics, vaccine development, etc., but this is not a matter of formal classification, and there are neither universal definitions nor any standard universal nomenclature.
FIGURE 2.1 Diagram illustrating the shapes and sizes of viruses that infect vertebrates. The virions are drawn to scale, but artistic license has been used in representing their structure. In some, the cross-sectional structures of capsid and envelope are shown, with a representation of the genome; with the very small virions, only their size and symmetry are depicted. Reproduced from King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.)
As of 2015 the universal taxonomy system for viruses encompasses seven orders, four of which contain human and animal pathogens (Picornavirales, Herpesvirales, Mononegavirales, and Nidovirales); 78 families, 27 of which contain human and/or animal pathogens; 348 genera; and 2285 species of viruses (Table 2.1). This situation is constantly changing, and the interested reader should consult the ICTV website for updates (http://www.ictvonline.org). The universal taxonomy system is nearly complete at the level of families and genera; that is, virtually all of the viruses mentioned in this book have been placed within a family and assigned to a genus, although there are some "floating genera" where family construction is not yet complete. Subfamilies are used only where needed to deal with very complex interrelationships among the viruses within a particular family. Virus families are broadly divisible into those with DNA genomes and those with RNA genomes. Viruses within each family possess broadly similar genome structure, virion morphology, and replication strategy. Subfamilies are distinguished in cases where some members of a family can be grouped as possessing distinct and unique properties.
Orders are used to group together those virus families with related but distant phylogenetic properties (e.g., conserved genes, sequences, or domains). Again, since not all viruses derived from a common ancestor, there is no intent to construct a unified viral evolutionary tree. Genera are used to bring together viruses with clear and important evolutionary and biological relationships, which are also usually reflected in antigenic, host range, epidemiological, and/or other relationships. Species is the most important taxon in the systems used to classify all life forms, but it is also the most difficult both to define and to use; this is especially the case with regard to viruses. In recent years the ICTV has determined criteria for defining virus species, with different criteria being used for different families. After some controversy, the ICTV recently redefined the term species: A species is a monophyletic ("relating to or descended from one source or taxon") group of viruses, whose properties can be distinguished from those of other species by multiple criteria. The criteria by which different species within a genus are distinguished shall be established by the appropriate Study Group. These criteria may include, but are not limited to, natural and experimental host range, cell and tissue tropism, pathogenicity, vector specificity, antigenicity, and the degree of relatedness of their genomes or genes…
Below the species level, the identification of particular lineages within an individual virus species is often extremely important because of clinical, epidemiological, or evolutionary significance. Such lineages may be designated as serotypes, genotypes, subtypes, variants, escape mutants, vaccine strains, etc. Many different conventions are used for naming at this level, depending on the virus involved; these distinctions lie outside the remit of the ICTV.
Using the above taxonomic system brings a number of practical benefits, including (1) the ability to relate a newly found virus to similar agents that have already been described and thereby to anticipate some of its possible properties, and (2) the ability to infer possible evolutionary relationships between viruses. Even though there has been little disagreement over the use of this system at the order, family, or genus level, there has been considerable confusion at the species level, partly based on misunderstanding of the difference between the man-made taxonomic construction, the species, and the actual entity, the virus. In this book formal ICTV taxonomy and nomenclature will be cited, but virus names will be in the English vernacular.
The discovery of mimiviruses (viruses infecting the protozoan Acanthamoeba) in the last decade has challenged the traditional concept of a virus. The mimivirus genome is able to direct much more than the replication of its own DNA genome, coding as it does for a large number of proteins with functions resembling some eukaryotic proteins and a large number of proteins of unknown function. At the time of writing, no mimivirus-like agent causing human illness has yet been found; still, the discovery of mimiviruses has had a profound influence on our understanding of virus evolution and on our sense of what is yet to be discovered. A full discussion regarding the origin of viruses is outside the scope of this book; suffice it to say that some virologists argue that RNA viruses evolved many aeons before the appearance of DNA viruses.
In formal usage, the first letters of virus order, family, subfamily, genus, and species names are capitalized and the terms are printed in italics. Further words making up a species name are not capitalized unless they are derived from a place name (e.g., the species St. Louis encephalitis virus).
The first letter of the names of specific viruses having the status of tentative species is capitalized, but the names are not italicized. In formal usage, the identification of the taxon precedes the name; for example: "… the family Paramyxoviridae" or "… the genus Morbillivirus." The following are some illustrative examples of formal taxonomic usage: In informal vernacular usage, all terms are written in lower case script (except those derived directly from place names); these are not italicized, do not include the formal suffix, and the name of the taxon follows the name. For example, "…the picornavirus family," "…the enterovirus genus," "poliovirus 1." One particular problem in vernacular nomenclature lies in the historic use of the same root terms in family and genus names, so that it is sometimes difficult to determine which level is being cited. For example, the vernacular name "bunyavirus" might refer to the family Bunyaviridae, to the genus Orthobunyavirus, or perhaps even to one particular species, Bunyamwera virus. The solution to this problem is to add an extra word to formally identify which taxon level is being referred to; for example, when referring vernacularly to Bunyamwera virus (capitalized, because the name derives from a place name), a full vernacular description would be "Bunyamwera virus, a member of the genus Orthobunyavirus in the family Bunyaviridae…" For each genus there is a type species assigned that creates a link between the genus and the species. A second problem lies in what seems to be an arbitrary incorporation of the root term, "virus," in some virus names and its separation as a detached word in others, for example, poliovirus vs. measles virus. The basis for this lies in history: some, but not all, of the viruses isolated early on assumed the former name style, whereas most viruses discovered more recently have been identified using the latter style.
In this book, we have tried to hold to the name style used most commonly for each virus, but since this is mostly a matter of vernacular usage the reader may often find variations.
There are other informal categories of viruses that are practical and in common usage, distinct from the formal universal taxonomic system and the formal and vernacular nomenclature. These are based upon virus tropism and modes of transmission. Most human pathogens are transmitted by inhalation, ingestion, injection (including via arthropod bites), or close contact (including sexual contact), or congenitally.
Enteric viruses are usually acquired by ingestion (fecal-oral transmission) and replicate primarily in the intestinal tract. The term is usually restricted to viruses that remain localized in the intestinal tract, rather than causing generalized infections. Enteric viruses are included in the families Picornaviridae (genus Enterovirus), Caliciviridae, Astroviridae, Coronaviridae, Reoviridae (genera Rotavirus and Orthoreovirus), Parvoviridae, and Adenoviridae.
Respiratory viruses are usually acquired by inhalation (respiratory transmission) or by fomites (inanimate objects carrying virus contagion) and replicate primarily in the respiratory tract. The term is usually restricted to viruses that remain localized in the respiratory tract, rather than causing generalized infections. Respiratory viruses are included in the families Picornaviridae (genus Enterovirus), Caliciviridae, Coronaviridae, Paramyxoviridae (genera Respirovirus, Rubulavirus, Pneumovirus, and Metapneumovirus), Orthomyxoviridae, and Adenoviridae.
Arboviruses (from "arthropod-borne viruses") replicate in hematophagous (blood-feeding) arthropod hosts such as mosquitoes and ticks, and are then transmitted by bite to vertebrates, wherein the virus replicates and produces viremia of sufficient magnitude to infect other blood-feeding arthropods. In all cases, viruses replicate in the arthropod vector prior to further transmission: thus, the cycle is perpetuated. The occasional passive transfer of virus on contaminated mouthparts ("the flying pin") does not constitute sufficient grounds for a virus to be identified as an arbovirus. Arboviruses are included in the families Togaviridae, Flaviviridae, Rhabdoviridae, Bunyaviridae, and Reoviridae (genera Orbivirus and Coltivirus).
Blood-borne viruses are those that are typically transmitted by transfusion of blood or blood products, by sharing of intravenous injecting equipment, and by other mechanisms of parenteral transfer of blood or body fluids. Some are also transmitted by sexual contact (sexually transmitted viruses). This group includes hepatitis B, C, and D viruses, HIV-1 and -2, and HTLV-1 and -2; other viruses can also be transmitted occasionally by this route.
Hepatitis viruses are grouped as such because the main target organ for these viruses is the liver. Hepatitis A, B, C, D, and E viruses each belong to completely unrelated taxonomic families.
Oncogenic viruses usually cause persistent infection and may produce transformation of host cells, which may in turn progress to malignancy. Viruses that have oncogenic potential, in experimental animals or in nature, are included in the families Herpesviridae, Adenoviridae, Papillomaviridae, Polyomaviridae, Hepadnaviridae, Retroviridae, and Flaviviridae.
One of the landmarks in the history of infectious diseases was the development of the Henle-Koch postulates that established the evidence required to prove a causal relationship between a particular infectious agent and a particular disease. These simple postulates were originally drawn up for bacteria, but were revised in 1937 by Thomas Rivers and again in 1982 by Alfred Evans in attempts to accommodate the special problem of proving disease causation by viruses.
In many cases, virologists have had to rely on indirect causal evidence, with associations based on epidemiology and patterns of antibody prevalence among populations. The framework of virus taxonomy, again, plays a role, especially in trying to distinguish an etiological relationship between a virus and a given disease from a coincidental or opportunistic one. Particular difficulty arises where a disease occurs in only a small fraction of infected individuals, where the same apparent disease can be caused by more than one agent, and in various chronic diseases and certain cancers. These difficulties are compounded in many instances where diseases cannot be reproduced by inoculation of experimental animals, or where the discovered viruses cannot be grown in animals or cell culture; there may even be a "hit and run" relationship where the causative virus is no longer present in the afflicted individual. Thus scientists have to evaluate the probability of "guilt by association," a difficult procedure that relies heavily on epidemiological observations. The Henle-Koch postulates were reworked again in 1996 by David Relman and David Fredricks as more and more genomic sequencing criteria came to dominate the subject (Table 2.2). As a test of the value of these criteria, one can consider the level of proof that the human immunodeficiency viruses, HIV-1 and HIV-2, are the etiological agents of human acquired immunodeficiency syndrome (AIDS) (Table 2.3). Early in the investigation of AIDS, before its etiology was established, many kinds of viruses were isolated from patients and many candidate etiological agents and other theories were advanced. The prediction that the etiological agent would turn out to be a member of the family Retroviridae was based upon years of research on animal retroviral diseases and many points of similarity with some characteristics of AIDS.
Later, after human immunodeficiency virus 1 (HIV-1) was discovered, the morphological similarity of this virus to equine infectious anemia virus, a prototypic member of the genus Lentivirus, family Retroviridae, highlighted the usefulness of the universal viral taxonomic system and of animal lentiviruses as models for AIDS. In other examples, the causal relationships of Epstein-Barr (EB) virus to the disease infectious mononucleosis, and of Australia antigen (later known as hepatitis B surface antigen) to clinical hepatitis, were each established by matching serological evidence of acute infection with the timing of onset of clinical disease. Further, the complex role of EB virus in Burkitt's lymphoma was investigated in a large prospective study carried out by the International Agency for Research on Cancer (IARC) on 45,000 children in an area of high incidence of Burkitt's lymphoma in Africa. This showed that: 1. EB virus infection preceded development of the tumors by 7 to 54 months; 2. exceptionally high EB virus antibody titers often preceded the appearance of tumors; and 3. antibody titers to other viruses were not elevated. In addition, it was demonstrated that the EB virus genome is always present in the cells of Burkitt's lymphomas among These studies are examples of important concepts now widely understood in situations where a virus has been shown to cause a specific disease, namely that not all cases of the infection may necessarily develop the clinical disease, and not all cases of the clinical disease may be caused by the particular virus in question. Thus, for many associations between a virus and a clinical disease, the concept of infection representing a "risk factor" is more appropriate than it being an absolute "cause." 
With modern diagnostic methods, viruses are now frequently recovered from individuals with some ongoing disease; however, careful work is essential in such cases to distinguish a true causative role from an unrelated infection of no clinical significance occurring at the same time. The breathtaking advances in genome sequencing now enable the complete genomes of many hundreds of virus isolates to be sequenced in a matter of days, if not hours. Multiple sequence alignment and the construction of phylogenetic trees are now commonplace when virologists are confronted with either a potentially new virus or an isolate with new or unexpected properties. These data are rapidly challenging previous ideas about the origin and evolution of many viruses of medical importance. Detailed phylogenetic analysis of RNA viruses in particular sometimes provides unexpected answers that in turn create more questions; for example, hepadnaviruses share a reverse transcriptase-based replication strategy with the caulimoviruses of plants: does this reflect a common ancestor or convergent evolution? Deep evolutionary relationships among the higher virus taxa have led to the construction of several Orders: the Herpesvirales, Mononegavirales, Nidovirales, and Picornavirales. The common conserved sequences employed here are at the lower limit of significance, but similarities in some functional and structural protein domains still appear among otherwise unrelated viruses in various taxa. Sequence analyses also suggest that it is unlikely that many more associations of diverse taxa will be found that warrant construction of further Orders. At the other extreme, namely clarifying the phylogenetic relationships among viruses in the same taxa (i.e., families or genera), great progress is being made continually.
For example, the origin of the 2009 influenza (H1N1) pandemic has been found to be complex indeed: the virus is a reassortant with genes from four different ancestral viruses, namely North American swine influenza, North American avian influenza, human influenza, and swine influenza virus typically found in Asia and Europe. Similarly, some member viruses of the family Bunyaviridae have been found to be natural reassortants with genes from known and unknown ancestors. Thus, the development of a robust, yet flexible and continually evolving taxonomic system for viruses underpins, and gives structure to, all facets of research, management, and control of virus diseases.
TABLE 2.2
1. Are viral nucleic acid sequences detected in most (all) cases of disease?
2. Specificity of the association. Are viral nucleic acid sequences localized to diseased tissues, and not to healthy tissues? Is the frequency of virus infection reduced significantly in healthy individuals?
3. Response to treatment. Does the copy number of viral nucleic acid sequences fall with resolution of illness or effective treatment?
4. Does infection with the virus precede and predict disease onset?
5. Do the known biological properties of the virus make sense in terms of the disease?
6. Biological gradient. Is the amount of virus higher in patients with severe disease than it is in persons with mild disease? Is the amount of virus higher in diseased tissues than in healthy tissues?
7. Are these findings reproducible by multiple laboratories and by multiple investigators?
Recently agreed changes to the International Code of Virus Classification and Nomenclature
A strategy to estimate unknown viral diversity in mammals
Sequence-based identification of microbial pathogens: a reconsideration of Koch's postulates
The Evolution and Emergence of RNA Viruses
2011. Ninth Report of the International Committee for the Taxonomy of Viruses
Order to the viral universe