key: cord-1021853-wbrzoyvr authors: Ho, Chi-Chun; Lau, Susanna K. P.; Woo, Patrick C. Y. title: Romance of the three domains: how cladistics transformed the classification of cellular organisms date: 2013-07-19 journal: Protein & Cell DOI: 10.1007/s13238-013-3050-9 sha: f7ed34959b7d1ce9b6150adb8a7a6eec590586a4 doc_id: 1021853 cord_uid: wbrzoyvr
Cladistics is a biological philosophy that uses genealogical relationships among species and an inferred sequence of divergence as the basis of classification. This review critically surveys the chronological development of biological classification from Aristotle through our postgenomic era with a central focus on cladistics. In 1957, Julian Huxley coined the term cladogenesis to denote all splitting, from subspeciation to the divergence of phyla and kingdoms. In 1966, the English translation of Willi Hennig’s 1950 work, Phylogenetic Systematics, was published; it met strong opposition from pheneticists, such as the numerical taxonomists Peter Sneath and Robert Sokal, and from the evolutionary taxonomist Ernst Mayr, and sparked acrimonious debates in 1960–1980. In 1977–1990, Carl Woese pioneered the use of small subunit rRNA gene sequences to delimit the three domains of cellular life and established the major prokaryotic phyla. Cladistics has since dominated taxonomy. Although cladistics is compatible with modern microbiological observations, i.e. organisms with unusual phenotypes, restricted expression of characteristics and occasional uncultivability, increasing recognition of the pervasiveness and abundance of horizontal gene transfer has challenged its relevance and validity. The mosaic nature of eukaryotic and prokaryotic genomes was also gradually discovered. In the mid-2000s, high-throughput and whole-genome sequencing became routine, and the complex genealogies of organisms have led to the proposal of a reticulated web of life.
While genomics only indirectly leads to an understanding of functional adaptations to ecological niches, computational modeling of entire organisms is underway and the gap between genomics and phenetics may soon be bridged. Controversies are not expected to settle, as taxonomic classifications will remain subjective and serve the human scientist, not the classified.
"The empire, long divided, must unite; long united, must divide." -The Romance of the Three Kingdoms, Guanzhong Luo
The history of biological classification is long and tortuous (Fig. 1). As biological beings are, obviously, the earliest things that our ancient ancestors encountered, and classification allows entities to be better and more easily understood, the long history of biological classification should not seem surprising. In terms of lasting influence, Aristotle's (384 BC-322 BC) classification of living organisms can be ranked alongside those of Carl Linnaeus and modern taxonomists such as Ernst Mayr
On 7th September 1957, Julian Huxley's two-page article, titled The Three Types of Evolutionary Process, was published (Huxley, 1957). He extended the proposal by Bernhard Rensch (1954) and defined the term cladogenesis "to denote all splitting, from subspeciation through adaptive radiation to the divergence of phyla and kingdoms". Although some have suggested that the terminology of a clade had been used earlier in the literature by Lucien Cuénot (1866-1951) or Ernst Haeckel (1834-1919), it has been pointed out that their use of the
beasts such as lions and elephants were ranked among the highest animals and birds, capable of flight in air, were ranked higher than fishes. Plants were considered lacking in intelligence and sensitivity and therefore lower than the animals.
It should be noted that the classification was not strictly biological, as the lower levels of the scale actually went on to include minerals such as diamond (with hardness and luster, the highest mineral), marble and grit (the lowest mineral); the scala naturae portrayed the then prevalent belief that biological beings were integral to the world created by God and that entities therefore had a logical hierarchy or order based on reason and logic. Even if we do not consider the creationist perspective intrinsic to this early classification system, the hierarchical view of life forms can also be found to influence such work as Linnaeus's Systema Naturae and the rampant notion that certain extant organisms are higher (e.g. Homo sapiens) and some are more primitive (e.g. Escherichia coli). We note that, from an evolutionary point of view, such comments are not justified, as both humans and bacteria have the same origin from the last universal common ancestor and have undergone evolution and natural selection for the same length of time. Indeed, if the actual amount of sequence evolution is taken into account, it becomes even harder to justify the opinion that humans are more evolved than even monkeys and apes (Li and Tanimura, 1987).
This review has two objectives: to present a critical appraisal of the history and development of cladistics and to evaluate its impact on the classification of cellular organisms from our understanding as microbiologists. Starting with such pioneers as Julian Huxley, who formally proposed the term cladogenesis, and Willi Hennig, the German biologist who is
While monophyletic units can be pin-pointed exactly in a strictly bifurcating phylogeny, biologists occasionally disagree as to whether a group of organisms is delimitable from another. The delimitability was later dropped from the definition when Huxley presented at the Systematics Association (Huxley, 1959).
In modern molecular phylogeny, the clade as a monophyletic group is defined as a group of organisms that have been derived from a common ancestor, including the common ancestor itself (Page and Holmes, 1998; Graur and Li, 2000). Cladogenesis is now also treated as a synonym for genetic speciation (US National Library of Medicine, 2006). It is unknown whether Huxley anticipated the rather acrimonious debate between the cladists and pheneticists in the decades following his initial proposal. Believing that convergence and parallelism were difficult to detect given the incompleteness of the fossil record and that "a thoroughgoing phylogenetic classification does not recognize such (non-monophyletic) grades at all", Huxley suggested a double system to express both biological improvement and evolutionary relationships (Huxley, 1959). This ambivalence was not shared by contemporary cladists. When Hennig later introduced his landmark work, Phylogenetic Systematics, to the English-speaking world, he noted that "There is, in fact, a widespread notion that phylogenetic systematics, at least in those groups of animals for which no fossil finds are available, possesses no method of its own… This notion is false" (Hennig, 1965). Notably, Hennig's original publication of the Grundzüge einer Theorie der phylogenetischen Systematik (in English "Outlines of a Theory of Phylogenetic Systematics") actually predates that of Huxley (Hennig, 1950). In the later English translation of the work, known also as Phylogenetic Systematics, Hennig established the major cladistic principles, including the genealogical relationship among species as the basis of clades; shared-derived (synapomorphic) characters as the sole source of phylogenetic evidence; and the consistency of a proposed phylogeny with multiple lines of evidence (Hennig, 1966).
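The definition of a clade given above can be made concrete with a minimal sketch. It assumes a rooted tree written as nested tuples with hypothetical taxa A to E; the tree, the taxon names and the helper functions are illustrative assumptions and do not come from the works cited. A set of taxa is treated as monophyletic only if it is exactly the set of leaves descending from a single node.

```python
def leaf_set(node):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every node in the tree, leaves included."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly the set of leaves descending from one node."""
    return frozenset(group) in set(clades(tree))


# Hypothetical rooted tree: (A, B) is sister to (C, (D, E)).
toy_tree = (("A", "B"), ("C", ("D", "E")))

print(is_monophyletic(toy_tree, {"D", "E"}))       # True: one node holds exactly D and E
print(is_monophyletic(toy_tree, {"C", "D", "E"}))  # True: the ancestor of C, D and E
print(is_monophyletic(toy_tree, {"C", "D"}))       # False: E shares their common ancestor
```

In this toy case the group {C, D} fails the test because it leaves out E, a descendant of the same common ancestor, which is the sense in which a grouping is called paraphyletic.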
While not discrediting the fossil record, Hennig logically argued, from his emphasis on synapomorphic traits, that the shared-ancestral (plesiomorphic) characters that had been the focus of fossil analysis were relatively unimportant for establishing a phylogeny; from a phylogenetic standpoint, he asserted that, instead, apomorphic characters on fossils should be sought and analysed after careful exclusion of convergence and retrogression (Hennig, 1966).
As a side note, Hennig was not the first to point out the difference between homology and analogy. As early as the mid-19th century, Darwin had revised the definition of homology to include common ancestry in addition to mere structural similarities in The Origin of Species (Darwin, 1859), and the concept was further extended by Hubbs (1944) to encompass functional similarities from common descent. In the meantime, however, some insisted that homology should be limited to the description of structural similarities and should not cover functional, physiological and behavioral correspondence from common descent (Boyden, 1947). In retrospect, this reluctance to change was understandable because of the difficulty, if not the practical impossibility, of exactly determining the function of a structure even given a comprehensive and well-preserved fossil record. The use of molecular techniques to determine the homology and function of biological structures was not foreseen.
Hennig's cladistics was not universally accepted in the systematics community. As the name Phylogenetic Systematics implies, Hennig's method had the goal of changing the grouping and ranking of organisms based on cladistic principles, i.e. by the recency of common descent. While some zoologists voiced their support for the new method: "For if we do desert it, systematics inevitably returns to its pre-Darwinian status as a method of merely pigeonholing information."
(Myers, 1952), or accepted and promoted it through field studies (Brundin, 1966), prominent taxonomists such as Ernst Mayr and Robert Sokal strongly rejected the use of cladistics in classification. Mayr believed that the branching pattern of a cladogram did not convey as much information as an "evolutionary classification" which took into account the adaptive features of different taxa and the frequency of branching, and that the cladistic proposal, once implemented, would "produce classifications that are unbalanced and meaningless" (Mayr, 1974). Evolutionary systematists, represented by the Simpson-Mayr school and its supporters, recognized cladistics as an important contribution to systematics but considered the application of cladistic principles to re-classify living organisms as being "neither necessary nor desirable" (Ashlock, 1974).
As a generation of microbiologists nurtured in modern phylogenetic methods, we note that many of the polarized arguments between the more phenetically-inclined "evolutionary taxonomists" and the cladists could have been reconciled had the wealth of genetic information now available been within their reach. Arguments stemming from the impracticability of cladistics, such as the difficulty of determining homology and the direction of evolutionary relationships; differentiation between parallel and convergent evolution; and doubts about the information content of ancestral characters (Mayr, 1974), have gradually been rendered obsolete as molecular techniques, such as serum protein electrophoresis (Johnson and Wicks, 1959), multi-locus enzyme electrophoresis (Avise, 1974), DNA-DNA hybridization (McCarthy and Bolton, 1963; Hyer et al., 1964), protein sequence comparison (Margoliash et al., 1961; Zuckerkandl and Pauling, 1965) and, finally, DNA sequencing (Sanger et al., 1973, 1977), were continuously developed and popularized.
We, however, agree with Mayr's view that cladistics, by its nature, does not entirely reflect functional evolution, which is occasionally biologically significant; that cladistic classifications therefore may not be good heuristics for phenotypic characters; and that the indiscriminate use of cladistics in taxonomy could lead to the generation of
nelle of a eukaryote (i.e. duckweed chloroplast), and (ii) the methanogens (e.g. Methanobacterium thermoautotrophicum and Methanosarcina barkeri), which Woese and Fox named "archaebacteria" (Woese et al., 1978). To justify their classification, which was mainly based on a rudimentary measure of ribosomal RNA sequence similarity, it was further pointed out that the cell walls of the methanogens they studied differed from those of other bacteria by lacking peptidoglycan, and that the biochemical and transfer RNA modification pathways of the methanogens were distinct from those of other bacteria as well as the eukaryotic organisms (Woese and .
The presumed monophyly of the prokaryotes was formally challenged thirteen years later, when the proposal for the domains Archaea, Bacteria and Eucarya was published (Woese et al., 1990a). By comparing the primary sequences and the inferred secondary structures of the SSU ribosomal RNA gene of the three groups; the usage pattern of RNA polymerases; some limited bacterial fossil record; and a phylogenetic rooting strategy using a gene duplication event inferred to be ancestral to the divergence of the groups, a "natural" (i.e. cladistic) classification comprising three domains was formulated. In the rooted phylogenetic tree presented in the article, Domain Archaea was more closely related to Domain Eucarya than to Domain Bacteria, and two Kingdoms, Kingdom Euryarchaeota and Kingdom Crenarchaeota, were also proposed to contain two monophyletic lineages within Domain Archaea.
In the interim between the two proposals, Woese and his co-workers' numerous publications on prokaryotic phylogeny amounted to what we would call the "cladist's reign". Earlier, 16S rRNA fingerprinting analysis had supported the prokaryotic origin of the chloroplast (Zablen et al., 1975); 16S rRNA gene sequence analysis in this period supported the prokaryotic origin of the mitochondrion (Yang et al., 1985b). A number of prokaryotic species were classified or re-classified as Archaea based on 16S rRNA gene sequence evidence (Magrum et al., 1978; Woese et al., 1980b; Gupta et al., 1983; Woese et al., 1984a; Olsen et al., 1985; Achenbach-Richter et al., 1988; Burggraf et al., 1990). A methanogenic origin of cellular life was proposed, supported by the presumed rooting of the Tree of Life near the Bacteria-Archaea split (Woese, 1979). The basic framework on which detailed analysis of 16S rRNA gene sequences is based was established (Woese et al., 1980a; Noller and Woese, 1981). The traditional phenetic classification of cell wall-deficient bacteria was revised (Woese et al., 1980b). More importantly, the major bacterial and archaeal phyla were established in this period based on rRNA gene sequence evidence (Paster et al., 1984; Woese et al., 1984b; Oyaizu and Woese, 1985; Paster et al., 1985; Yang et al., 1985a; Oyaizu et al., 1987; Weisburg et al., 1989; Yang and Woese, 1989; Woese et al., 1990, 1990c, 1990d). The cladistics-driven approach adopted by Woese ironically put Mayr's comment in perspective: "this approach by trial and error… led to frequent changes in classifications" (Mayr, 1974), although, as we contend, Woese's cladistic analysis and re-classification were more stable and reliable than Mayr had expected. The
an unmanageable number of extant and extinct taxa (Mayr, 1981).
In these aspects, Sneath and Sokal's numerical taxonomy (Sneath and Sokal, 1962) and its underlying philosophy are of particular interest as they represent a methodological treatment of phenetics with practical value in classification and prediction. We shall return to this topic as we focus on bacterial identification and classification.
The modern, Linnaean classification of biological organisms can be traced back to the 18th century, when the goal of classification was to facilitate the field work of the naturalist-botanist-physician Carolus Linnaeus (1751). Presumably, the inference of a phylogenetic relationship was unimportant, given the once prevalent biblical belief that all species had originated from a special creation and had not been evolving (1611). The goal of a practical classification system, therefore, is to provide positive identification (What is this plant/animal/fungus?), delineation (What are the differences between these two groups of plants/animals/fungi?) and circumscription of organism groups (Do these plants/animals/fungi belong to the same species/genus/family?); cladistic classifications, to this end, have limited "practical value". Certain practical considerations from Linnaeus's days have persisted and remained central to branches of botany, zoology, as well as basic and medical microbiology. A phenetic approach is useful when the goal of classification is to assign static groups for a fixed number of extant species. Even though practicality is central to such and related approaches, they always necessitate subjective judgement (e.g. deciding if a phenotypic character is a biological improvement or a difference worthy of taxonomic demarcation). However, if the aim of classification goes beyond extant, observed biodiversity and the objective of inference includes past biological forms and functions, only a cladistic classification can provide useful information.
In a mixed classification approach, such as the evolutionary classification by Mayr (1974, 1981), or the polyphasic taxonomy commonly employed in basic and clinical microbiology (Colwell, 1970), the cladistic component may actually facilitate the documentation of novel biodiversity.
In 1977, Carl Woese and George Fox published their phylogenetic analysis of some eukaryotic organisms (e.g. baker's yeast and common duckweed), their organelles (e.g. duckweed chloroplast) and some prokaryotic organisms, including a few methanogens. In their landmark study, instead of the two groups that would have been predicted by the phenetic, prokaryotic-eukaryotic delimitation, the organisms (and organelles) could be divided into three groups based on comparative cataloguing of nucleotide hexamers of their small subunit (SSU) ribosomal RNA (Balch et al., 1977; Fox et al., 1977); the prokaryotic organisms could be further divided into two groups, namely, (i) the "eubacteria" which included all the bacterial species considered typical in the study (e.g. Escherichia coli and Bacillus firmus) and also the orga-
2011, 2012) can provide a relatively reliable framework on which classification can be based, as exemplified by the role of the 16S rRNA gene phylogeny in the prokaryotic taxonomic framework (Ludwig and Klenk, 2005).
Organisms may not live to manifest all phenotypic characteristics. It is well-known in clinical microbiology that certain important pathogens, such as Mycobacterium tuberculosis, the causative agent of tuberculosis, take a relatively long time (6-8 weeks) to grow and manifest the subtlest phenotypic characteristics such as the volatile fatty acid profile. This length of time is, ostensibly, short compared with that of animal and plant species with generation times exceeding tens of years and should not impress the zoologists and botanists.
Yet it is common knowledge that phenotypic characteristics of plants and animals, such as the presence and color of flowers, the presence of appendages or cranial size, can be affected by environmental factors and by the exact stage of growth and life cycle. A relevant example may be cited from our field: certain fungal species, collectively known as the dimorphic fungi, exhibit a yeast phase and a mould phase. Not only are the two phases very distinct by visual inspection, they are usually associated with very different profiles of metabolite production (Woo et al., 2010) and potential reproductive strategy. What may complicate the issue further is the reversible nature of the phase switch (Woo et al., 2003c); if the switch had not been recognized and the two phases had been classified into relatively unrelated groups by a phenetic approach, the result could be a taxonomy that is more perplexing than useful. It takes little imagination to realize the vaguely analogous situations of metamorphosis in some animal species and alternation of generations in certain plants, or perhaps the multi-staged life cycles of many protoctista (Burkholder and Glasgow, 1997).
Organisms may leave only molecular evidence for their previous presence. In clinical microbiology, for various reasons, the infectious agent may not be recovered from samples taken from the patient (Woo et al., 2001c). Instead, an infective episode may only be demonstrated by means of immunological assays targeting part of the agent or host responses (Woo et al., 2001b). More recently, the advent of nucleic acid amplification techniques has allowed the direct detection of nucleic acids from clinical specimens as evidence of the presence of an infectious organism. The documentation of novel biodiversity by metagenomics and complete genome sequencing for uncultured organisms (Hallam et al., 2006; Hongoh et al., 2008a, 2008b) therefore seems to be a natural extension of such practice.
If one insists on characterizing a complete specimen of the organism before establishing a new taxon, one wilfully neglects the logical sequence evidence and implicitly attributes this abundance of biological sequences to non-evolutionary processes.
As microbiologists at the interface of clinical and basic microbiology, we are perceptive of the influence of numerical taxonomy and molecular cladistics on classification. It is worth pointing out that classification, as a taxonomic endeavor, is
current taxonomic framework of prokaryotic systematics is still largely compatible with what has been established by Woese (Ludwig and Klenk, 2005).
The impact of Woese's cladistic classification approach was profound. As PCR and DNA sequencing became common molecular techniques by the early 1990s, a number of conserved primers capable of producing nearly full-length SSU rRNA gene amplicons for subsequent sequencing and analysis were published for both prokaryotes (Weisburg et al., 1991) and eukaryotes (Medlin et al., 1988). Distant phylogenetic relationships, arguably beyond the reach of most phenetic methods, were elucidated among animal phyla (Field et al., 1988). Large-scale projects, such as the ARB project (Ludwig et al., 2004), the Ribosomal Database Project (Olsen et al., 1992) and the Tree of Life Web Project (http://tolweb.org), were initiated in this period to help deduce the phylogenetic relationships among living organisms. Limited by cost and the early technical difficulties, the predominant target used for molecular phylogeny was the small subunit (SSU) ribosomal RNA gene, i.e. 
the 16S rRNA gene in prokaryotes and the 18S rRNA gene in eukaryotes, as its moderate length allowed relatively efficient amplification and sequencing for even large-scale studies and the presence of conserved and variable regions along this functionally conserved molecule allowed phylogenetic inference to be made at various depths; this legacy is evident from the sheer number of SSU rRNA gene sequences in various electronic databases (DeSantis et al., 2006; Pruesse et al., 2007; Cole et al., 2009; Woo et al., 2011).
As microbiologists, we think we are justified in mentioning a few facts occasionally neglected by early taxonomists, who were usually botanists or zoologists, although certain salient points, such as the classification of the larval and adult forms of the same insect and quantitative variation of a character, have been properly addressed as the allomorphism of species (Hennig, 1950). These observations originate from our practice and experience in clinical and basic microbiology and are hereby generalized for the wider audience; readers are invited to judge the importance of these observations in their own respective fields.
Organisms can have unusual phenotypic profiles. As phenetic techniques advanced, the characterization of microorganisms progressed from size, shape and staining characteristics to, usually, a battery of biochemical tests (Woo et al., 2000). We reckon that there has been a similar trend in the botanical and zoological sciences.
Given biological variation, it is plausible that organisms with phenotypic profiles atypical of the previously observed organisms used to establish the phenetic classification and/or identification scheme may be ambiguously (Woo et al., 2001a; Lau et al., 2005, 2006b) or incorrectly identified in the phylogenetic or cladistic sense, even if state-of-the-art techniques, such as matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS), are used (Chan et al., 2012). Widely present, housekeeping genes (Woo et al., 2003a, 2003b; Lau et al., 2006a, 2011a) or clade-specific, synapomorphic genomic features (Ho et al.,
quences to describe novel species (Brenner et al., 2005). We contend that this approach is reasonable and reckon that the same approach is applicable to the classification of multicellular eukaryotes as well. In our opinion, a phylogenetic backbone, instead of phenetic clustering or otherwise, provides an objective framework for a taxonomy by illuminating the plethora of phenotypic characteristics with evolutionary insight along the nodes and branches of the phylogenetic tree; the phylogenetic tree, however, does not directly translate itself into a taxonomic nomenclature, precisely because of the varying rates of divergence and evolution. The value of a polyphasic taxonomy, presumably one that is firmly grounded on a properly inferred phylogeny, is the necessary integration of subjective information to derive cut-offs that make a phylogenetic tree meaningful not to its optimization algorithm but to the scientists' minds. To this end, cladistics provides the stable, objective framework which is negligent of (Mayr, 1974), and therefore immune to (Page and Holmes, 1998), the varying tempi of evolution.
The three-domain classification has not been unanimously accepted since its proposal.
With the advancement of cytology and the extensive use of molecular phylogenetics, taxonomists were obliged to change their opinion on kingdom delimitation from time to time. Parallel to Woese's work, Cavalier-Smith proposed a system of nine kingdoms (Cavalier-Smith, 1981), subsequently one with seven kingdoms (Cavalier-Smith, 1986), then another with eight kingdoms and 10 subkingdoms (Cavalier-Smith, 1993) and a more recent one with only six kingdoms (Cavalier-Smith, 1998, 2004). Notably, the Cavalier-Smith school did not subscribe to the splitting of Kingdom Prokaryote and the new Domain concept and instead proposed the use of below-kingdom hierarchies such as Subkingdom, Infrakingdom and Parvkingdom. Stephen Jay Gould (1941-2002), in accord with Cavalier-Smith, commented that the renaming, reclassification and elevation of Archaea to a taxonomic rank above that of a Kingdom "grossly inflates the importance of the differences between the two kingdoms Eubacteria and Archaebacteria" (Cavalier-Smith, 1993; Margulis et al., 2009). We believe that whether all prokaryotic organisms should be classified into a single kingdom is merely subjective and conceptual in the absence of a credible phylogeny. If the phylogenetic tree and sequence of divergence depicted in Woese's initial proposal are true, i.e. 
the Bacteria-Archaea split occurred before the Archaea-Eukaryota split (Iwabe et al., 1989; Cammarano et al., 1992; Brown and Doolittle, 1995; Yang and Roberts, 1995; Baldauf et al., 1996; Gribaldo and Cammarano, 1998; Woese, 2000; Lake et al., 2009; Fournier and Gogarten, 2010) instead of erroneous (Saccone et al., 1995; Lawson et al., 1996; Lopez et al., 1999; Philippe and Forterre, 1999; Caetano-Anolles, 2002; Gribaldo and Philippe, 2002; Bapteste and Brochier, 2004; Lake et al., 2008; Sun and Caetano-Anolles, 2009; Dagan et al., 2010;
a distinct step that is thought to precede the corresponding identification scheme for practical procedures (Brenner et al., 2005). Therefore, the modern use of "numerical" or "phenetic" identification techniques, such as similarity indices from biochemical profiling, DNA-DNA hybridization and BLAST comparison of gene sequences, does not represent a breakdown of the current, cladistics-based taxonomy; this is because the goal of clinical or field identification is often the binning of an organism into an established taxon and only occasionally the recognition of a novel species; a phylogeny is not always useful for identification. To establish the definitive identity of an organism for basic and evolutionary research, however, the rigorous construction of a phylogenetic hypothesis is usually required. As remarked by Page and Holmes, the maximally informative or predictive classifications advocated by pheneticists and evolutionary taxonomists "may or may not be actual phylogenies; if they are, this is merely a happy accident" (Page and Holmes, 1998).
The impact of numerical taxonomy, in a sense the holy grail of phenetics, on classification is hereby examined.
We feel that we are from the right field to comment on this since, in reviewing the three decades of development of numerical taxonomy, Sneath particularly noted its success in microbiology and cited the proportion of publications in the International Journal of Systematic Bacteriology including "numerical relationships" as support (Sneath, 1995). Whilst it is apparent that common identification techniques in clinical microbiology, including the analytical profile index (API) and related methods (Titsworth et al., 1969; Washington et al., 1971), work by the numerical coding of arrays of biochemical test results with a subsequent manual or computerized search in databases to achieve a "numerical identification", whether this also represents "numerical taxonomy" is disputable. Numerical taxonomy, as established by Sneath and Sokal in the early 1960s (Sneath and Sokal, 1962; Sokal, 1963), is a proposition against taxonomic groupings based on phylogenetic deductions. In their 1962 article, Sneath and Sokal argued for the separation of the taxonomic process from phylogenetic speculations, as they believed that phylogenetic classifications have logical pitfalls: by providing just phylogenetic hierarchies representing the time sequence of divergence, a phylogenetic classification does not contain the maximum possible information for the researcher, as would be achieved by a numerical taxonomy using cluster analysis (Sneath and Sokal, 1962). In this way, Sneath and Sokal hoped to achieve "stable phenetic groups" in taxonomy which would also help find "features best suited to making a diagnostic key" (Sneath and Sokal, 1962). Later, they also collaborated with other groups to extend the numerical taxonomy method to the classification of autoimmune diseases (Jones et al., 1970, 1973) and medically important genera of bacteria (Broom and Sneath, 1981; Sneath et al., 1981; Bridge and Sneath, 1983).
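The distinction drawn above between "numerical identification" and "numerical taxonomy" can be illustrated with a minimal sketch. It assumes binary (positive/negative) biochemical test profiles and uses the simple matching coefficient, one common phenetic similarity measure. The profiles, taxon names, threshold and function names are all hypothetical, and the code is not meant to reproduce the API system or the procedures of the cited studies. Identification looks up the closest established profile, whereas the taxonomic step groups the profiles themselves by overall similarity, here with a basic single-linkage rule.

```python
def simple_matching(a, b):
    """Fraction of characters on which two equal-length binary profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of characters")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(unknown, reference):
    """Numerical identification: the reference profile most similar to `unknown`."""
    return max(
        ((name, simple_matching(unknown, profile)) for name, profile in reference.items()),
        key=lambda pair: pair[1],
    )


def single_linkage_groups(profiles, threshold):
    """Phenetic grouping: join a profile to every group holding a member that is
    at least `threshold` similar to it, merging those groups together."""
    groups = []
    for name, profile in profiles.items():
        linked = [g for g in groups
                  if any(simple_matching(profile, profiles[m]) >= threshold for m in g)]
        merged = {name}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# Hypothetical results of six biochemical tests (1 = positive, 0 = negative).
reference = {
    "taxon_1": (1, 1, 0, 0, 1, 0),
    "taxon_2": (1, 1, 0, 1, 1, 0),
    "taxon_3": (0, 0, 1, 1, 0, 1),
}
unknown = (1, 1, 0, 0, 1, 1)

name, similarity = best_match(unknown, reference)
print(name, round(similarity, 3))  # taxon_1 0.833
print(single_linkage_groups(reference, threshold=0.8))
```

Neither step uses any information about descent: the groups depend only on the characters scored and the threshold chosen, which is the sense in which such clusters are phenetic rather than cladistic.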
In 1970, Colwell published her application of numerical taxonomical techniques to derive a polyphasic taxonomy for the genus Vibrio (Colwell, 1970). Still considered the standard approach to describing prokaryotic species, polyphasic taxonomy frequently employs information other than 16S rRNA gene se-
selected loci, which precluded rigorous methods to weight and evaluate conflicting topologies even if present. Very often, the additional data gained from sequencing alternative genetic loci were non-selectively integrated by concatenation into an alignment of an established locus in the hope of giving better resolution or reliability to the resulting phylogenetic tree. Indeed, when software-based methodologies for alignment trimming were first presented (Castresana, 2000), the removal of highly variable and unreliable sequence positions was met with reluctance, since the reduced information content of the trimmed alignment occasionally concerned researchers (Talavera and Castresana, 2007). In the age of high-throughput DNA sequencing, ironically, one of the major issues to emerge was the selection of the more reliable loci among the plethora of targets in the sequenced genome (Salichos and Rokas, 2013).
With whole-genome comparative genomics (Razin, 1997) and the rise of phylogenomics (Sicheritz-Ponten and Andersson, 2001), the dynamic nature of genomes is being increasingly recognized, especially in the most sequenced domains, the Bacteria and Archaea. The long-standing debate over whether horizontal gene transfer has compromised the universal tree of life once again became relevant, no longer because of the paucity of targets but because of the abundance of putative horizontal gene transfers identified.
While early studies supported the separation of the prokaryotic domains (Sicheritz-Ponten and Andersson, 2001), more recent complete genome comparisons have revealed extensive evidence for horizontal genetic transfers in the history of bacterial and archaeal evolution (Koonin and Wolf, 2008). Coupled with pervasive gene loss and the functional plasticity of many genes, this means that consistent phylogenetic signals can only be observed at relatively shallow phylogenetic depths, and the concept of a phylogenetic tree may not be strictly applicable, especially at the base of the tree where the domains diverge (Puigbo et al., 2009). We envisage that there may soon be another debate which ultimately challenges the central tenets of cladistics. Supporters of the tree of life concept have devised novel methods to statistically reconcile the differing topologies of phylogenetic trees from different genes and recover a species tree (Abby et al., 2010; Schliep et al., 2011). Although it has also been suggested that the more efficient way to estimate the common phylogeny underlying the different genes sampled, i.e. the species tree, is the concatenation approach given proper consideration for among-gene heterogeneity (Yang and Rannala, 2012), this proposal has also been questioned based on the total incongruence among all genes observed in a large-scale phylogenomic study (Salichos and Rokas, 2013). Regardless of the actual results obtained, these methodological inconsistencies may render the tree of life proposal more prone to attack by the web of life supporters (Kunin et al., 2005), especially when the methods of phylogenetic and phylogenomic network reconstruction become mature and widely available (Woolley et al., 2008). Why should this challenge the clade concept?
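The incongruence among gene-tree topologies discussed above can be made concrete with a small sketch that counts the clades found in one rooted tree but not in the other, i.e. a rooted form of the Robinson-Foulds distance. Trees are assumed to be written as nested tuples of taxon names. The two gene trees are hypothetical, and this is not the reconciliation method of any of the studies cited.

```python
from typing import FrozenSet, Set


def clades(tree) -> Set[FrozenSet[str]]:
    """Return the leaf sets of all internal nodes other than the root."""
    found: Set[FrozenSet[str]] = set()

    def walk(node, is_root: bool) -> FrozenSet[str]:
        if isinstance(node, str):  # a leaf is a taxon name
            return frozenset([node])
        members = frozenset().union(*(walk(child, False) for child in node))
        if not is_root:
            found.add(members)
        return members

    walk(tree, True)
    return found


def clade_distance(tree_a, tree_b) -> int:
    """Number of clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Two hypothetical gene trees for the same four taxa.
gene_a = ((("A", "B"), "C"), "D")
gene_b = ((("A", "C"), "B"), "D")

distance = clade_distance(gene_a, gene_b)
```

The two trees agree on the clade {A, B, C} but disagree on whether A groups with B or with C, giving a distance of 2. When such disagreements persist across many genes, no single set of clades is supported by all of the data, which is the situation that motivates network representations.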
If one considers the basis of Hennigian cladistics, it is clear that its [...] Valas and Bourne, 2011) or indeterminate due to horizontal gene transfer (Lake and Rivera, 2004; Bapteste and Walsh, 2005), then to unite the prokaryotes into a hierarchy excluding Eukaryota would create paraphyly. On the other hand, to apply such cladistic concepts to a tree of life is to assume that a strictly bifurcating phylogeny existed for the early evolutionary history despite the potentially extensive horizontal gene transfers (Doolittle and Brown, 1994). Apart from alternative roots of the universal tree of life, proponents of a single prokaryotic kingdom (or superkingdom) have also argued from current molecular evidence and highlighted the many reproductive, organizational, developmental and nutritional differences between prokaryotes and eukaryotes, which, according to the proponents, represent whole-organism criteria instead of partial phylogenies. Such polarized debates have even resulted in suggestions to abandon some common terms. The term "bacterial chromosome", for example, "used by molecular biologists to refer to the genophore, is confusing and its use should be avoided" (Margulis et al., 2009). We note, however, that the terminology suggested by Margulis and Chapman mainly served to highlight the differences between the Archaea and Eukaryota, and that their repeated references to the methanogenic and halophilic archaea as "bacteria" are perhaps more confusing to many microbiologists. On the other front, while the division of the prokaryotes into new domains or supra-kingdom hierarchies has not yet been completely settled, cladistics has penetrated the subdomain hierarchies in Eukaryota. One of the more illuminating examples occurred at the interface of the eukaryotic clades, the protists.
Although the protists are still considered an important grade of classification by certain pheneticists (Cavalier-Smith, 2004), the convenience of classification once thought to be an indispensable virtue of this paraphyletic taxon has fallen into disfavor as monophyletic groups became the phylogenetic ideal (Simpson and Roger, 2004). With the advent of phylogenetics, it was recognized that the classification of eukaryotic organisms, especially those that could only be classified on mainly phenetic characters, should follow prokaryotic taxonomy in including gene trees and multi-gene phylogenies for enhanced validity (Keeling et al., 2005). The extensive revisions brought to protist taxonomy by molecular phylogenetics aggravated the incompatibilities between the International Code of Zoological Nomenclature and the International Code of Botanical Nomenclature to the point that neither was considered adequate for a practical taxonomy (Adl et al., 2007). The paraphyletic Kingdom Protoctista, although still included in many junior and college science curricula, has also been noted as being a "phylogenetic dustbin" (Bradfield, 2009). In thoroughly cladistics-based databases such as the NCBI Taxonomy database, it is perhaps reasonable that Protoctista is not recognized as a valid taxonomic rank (Federhen, 2012). Then again, less optimistically, one may doubt whether current taxon sampling, methodologies and computational power allow the phylogenetic ideals of the cladists (or perhaps the practicality-driven taxonomy of the pheneticists) to be found, since even with genomic understanding the functional adaptations of organisms to their specific niches are not immediately apparent. We opt to be optimistic.
With the recent success in modeling an entire organism by using its complete genetic information (Karr et al., 2012), we believe that the gap between evolutionary simulations at the genetic and genomic level (Edgar et al., 2013) and the phenotype or adaptability level will eventually be bridged. The debates that we have seen and predicted, however, will not be settled by such advances, as taxonomic classifications serve the human scientist. With sympathy, we quote the respected Bergey's Manual in our field to end this review, with the hope that it may serve as a reminder to the wider taxonomic audience: "…bacterial classifications are devised for microbiologists, not for the entities being classified. Bacteria show little interest in the matter of their classification. For the taxonomist, this is sometimes a very sobering thought!" (Brenner et al., 2005).
Chi-Chun Ho, Susanna K P Lau and Patrick C Y Woo declare no conflict of interest. This review article does not contain any studies with human or animal subjects performed by any of the authors.
[...] foundation is based on the concept of homology (Hennig, 1950, 1966). Without proper ascertainment of homology, the comparison of apomorphic characters cannot be performed. Clearly, homology since the early days of molecular phylogenetics has been considered an all-or-none phenomenon with no intermediate or fractional levels (Theißen, 2002). The extensive and ongoing horizontal gene transfers therefore centrally challenge the concept of homology at the organism level, far beyond within-species sexual reproduction or the occasional hybridization between species.
Even if we neglect viral evolution, which is characterized precisely by an abundance of recombination (Woo et al., 2009; Lau et al., 2010, 2011b, 2011c; Yip et al., 2011), because it may not be entirely representative of the membrane-bound evolutionary pathways taken by the cellular organisms, the mosaic nature of the eukaryotic (Golding and Gupta, 1995; Gupta, 1998; Ribeiro and Golding, 1998) and even archaeal genomes (Kennedy et al., 2001; Deppenmeier et al., 2002) is disturbing to the cladists. Given that recombination occurs between "homologous" regions in a genome and foreign genetic elements can invade and integrate into existing genes, the absolute demarcation of homology may be compromised even at the gene level. Do clades stand, absolutely, as a phylogenetic backbone supporting an essentially tree-like phylogeny to the point that we can regard the horizontal gene transfer events as signal rather than noise (Abby et al., 2012)? Alternatively, should we modify our understanding of homology and clades, updating our traditional methodologies (McDade, 1990, 1992) to include a probabilistic element that recognizes the non-major, alternative contributions to the evolutionary history of a gene and an organism (Posada and Crandall, 2001)? Or have the ideas of homology and clades been rendered obsolete, so that novel concepts should be introduced to help us understand the somewhat reticulated evolutionary history of the domains of life?
In this review, we examined the transforming influences of cladistics on the classification and taxonomy of cellular organisms. These influences were not the sole result of the cladistic concepts. The advent of molecular biology techniques led to the widespread availability of genetic data, which allowed the genealogy of organisms to be inferred using the concurrently developed methodologies, algorithms and, perhaps most importantly, modern and affordable computers.
The development of high-throughput sequencing in the last decade has led to a quantum leap in the availability of genetic data, as the complete repertoire of genes of microbes and even animals and plants became available all at once. It is a pity that neither Charles Darwin (1809-1882) nor Willi Hennig lived to witness these changes. With the power of high-throughput sequencing technologies, for the first time in history, we are beginning to see the overwhelming diversity of genomic architecture, magnitude of gene flow and genetic variability among extant or even extinct organisms.
Systematic value of electrophoretic data An ancient divergence among the bacteria The root of the universal tree and the origin of eukaryotes based on elongation factor phylogeny On the conceptual difficulties in rooting the tree of life Does the 'Ring of Life' ring true? Homology and analogy. A critical review of the meanings and implications of these concepts in biology Edexcel IGCSE biology Classification of procaryotic organisms and the concept of bacterial speciation. In Bergey's manual of systematic bacteriology Numerical taxonomy of Streptococcus Numerical taxonomy of Haemophilus Root of the universal tree of life based on ancient aminoacyl-tRNA synthetase gene duplications Transantarctic relationships and their significance, as evidenced by chironomid midges Methanococcus igneus sp. 
nov., a novel hyperthermophilic methanogen from a shallow submarine hydrothermal system Trophic controls on stage transformations of a toxic ambush-predator dinoflagellate Evolved RNA secondary structure and the rooting of the universal tree of life Early evolutionary relationships among known life forms inferred from elongation factor EF-2/EF-G sequences: phylogenetic coherence and structure of the archaeal domain Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis Only six kingdoms of life First report of spontaneous intrapartum Atopobium vaginae bacteremia The Ribosomal Database Project: improved alignments and new tools for rRNA analysis Polyphasic taxonomy of the genus Vibrio: numerical taxonomy of Vibrio cholerae, Vibrio parahaemolyticus, and related Vibrio species Genome networks root the tree of life between prokaryotic domains On the origin of species by means of natural selection The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB Tempo, mode, the progenote, and the universal root Evolver: a whole-genome sequence evolution simulator The NCBI Taxonomy database Molecular phylogeny of the animal kingdom Where is the root of the universal tree of life? 
Rooting the ribosomal tree of life Classification of methanogenic bacteria by 16S ribosomal RNA characterization Protein-based phylogenies support a chimeric origin for the eukaryotic genome Fundamentals of molecular evolution The root of the universal tree of life inferred from anciently duplicated genes encoding components of the protein-targeting machinery Ancient phylogenetic relationships Sequence of the 16S ribosomal RNA from Halobacterium volcanii, an archaebacterium Protein phylogenies and signature sequences: A reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes Genomic analysis of the uncultivated marine [...] otes Understanding the adaptation of Halobacterium species NRC-1 to its extreme environment through computational analysis of its genome sequence Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world The net of life: reconstructing the microbial phylogenetic network Deriving the genomic tree of life in the presence of horizontal gene transfer: conditioned reconstruction Evidence for a new root of the tree of life Genome beginnings: rooting the tree of life First report of disseminated Mycobacterium skin infections in two liver transplant recipients and rapid diagnosis by hsp65 gene sequencing Molecular epidemiology of human coronavirus OC43 reveals evolution of different genotypes over time and recent emergence of a novel genotype due to natural recombination Ecoepidemiology and complete genome comparison of different strains of severe acute respiratory syndrome-related Rhinolophus bat coronavirus in China reveal bats as a reservoir for acute, self-limiting infection that allows recombination events Usefulness of the MicroSeq 500 16S rDNA bacterial identification system for identification of anaerobic Gram positive bacilli isolated from blood cultures Typhoid fever associated with acute appendicitis caused by an H1-j strain of Salmonella enterica serotype Typhi Clinical isolates 
of Streptococcus iniae from Asia are more mucoid and beta-hemolytic than those from North America Co-existence of multiple strains of two novel porcine bocaviruses in the same pig, a previously undescribed phenomenon in members of the family Parvoviridae, and evidence for inter- and intra-host genetic diversity and recombination Phylogenetic analysis of carbamoylphosphate synthetase genes: complex evolutionary history includes an internal duplication within a gene which [...] crenarchaeote Cenarchaeum symbiosum Grundzüge einer Theorie der phylogenetischen Systematik Phylogenetic Systematics Phylogenetic systematics Automated pangenomic analysis in target selection for PCR detection and identification of bacteria by use of ssGeneFinder Webserver and its application to Salmonella enterica serovar Typhi Rapid identification and validation of specific molecular targets for detection of Escherichia coli O104:H4 outbreak strain by use of high-throughput sequencing data from nine genomes Complete genome of the uncultured Termite Group 1 bacteria in a single host protist cell Genome of an endosymbiont coupling N2 fixation to cellulolysis within protist cells in termite gut Concepts of homology and analogy The three types of evolutionary process Clades and grades. In Function and taxonomic importance: a symposium A molecular approach in the systematics of higher organisms. 
DNA interactions provide a basis for detecting common polynucleotide sequences among diverse organisms Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes Serum Protein Electrophoresis in Mammals-Taxonomic Implications The application of numerical taxonomy to the separation of colonic inflammatory disease Numerical taxonomy and discriminant analysis applied to non-specific colitis A whole-cell computational model predicts phenotype from genotype The phylogeny of the spirochetes The rooting of the universal tree of life is not reliable Intraspecific gene genealogies: trees grafting into networks SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB Search for a 'Tree of Life' in the thicket of the phylogenetic forest Comparative genomics of mycoplasmas Neuere Probleme der Abstammungslehre: die transspezifische Evolution, 2., stark veränderte Aufl The mosaic nature of the eukaryotic nucleus Molecular classification of living organisms Inferring ancient divergences requires genes with strong phylogenetic signals Use of DNA polymerase I primed by a synthetic oligonucleotide to determine a nucleotide sequence in phage f1 DNA DNA sequencing with chain-terminating inhibitors Harvesting evolutionary signals in a forest of prokaryotic gene trees A phylogenomic approach to microbial evolution The real 'kingdoms' of eukaryotes Numerical taxonomy Numerical taxonomy of Pseudomonas based on published records of substrate utilization Thirty years of numerical taxonomy Principles of numerical taxonomy The evolutionary history of the structure of 5S ribosomal RNA Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments Orthology: secret life of genes Efficiency of a multitest system (Enterotube) for rapid identification of Enterobacteriaceae The molecular clock runs more slowly in 
man than in apes and monkeys Philosophia botanica The root of the tree of life in the light of the covarion model Overview: a phylogenetic backbone and taxonomic framework for procaryotic systematics. In Bergey's manual of systematic bacteriology ARB: a software environment for sequence data Are extreme halophiles actually "bacteria"? Amino-acid sequence of horse heart cytochrome c Cladistic analysis or cladistic classification Biological classification: toward a synthesis of opposing methodologies An approach to the measurement of genetic relatedness among organisms Hybrids and phylogenetic systematics I. Patterns of character expression in hybrids and their implications for cladistic analysis Hybrids and phylogenetic systematics II. The impact of hybrids on cladistic analysis The characterization of enzymatically amplified eukaryotic 16S-like rRNA-coding regions The nature of systematic biology and of a species description Secondary structure of 16S ribosomal RNA The ribosomal database project Sequence of the 16S rRNA gene from the thermoacidophilic archaebacterium Sulfolobus solfataricus and its evolutionary implications The green non-sulfur bacteria: A deep branching in the eubacterial line of descent Phylogenetic relationships among the sulfate respiring bacteria, myxobacteria and purple bacteria Molecular evolution: a phylogenetic approach (Oxford A phylogenetic grouping of the Bacteroides, Cytophagas, and certain Flavobacteria The origin of a derived superkingdom: how a gram-positive bacterium crossed the desert to become an archaeon 16S ribosomal DNA amplification for phylogenetic study The Deinococcus-Thermus phylum and the effect of rRNA composition on phylogenetic tree construction What, exactly, is cladistics? 
Re-writing the history of systematics and biogeography A proposal concerning the origin of life on the planet earth Interpreting the universal phylogenetic tree Phylogenetic structure of the prokaryotic domain: the primary kingdoms The phylogenetic relationships of three sulfur dependent archaebacteria Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya Secondary structure model for bacterial 16S ribosomal RNA: phylogenetic, enzymatic and chemical evidence Phylogenetic placement of the Spirosomaceae The case for relationship of the flavobacteria and their relatives to the green sulfur bacteria The phylogeny of purple bacteria: The alpha subdivision The phylogeny of purple bacteria: The gamma subdivision The flexibacter-flavobacter connection Genomic and experimental evidence for a potential sexual cycle in the pathogenic thermal dimorphic fungus Penicillium marneffei Isolation and characterization of a Salmonella enterica serotype Typhi variant and its clinical and public health implications First discovery of two polyketide synthase genes for mitorubrinic acid and mitorubrinol yellow pigment biosynthesis and implications in virulence of Penicillium marneffei Coronavirus diversity, phylogeny and interspecies jumping Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories Seronegative bacteremic melioidosis caused by Burkholderia pseudomallei with ambiguous biochemical profile: clinical importance of accurate identification by 16S rRNA gene and groEL gene sequencing Identification by 16S ribosomal RNA gene sequencing of an Enterobacteriaceae species from a bone marrow transplant recipient groEL encodes a highly antigenic protein in Burkholderia pseudomallei Usefulness of the MicroSeq 500 16S ribosomal DNA-based bacterial identification system for identification of clinically significant bacterial isolates with ambiguous 
biochemical profiles High diversity of polyketide synthase genes and the melanin biosynthesis gene cluster in Penicillium marneffei Automated identification of medically important bacteria by 16S rRNA gene sequencing using a novel comprehensive database, 16SpathDB Cell-wall-deficient bacteria and culture-negative febrile episodes in bone-marrow-transplant recipients The mitochondrial genome of the thermal dimorphic fungus Penicillium marneffei is more closely related to those of molds than yeasts A comparison of phylogenetic network methods using computer simulation Mitochondrial origins Leuconostocs": an interesting case of a rapidly evolving organism Molecular phylogenetics: principles and practice On the use of nucleic acid sequences to infer early branchings in the tree of life Complete genome sequence of a coxsackievirus A22 strain in Hong Kong reveals a natural intratypic recombination event Phylogenetic origin of the chloroplast and prokaryotic nature of its ribosomal RNA Molecules as documents of evolutionary history