key: cord-0014589-9oksg5mg authors: Hammond, Jocelyn A.; Gordon, Emma A.; Socarras, Kayla M.; Chang Mell, Joshua; Ehrlich, Garth D. title: Beyond the pan-genome: current perspectives on the functional and practical outcomes of the distributed genome hypothesis date: 2020-12-18 journal: Biochem Soc Trans DOI: 10.1042/bst20190713 sha: c6fcaf3067b09d136e0528a88a22d2bc5f91de2f doc_id: 14589 cord_uid: 9oksg5mg The principle of monoclonality with regard to bacterial infections was considered immutable prior to 30 years ago. This view, espoused by Koch for acute infections, has proven inadequate regarding chronic infections as persistence requires multiple forms of heterogeneity among the bacterial population. This understanding of bacterial plurality emerged from a synthesis of what-were-then novel technologies in molecular biology and imaging science. These technologies demonstrated that bacteria have complex life cycles, polymicrobial ecologies, and evolve in situ via the horizontal exchange of genic characters. Thus, there is an ongoing generation of diversity during infection that results in far more highly complex microbial communities than previously envisioned. This perspective is based on the fundamental tenet that the bacteria within an infecting population display genotypic diversity, including gene possession differences, which result from horizontal gene transfer mechanisms including transformation, conjugation, and transduction. This understanding is embodied in the concepts of the supragenome/pan-genome and the distributed genome hypothesis (DGH). These paradigms have fostered multiple researches in diverse areas of bacterial ecology including host–bacterial interactions covering the gamut of symbiotic relationships including mutualism, commensalism, and parasitism. 
With regard to the human host, within each of these symbiotic relationships all bacterial species possess attributes that contribute to colonization and persistence; those species/strains that are pathogenic also encode traits for invasion and metastases. Herein we provide an update on our understanding of bacterial plurality and discuss potential applications in diagnostics, therapeutics, and vaccinology based on perspectives provided by the DGH with regard to the evolution of pathogenicity.

Established paradigms of monoclonality may have adequately described a subset of acute bacterial infections; however, they have been found lacking when dealing with persistent and chronic bacterial infections, including biofilms. Monoclonality implies that colonizing bacteria all belong to the same genotype; however, this is an oversimplified view of most bacterial infections, especially those that are chronic. Traditional diagnostic methods that relied on bacterial cultures helped to entrench these paradigms in medical microbiology. However, newer diagnostic and research tools, such as polymerase chain reaction (PCR), advanced microscopic techniques for metabolic profiling, and whole-genome sequencing (WGS), have demonstrated extensive phenotypic and genotypic diversity among bacterial strains within a population or species [1–3]. Genotypic diversity within infecting bacterial populations suggests these populations are in fact polyclonal, rather than monoclonal; i.e. multiple independent strains of the same bacterial species are present in these populations at the same time [1, 3–7]. This diversity also implies that even mono-species bacterial populations are neither phenotypically nor genotypically clonal [1]. Genetic diversity within a prokaryotic species incorporates two distinct phenomena: (1) genetic heterogeneity, i.e. individuals in a population possess different alleles (variant forms of the same gene); and (2) genomic plasticity, i.e.
individuals in a population possess different genes [1, 8]. The rubric of genomic plasticity thus teaches that the genome of one strain within a species cannot account for the entire set of genes for the species as a whole, and that studies of individual isolates will greatly underestimate the biological properties of the species. Therefore, the genomes of multiple independent isolates need to be fully characterized to even estimate the total biological complexity of a bacterial species [8–16]. Collectively, the genes contained in the genomes of all individuals within a species comprise the supragenome (or pan-genome) of that species [8, 9, 17]. Multiple mathematical models have been developed to estimate the species-level supragenome based on the sequencing of a limited number of strains chosen to be unclustered with respect to geographic location and phenotypic properties [13, 14, 18, 19]. The concepts of genomic plasticity and the supragenome/pan-genome are integral components of the distributed genome hypothesis (DGH), which posits that not all individuals in a population have the same set of genes, and that no one individual strain has all the genes of a species. Individual members of the population have 'access to' a species supragenome from which genetic material can be taken up and recombined as a means of generating genomic diversity [1, 2, 7]. This genomic plasticity, propagated by horizontal gene transfer (HGT), is highly conserved among bacterial species. There are multiple examples wherein HGT has been shown to be the mechanism that has provided for bacterial adaptation, survival, and evolution through the transfer of genes that provide for adaptation to various environmental pressures including host immune surveillance, antibiotic resistance, and changes in pH [20–26].
Obviously, vertical transmission of the core genome is required to maintain species viability and to provide a framework within which HGT mechanisms operate. In this review, we provide an updated perspective with regard to bacterial genomic plasticity, and how HGT mechanisms together with a large, species-level supragenome provide for bacterial adaptation. In addition, we show how these new understandings provide for targeted clinical interventions that have the potential to fundamentally change how we approach microbial infections such that it should be possible to 'surgically' eliminate highly pathogenic strains without undue damage to the host's microbiome.

Bacterial genomic plasticity

Bacterial genomic plasticity is characterized by each strain in a population possessing a unique set of distributed/accessory genes from the population supragenome [1, 7, 13, 15]. We have demonstrated high degrees of genomic plasticity within multiple species that underlie the observation that these species have both pathogenic and commensal strains; these include Haemophilus influenzae [10, 14, 19], Pseudomonas aeruginosa [12, 27, 28], Streptococcus pneumoniae [11, 18–20, 29], and Moraxella catarrhalis [30, 31], among others. Genomic diversity results from a balance between genome expansion (addition of genes) and genome reduction (deletion of genes) [14, 15]. Our analyses comparing type strains of both H. influenzae (Rd) and S. pneumoniae (Sp6) to clinical isolates (of each of their respective species) revealed only minor changes in genome size, but more than 200 INDELS per genome-pair comparison (on average), indicating that there is very extensive recombination occurring [14, 29]. Genomic plasticity is generated via three types of horizontal gene transfer (HGT) processes: (1) transformation; (2) conjugation; and (3) bacteriophage-mediated transduction (Figure 1).
A large proportion of novel bacterial genes arise from DNA duplication and other DNA-modifying processes [15] , or via acquisition from mobile genetic elements (e.g. transposons, insertion elements, etc.) and transferable plasmids all of which provide genomic fodder for the evolution of novel functions as they are not associated with gene loss. Vertical gene transfer is the standard mechanism by which a mother cell replicates its entire DNA complement and passes identical (or nearly identical) copies of each chromosome and plasmid/episome (an episome is usually described as a plasmid that is able to integrate into the main bacterial chromosome) to each daughter cell during cell division [7] . In contrast, the HGT processes involve unidirectional gene movement between two, often unrelated, bacterial cells in which one or more blocks of donor chromosomal DNA (range: several hundred bases to >100 Kb) (and/or plasmid and episomal DNAs in the case of conjugation) are transferred into the recipient cell resulting in either the partial replacement of the recipient bacterium's chromosome or the acquisition of a new replicon [1, 7] . Mell et al. [46] using high molecular mass DNA for in vitro transformation of H. influenzae found that the mean recombination tract length was 8.1 ± 4.5 Kb. Hiller et al. [20] in a study of S. pneumoniae evolution in situ during a polyclonal pediatric infection found 23 transformation events ranging in size from 0.4 to 235 Kb, with a mean size of 28 Kb and a median size of 13 Kb. It is important to understand that the cell from which the transferred DNA comes does not necessarily have to be viable at the time of transfer (and never is in the case of transduction). Each of the HGT mechanisms can occur between different strains of the same species and also between related species. Conjugation, however, particularly when it involves episomal transfers, can also occur between very divergent species that exist in different phyla [32, 33] . 
Transformation rates decline rapidly with increased levels of genetic heterogeneity as they rely on homologous recombination machinery within the recipient cell [34–37]. Transduction is based on the host-range of the infecting temperate bacteriophage and is, therefore, usually confined to a single species or closely related sister species. Some bacterial species use a single HGT process, while others use two or even all three. For example, Staphylococcus aureus principally relies on transduction, and accordingly has a much smaller pan-genome than many other bacterial species [19, 38]. In contrast, Escherichia coli uses both mating [39] and transduction [40, 41], whereas H. influenzae utilizes all three HGT processes [37, 42–44]. Mating and transformation are active processes, requiring that the donor and recipient organisms live in close proximity to one another. Both processes require substantial energetic expenditures by one or the other of the participating bacteria. Thus, either the donor or the recipient bacteria, and indirectly the bacterial population engaging in these processes, must receive an evolutionary advantage from these processes, or they would not persist.

Figure 1. (A) Bacterial cell with a single circular chromosome in blue; (B) vertical transmission from mother to daughter cells with the daughter cells being genomically identical with each other and the mother cell; (C) horizontal gene transfer methods with transferred or transferrable DNA signified by red shading: (i) transformation, foreign DNA is picked up from the environment by competent bacteria and is then recombined into the bacterial chromosome via homologous recombination; (ii) conjugation, transfers either a plasmid (episome) or chromosomal DNA via a pilus from a 'male'-parasitizing bacterium to a recipient; (iii) transduction, transfers foreign bacterial DNA via a phage using nonhomologous recombination.

Any mechanistic process that requires energetic expenditures must, by definition, provide an evolutionary advantage for its continued propagation. In the case of mating, the primary energy expenditure is via the male (DNA-donating) bacterium. This process has been described as parasitic and an example of selfish genes ensuring their own propagation [45]. This is because the genes for the mating apparatus are parts of transposable elements (either chromosomally or episomally located) that collectively ensure their own horizontal propagation, as well as the transfer of physically adjacent genes that have been selected for their ability to provide survival advantages to the host bacteria in stressful environments. Thus, mating often also results in the transfer of genes that enhance the survival of the recipient, demonstrating that a singular process can have multiple ecological outcomes. In the mating process, two live bacteria are joined temporarily by a pilus (or similar structure) through which one bacterium sends a copy of its DNA (chromosomal and/or episomal) into the other bacterium. Therefore, the evolutionary advantage accrues to both the donor (selfish gene) as it propagates itself, and the recipient as it gains genes enabling survival such as those encoding resistance to antibiotics and heavy metals. For these reasons, we have referred to these HGT processes as population-level virulence (or survival) traits [3]. Similarly, the benefit of transformation accrues to the recipient and the population. In the case of competence and transformation, the energy expenditure is via the recipient bacterium as it takes up DNA from its environment (competence) and homologously recombines the exogenous DNA into its chromosome (transformation). This changes the genotype and phenotype of the cell, which may provide a selective advantage in a stressful environment.
Multiple transfers distributed around the chromosome may occur during a single competence event [7, 20, 46–48]. Competence, the metabolic state of being able to take up foreign DNA from the environment into the cell, is triggered by nutrient limitation or other stress conditions as part of the bacterial cell's SOS response. One of the molecular triggers of competence among the Streptococci, and related gram-positive bacteria, is the production of a quorum-sensing pheromone, competence stimulating peptide (CSP). CSP is a seventeen-amino-acid peptide that serves as an intercellular activating signal leading to the expression of the ComABCDE regulon that encodes genes that control and produce the cellular machinery necessary for competence and transformation. In S. pneumoniae there are several CSP alleles that divide the species into specificity groups as each CSP has a corresponding receptor (ComD) which is specific for a given CSP peptide. In some cases, particularly among the Streptococci and Vibrio spp., competence is also an auto-parasitic process, as the first bacteria in a population to become competent kill their neighbors to ensure a source of DNA for transformation [49–52]. Finally, viral transduction results when a lysogenic (temperate) bacterial virus, or bacteriophage, excises itself from the host genome and inadvertently takes some of the host's genes along with its own genes and then reinserts these genes into the recipient's genome when it establishes a lysogenic state [1, 7, 15]. Through transduction (specialized or generalized), bacteriophages play a major role in HGT, especially regarding antibiotic resistance, virulence factors, and invasion-related functions [13, 15]. Viral transduction is a bacterially passive process for both the donor and recipient bacteria (in that it results from viral infection), which can result in bacterial species living in different environments exchanging genetic material [7, 13, 15].
DNA exchange via these HGT mechanisms provides for diversity generation within a bacterial population; thus, they provide the same advantages as sex in eukaryotic organisms and can be thought of as analogous processes [53, 54]. Recently Colnaghi et al. [55] demonstrated that eukaryotic sex developed to provide the same population-level benefits as HGT in prokaryotes, including protection from Muller's ratchet, because as genome size increased there was a need for increases in recombination length. Without the ability to rapidly generate diversity, all individuals within the population would have the same fitness level with respect to environmental challenges, which would decrease a population's chance of survival during times of environmental change. Moreover, the lack of HGT would result in the bacterial equivalent of in-breeding as there would be no mechanism to replace mutant loss-of-function alleles; this would, according to Muller's ratchet, result in their eventual extinction caused by the accumulation of slightly deleterious mutations via genetic drift [56]. The universality of the need for genetic recombination was recently reinforced when it was found that there is extensive HGT among metazoan 'asexual' bdelloid rotifers. The authors of this work, and those commenting on its implications, have clearly and unequivocally construed HGT as the equivalent of sex among these metazoans, and used it to explain their long-term survival as it creates diversity, preventing evolutionary bottlenecks associated with the lack of sex [57, 58]. Thus, bacterial HGT mechanisms help to ensure that the population survives during periods of both environmental and nutritional challenge [7] by providing an 'evolutionary shortcut' that enables organisms to quickly adapt to a changing or new environment [13] and by providing for replacement of mutant alleles [56, 59].
HGT can also induce/produce major lifestyle changes for recipient cells and permit radiation into different ecological niches [15] such as occurs when commensal organisms within a holobiont [60] [61] [62] acquire virulence genes and become pathogens. Taken together, these observations suggest that evolutionary pressures select for mechanisms that generate diversity [7] . A species' supragenome contains three different sets of genes: (1) the core genes, which are found in all strains of the species; (2) the distributed/accessory genes, which are found in a subset of strains of the species; and (3) the shell/unique genes, which are found in only a very small fraction of the strains of the species. Thus, in addition to the set of core genes, each strain has its own unique set of noncore genes from the supragenome [1, 9, 63] . The core genome consists of 'essential' genes responsible for the basic aspects of a species' metabolism and major phenotypic traits [13, 15] , including genes for housekeeping functions, such as energy production, amino acid metabolism, nucleotide metabolism, lipid transport, and translational machinery [48] . In contrast, the noncore (distributed, accessory, or adaptive) genome includes genes encoding for supplementary or modified biochemical functions that may be useful in contexts other than basic survival, such as adaptation to new environments, antibiotic resistance, or colonization of new environments and hosts [13, 15] . Being noncore implies that these genes can be deleted from the genome, but such deletions may result in the loss of important phenotypic traits, such as the ability to grow on nontraditional nutrients and substrates, virulence and antibiotic resistance [13, 15] . 
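
The three-way partition of a supragenome described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the pipeline used in the cited studies: each strain is represented as a set of gene identifiers, the function name `classify_genes` is hypothetical, and the `unique_max_fraction` cutoff for "a very small fraction of the strains" is an arbitrary illustrative value.

```python
from collections import Counter


def classify_genes(strain_genes, unique_max_fraction=0.1):
    """Partition a supragenome into core, distributed, and unique genes.

    strain_genes: dict mapping strain name -> set of gene identifiers.
    unique_max_fraction: illustrative (assumed) cutoff; genes carried by at
        most this fraction of strains, or by a single strain, count as unique.
    """
    n_strains = len(strain_genes)
    # Count how many strains carry each gene.
    counts = Counter(g for genes in strain_genes.values() for g in genes)

    core, distributed, unique = set(), set(), set()
    for gene, k in counts.items():
        if k == n_strains:
            core.add(gene)          # present in every strain
        elif k == 1 or k / n_strains <= unique_max_fraction:
            unique.add(gene)        # present in a very small fraction
        else:
            distributed.add(gene)   # present in a subset of strains
    return core, distributed, unique


if __name__ == "__main__":
    toy = {
        "A": {"g1", "g2", "g3", "g5"},
        "B": {"g1", "g2", "g4"},
        "C": {"g1", "g2", "g3"},
        "D": {"g1", "g3", "g6"},
    }
    core, distributed, unique = classify_genes(toy)
    print("core:", sorted(core))
    print("distributed:", sorted(distributed))
    print("unique:", sorted(unique))
```

In practice the gene sets would come from an orthologue-clustering step, and the choice of cutoff changes how many genes fall into the unique class.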
Early studies demonstrated that the majority of noncore genes within a species supragenome have been evolving with the core genes of that species [64] and that many of the unannotated distributed genes are associated with survival in different environments [25, 65]. The noncore genes also include those with parasitic functions, the selfish genes. These include those that promote their own transfer and propagation [66] as well as those that run a 'protection racket', the toxin-antitoxin genes [67]. WGS of multiple strains of multiple bacterial species has demonstrated that, with each new strain sequenced, multiple new genes are found. The number of novel genes added per genome ranges from many hundreds at the start of a pan-genome project to just a handful as the growth becomes asymptotic [9–14, 16, 19, 20, 31, 68]. These early observations led to the recognition that in most cases thousands of genomes would be needed to fully describe the supragenomes/pan-genomes of many bacterial species [13]. Recently Park et al. [38] performed pan-genome analyses on seven pathogenic bacterial species using data downloaded from some 27 000 genomes from the NCBI prokaryotic genome database. They calculated pan-genome sizes of 22 000 genes for S. aureus and 128 193 genes for E. coli, with the others being intermediate. It is interesting to note that even the smallest of these bacterial pan-genomes contains as many genes as the human genome based on recent estimates [69]. Our laboratory's recent sequencing of more than 2000 H. influenzae clinical isolates has demonstrated that we have yet to fully characterize this species' supragenome, although we are getting very close as our recent analyses have not identified any novel MLST types (unpublished observations); moreover, the number of novel genes identified with each additional strain sequenced is now in the single digits.
This analysis includes specimens from all six permanently inhabited continents, and from patients with every known disease type with which H. influenzae has been associated; therefore we think the probability is high that there are very few widely divergent genotypes that have not been characterized, and that the number of additional novel genes to be identified is relatively small compared with the number that has been found. With this knowledge, the best means to approximate a bacterial species' genome in terms of gene numbers are mathematical models, including the Pan-Genome Model [13, 18] and the Finite Supragenome Model [14, 16, 19], both of which have proven to be quite accurate in predicting the total number of genes within a species using data from WGS of a few independent isolates. As no two strains in a species contain the same complement of genes (per the DGH), collectively the species' supragenome/pan-genome is often quite large and can actually exceed the number of genes in mammalian genomes. For example, as noted above, there are over 100 000 genes in the E. coli supragenome. However, it is also easy for the prokaryotic cell to undergo genomic deletions, disposing of unnecessary or deleterious genes [15]. The size of a species' supragenome/pan-genome, relative to its core genome size, is highly variable among bacterial species. In an analysis of 295 species-specific supragenome projects published from 2005 to 2019, the supragenome was found to be substantially larger than the core genome in essentially all cases, with the core genome making up from <20% to >60% of the supragenome [48] (Figure 2). It is interesting to note that both genome and supragenome size are associated with a species' biology.
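
The diminishing yield of novel genes per added genome, and the core-to-supragenome fraction discussed above, can be illustrated with a simple accumulation calculation. This is a descriptive sketch only; it is not an implementation of the Pan-Genome Model or the Finite Supragenome Model, and the function names and the set-per-strain input format are assumptions made for illustration.

```python
import random


def accumulation(strain_genes, order):
    """Track gene discovery as genomes are added in the given order.

    Returns three lists, one entry per genome added:
    novel genes contributed, cumulative supragenome size, and core size.
    """
    pan, core = set(), None
    novel, pan_sizes, core_sizes = [], [], []
    for name in order:
        genes = strain_genes[name]
        novel.append(len(genes - pan))   # genes not seen in earlier genomes
        pan |= genes                     # supragenome grows by union
        core = set(genes) if core is None else core & genes  # core shrinks
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return novel, pan_sizes, core_sizes


def mean_novel_genes(strain_genes, permutations=100, seed=0):
    """Average the novel-gene count at each step over random genome orders."""
    rng = random.Random(seed)
    names = sorted(strain_genes)
    totals = [0] * len(names)
    for _ in range(permutations):
        order = names[:]
        rng.shuffle(order)
        novel, _, _ = accumulation(strain_genes, order)
        totals = [t + x for t, x in zip(totals, novel)]
    return [t / permutations for t in totals]


if __name__ == "__main__":
    toy = {
        "A": {"g1", "g2", "g3", "g5"},
        "B": {"g1", "g2", "g4"},
        "C": {"g1", "g2", "g3"},
        "D": {"g1", "g3", "g6"},
    }
    novel, pan_sizes, core_sizes = accumulation(toy, ["A", "B", "C", "D"])
    print("novel genes per genome:", novel)
    print("core fraction of supragenome:", core_sizes[-1] / pan_sizes[-1])
```

Averaging over random orderings, as `mean_novel_genes` does, removes the dependence of the curve on the order in which genomes happen to be added; a curve that flattens toward zero novel genes per genome corresponds to the asymptotic behaviour described in the text.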
Free-living environmental bacterial species tend to have the largest genomes 4-12 megabases (Mb); with commensal and pathogenic bacteria having intermediate size genomes 1.5-4 Mb; and obligate and intracellular pathogens having the smallest genomes 0.6-1.5 Mb. These rules, however, are not hard and fast as there are exceptions as not all pathogens have a reduced genome size [48] . In summary, the DGH provides a theoretical framework for understanding bacterial genomic plasticity and the supragenome/pan-genome are the functional constructs that embody the genic diversity that the DGH predicts. The DGH states, with respect to chronic bacterial pathogens, that they utilize a survival strategy wherein a majority of their genes are distributed among a population and are not found in all members of a species; thus there exists a supragenome at the population level which is greater than the genome of any one organism and that this distribution of genes among a population serves as a population-level virulence factor [3] that provides for improved population survival through continual HGT mechanisms which provide for rapid adaptation to environmental conditions through the reassortment of genes (and alleles). Moreover, the set of genes in a species' supragenome can expand through the introduction of genes via inter-species exchange or via gene duplication and the evolution of paralogous functions within a species. This is not to say that all genes and all different gene combinations will lead to correspondingly large phenotypic differences as multiple genes can often provide the same functionality; moreover, it is quite possible that some genes that enter a genome via HGT mechanisms may not be functional in that genome due to the need for other corresponding gene products or signaling pathways that are not present in the new host. 
With those caveats, the shuffling of noncore genes and alleles (different forms of the same gene) of core genes generates new combinations that are subsequently subjected to the forces of selection. Genomic plasticity represents a successful strategy for bacteria to adapt, survive, and evolve.

Figure 2. The area of the inner circle for each species represents the size of the core genome relative to the supragenome, which is represented by the entire area of the circle. All areas are representative of the actual percentage differences between the core and supragenomes. Starting at the lower left with Mycobacterium tuberculosis and moving clockwise through Staphylococcus aureus, Corallococcus coralloides, Escherichia coli and Pseudomonas aeruginosa, the relative size of the core genome decreases compared with the supragenome.

In the 15 years since the initial characterizations of bacterial supra/pan-genomes in both Gram-negative and Gram-positive species [9, 10], the integration of the DGH with our parallel increases in understanding of microbial ecology has revolutionized thinking about many aspects of bacteriology. These include: (1) basic bacterial biology, e.g. their possession of a life cycle, evolution, population genetics, and taxonomy/phylogeny; (2) host–bacterial interactions both at the species and microbiome levels, encompassing bacterial colonization, persistence and invasion into host tissues; and (3) clinical applications, e.g. the diagnosis, treatment, prevention, and epidemiology of bacterial infections [1, 7, 13, 15, 63]. It is important in these contexts to make the distinction between the bacterial meta-genome, which refers to the collective genomes of multiple species assemblages, i.e. microbiomes, and the supra/pan-genome, which refers to the collective genome of a single species, or population of that species within a particular microbiome. Here, we discuss several research applications from the current literature using these perspectives as theoretical bases.
The DGH teaches that each individual strain within a species, and even within a polyclonal bacterial population, possesses a unique complement of distributed genes. This means that each strain differs in gene possession from all other strains. Thus, it follows in the case of pathogens that each strain has a unique set of heritable traits with regard to antigen presentation to the host; virulence, including antibiotic resistances and serum resistances; and tissue tropisms, all of which contribute to a strain's pathogenicity, affecting its ability to colonize, persist in the face of antimicrobial therapy, invade cells and tissues, metastasize to distant sites via systemic spread, and evade or disarm various aspects of the host's innate and adaptive immune systems [1, 7, 64]. Based on the DGH we developed a comparative and functional genomics program that has provided the data establishing the veracity of the postulate that the genetic determinants of virulence and antibiotic resistance are unique to each strain for multiple pathogenic bacterial species (reviewed in [48]). These include: the Gram-negative pathogens H. influenzae [14, 19, 62, 65, 70–73], M. catarrhalis [30, 31], P. aeruginosa [12, 27, 28], and Burkholderia cenocepacia [68]; and the Gram-positive pathogens S. pneumoniae [19, 20, 29, 74–77], S. aureus [19], and Gardnerella vaginalis [16]. This information can be exploited for the development of targeted prevention and treatment strategies [62, 65, 71]. We first proposed the use of SGWAS (SupraGenome-Wide Association Studies) for the identification of bacterial virulence and tropism genes over a decade ago [62].
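
The kind of gene-possession association testing that SGWAS involves can be sketched in a few lines. The example below is a minimal illustration and not the method used in the studies cited here: it assumes a simple case/control phenotype, tests each gene's presence/absence with a two-sided Fisher's exact test (a stand-in chosen for simplicity), and controls the false discovery rate with the Benjamini-Hochberg procedure. The function names and the toy input format are hypothetical, and no correction for population structure is attempted.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: gene present / gene absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x gene-positive cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def sgwas(strain_genes, phenotype):
    """Test every gene for association with a binary phenotype.

    strain_genes: dict strain -> set of gene identifiers.
    phenotype: dict strain -> bool (True = case, e.g. invasive isolate).
    Returns (gene, p-value, adjusted p-value) tuples sorted by p-value.
    """
    genes = sorted(set().union(*strain_genes.values()))
    cases = [s for s in strain_genes if phenotype[s]]
    controls = [s for s in strain_genes if not phenotype[s]]
    pvals = []
    for g in genes:
        a = sum(g in strain_genes[s] for s in cases)
        c = sum(g in strain_genes[s] for s in controls)
        pvals.append(
            fisher_exact_two_sided(a, len(cases) - a, c, len(controls) - c)
        )
    qvals = benjamini_hochberg(pvals)
    return sorted(zip(genes, pvals, qvals), key=lambda r: r[1])
```

Genes whose adjusted p-value falls below a chosen threshold would be candidates for follow-up; with real collections, lineage effects can produce spurious associations, so a test of this kind is only a starting point.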
SGWAS, as with GWAS, can analyze large numbers of genetic variants (including gene possession variants) to test for a statistical association between each variant and a phenotype of interest [78], being careful to apply methodologies that account for multiple comparisons, such as the Bonferroni correction or the Benjamini-Hochberg method, to decrease the false discovery rate. Since that time, we and others have used SGWAS as a valuable tool to identify specific bacterial genes that give rise to specific phenotypic traits [65, 71, 78, 79]. To date, bacterial GWAS studies have principally focused on identifying genes that are associated with clinically relevant phenotypes, such as virulence and antibiotic resistance [62, 65, 71, 78, 79] (Table 1). Recently, we have conducted SGWAS studies using multiple advanced algorithms including machine-learning approaches. In Lee et al. [68] we conducted an SGWAS using Spearman rank correlation to study 215 B. cenocepacia strains isolated from 16 CF patients and observed recurrent loss-of-function mutations that were associated with decreases in biofilm formation. In two random-forest-based machine-learning SGWAS studies, both involving large numbers of H. influenzae strains (the first with >1600 genomes and the second with >200 genomes), that were performed to associate clinical provenance with gene presence/absence, we identified a preponderance of unannotated genes among the most important classifiers. This finding is of immense theoretical importance as it teaches us two important lessons: first, that these unbiased methods can point us to specific genes within the enormous background of genomic 'dark matter' that are relevant clinically; and second, that their examination is highly likely to lead to novel biology, as we previously demonstrated [65]. In that study, we applied statistical genetic analysis methods to clinical meta-data on large numbers of H.
influenzae strains from which we identified multiple unannotated genes that were associated with virulence. Characterization of one of these genes, which we named msf1, demonstrated that it was a major virulence factor providing for invasion and survival in human macrophages, and also increased trafficking of H. influenzae to the brain in the Chinchilla lanigera model of otitis media, leading to increased morbidity and mortality.

Microbiomes are polykingdom communities, often including bacteria, archaea, fungi, protozoa, and viruses, which colonize particular environments in or on animal bodies (skin, oral cavity, gastrointestinal tract, respiratory tract, urogenital tract, etc.), higher plants, soils and other terrestrial and aquatic environments. The microbiome can also be thought of as the combined genetics and metabolic capabilities of the community of organisms. For the host/holobiont, defined as host-microbiota symbioses [60–62], individual members of the microbiome can display a wide range of symbiotic relationships, from mutualism to commensalism to parasitism to pathogenicity. Microbes inhabiting the human body (1) play roles in multiple important physiological functions, such as digestion, metabolism and immunity; (2) vary according to body site; and (3) in most cases establish an equilibrium with healthy hosts [80, 81]. When this homeostasis is disrupted by an overgrowth of pathogenic microorganisms, by a lack of sufficient numbers of mutualistic or commensal microorganisms at particular body sites, or by expression of accessory genes triggered by some environmental stimulus, the consequences of the resulting microbial dysbiosis are malfunctioning physiological processes and perhaps ultimately disease [82–86]. Thus, understanding host-microbiome interactions provides insights into disease diagnoses, treatment, and prevention.
Disturbances in microbiomes at various human body sites have been linked to the development of various traits and diseases, including weight gain, obesity, inflammatory bowel disease, diabetes, liver cirrhosis, cardiovascular disease, rheumatoid arthritis, cancer, depression, autism, asthma [80, 84], and even premature birth and stillbirth [86, 87]. Treatments involving the human oral and gut microbiomes include probiotics, which have become popular with the public in recent years. More recently, microbiome transplantation (although it has been practiced in China for well over a thousand years) has also captured the public imagination. It is still largely in the experimental stages, but it has produced promising results as a potential therapy for diseases caused, or exacerbated, by microbial dysbiosis [88]. Transplantation of the intestinal and skin microbiomes has been used to treat a range of diseases. Fecal microbial transplants (FMT) have shown promising results in patients with obesity, C. difficile infections, and ulcerative colitis [89, 90]. There are ongoing trials to measure the potential therapeutic effects of FMT in a host of other diseases; however, the benefits have yet to be determined [91].
Though some benefits of FMT have been shown, there have also been adverse effects, such as post-transplant obesity and infection [92], and even death following infection [93]. Hopefully, with a better understanding of strain-specific genetic profiles, donors could be more carefully selected, and laboratory-built mock fecal microbiomes more strategically constructed, to avoid adverse effects owing to pathogenic distributed genes in strains from donor samples. This could be accomplished using a combination of 16S microbiome analyses with metagenomic sequencing [94, 95] to provide data on species/strain composition and gene content, which could be used to aid in the creation of a safer fecal transplant.

Table 1 (columns: organism | number of strains/genomes | phenotype | reference)
Staphylococcus aureus | 75 | Antibiotic resistance | [170]
Streptococcus pneumoniae | 3701 | Antibiotic resistance | [171]
Plasmodium falciparum | 1063 | Antibiotic resistance | [172]
Staphylococcus aureus | 90 | Virulence | [173]
Staphylococcus epidermidis | 83 | Virulence | [174]
Burkholderia cenocepacia | 215 | Biofilm formation | [68]
Haemophilus influenzae | 220 | Virulence | [65, 71]
Mycobacterium tuberculosis | 498 | Antibiotic resistance | [79]

While the gut microbiome has been investigated extensively, the skin microbiome has been the focus of more recent research. The skin is colonized by a large number of diverse microorganisms, of which most are beneficial or harmless [96, 97]. The composition of the abundant species is relatively stable over time, although it varies with anatomical site. However, skin-associated diseases, such as acne vulgaris, eczema, psoriasis, and dandruff, are associated with strong and specific microbiome alterations.
Thus, manipulation of the skin microbiome holds promise as a novel therapeutic approach for these diseases (e.g. [98, 99]). Paetzold et al. [100] used mixtures of different skin microbiome components to alter the composition of recipient skin microbiomes and showed that, after sequential applications of donor microbiomes, recipient microbiomes became more similar to those of the donors. As the degree of engraftment depended on the recipient and donor microbiome composition, applied bacterial load, and application site, these parameters will need to be explored more fully in future experiments.

While HGT drives the evolution of many virulent and drug-resistant bacterial strains that contribute to increasing levels of morbidity and mortality [7, 15], studies of pathogen supragenomes/pan-genomes help to identify distributed genes that could serve as biomarkers of virulence and, perhaps more importantly, as targets for precision medicine-based treatments and preventions [15, 65]. For species-level diagnostic purposes the core genome can be utilized, as it contains genes possessed by every member of the species; however, the noncore (distributed/accessory) genes are needed to identify strains with specific phenotypes if we wish to target specific populations for intervention (since different genes and gene combinations produce different disease phenotypes and tissue tropisms) [48, 65]. For global prevention and treatment, core genes can be used to target an entire species, whereas targeting distributed genes allows for selective strain targeting, ensuring that only strains containing the gene of interest are affected (the microbiome-sparing approach) [48, 65]. As a technical note, the risk of misclassifying genes as core or distributed/accessory has decreased considerably as the field moves to long-read genomic sequencing methodologies such as PacBio and Oxford Nanopore.
These methodologies, which have been employed for the last half-dozen years, routinely provide closed circular genomes directly from the initial sequencing run. Thus, they have eliminated the need to start genome assemblies with alignments to a reference genome, which can be problematic when there is extensive genomic plasticity, and instead provide high-quality sequences for de novo assemblies [25, 68, 73].

Antibiotics have been considered the standard of care for the treatment of bacterial infections caused by drug-susceptible organisms since World War II. However, this situation is changing due to the worldwide spread of antibiotic resistance driven by widespread use (and misuse) of antibiotics [101, 102]. This emergence of drug-resistant bacterial pathogens has led to a decline in the efficacy of traditional antimicrobial therapies and has greatly limited the repertoire of antibiotics available to effectively treat patients [101, 102]. The rapid rise in multi-drug-resistant bacteria is a direct consequence of natural selection operating on HGT-driven mechanisms of gene exchange, whereby multiple antibiotic resistance genes have become clustered together with the genes that promote conjugation. Thus, a single gene transfer event can provide for the survival of the recipient in the face of combination antibiotic therapy. Much of the rise in antibiotic resistances (both individual resistances and multi-drug resistances) could have been avoided, or at least greatly delayed, by the employment of common-sense guidelines for treatment. Easily the greatest mistake made by antibiotic stewards worldwide for most of the last 75 years was their slavish adherence to the 'one antibiotic at a time' doctrine. On the face of it, this was a statistically flawed approach from the start.
If resistance to a given antibiotic were to arise spontaneously in one in 10^7 bacteria (a widely agreed upon rate), then an infection with 10^9 bacteria (not an unreasonable number) would produce 100 resistant bacteria that would survive and could go on to colonize and infect other individuals, which is precisely what has occurred over and over again. If, on the other hand, the patient had been treated with two (or, even better, three) different antibiotics with non-overlapping mechanisms of action, then producing a doubly resistant organism would require an infection with 10^14 bacteria (10^7 × 10^7 = 10^14), a number far greater than that in any infection, even during sepsis; and producing a triply resistant organism would require a starting population of 10^21 bacteria, which is equivalent to 1000 metric tons of bacteria! Unfortunately, it is now too late to universally adopt such an approach and have a uniformly positive patient outcome. This is because our past mistakes, caused largely by an unawareness of the teachings of the DGH, have resulted in the evolution of transferable plasmids and transposons that not only promote HGT of their own core dispersal genetic machinery (selfish genes), but also include multiple genetically encoded resistances to antibiotics. Thus, a single HGT event will result in the formation of a new multidrug-resistant bacterial strain [103, 104]. However, one need only look at HAART (highly active anti-retroviral therapy) for HIV to understand the utility of such a multi-target approach to pathogen treatment. The simultaneous targeting of multiple HIV enzymatic functions, in the late 1990s, changed HIV-1-related disease from a near-certain death sentence to a treatable chronic disease.

Historically, antibiotics have been identified by screening natural compounds for their ability to kill bacteria grown in vitro.
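The combination-therapy arithmetic given earlier in this passage can be checked with a short calculation. The sketch below is illustrative only: it takes the per-drug spontaneous resistance rate of one in 10^7 and the 10^9-cell infection from the text, assumes that resistance to each drug arises independently, and assumes a per-cell mass of roughly one picogram (a figure not stated in the text) to convert cell numbers into mass. The function names are hypothetical.

```python
from fractions import Fraction

# From the text: spontaneous resistance arises in 1 in 10^7 cells per antibiotic.
RESISTANCE_RATE = Fraction(1, 10**7)
# Assumption (not from the text): one bacterial cell weighs about 1 picogram.
CELL_MASS_GRAMS = Fraction(1, 10**12)
GRAMS_PER_METRIC_TON = 10**6

def expected_resistant(population, n_drugs):
    """Expected number of cells resistant to all n_drugs simultaneously,
    assuming resistance to each drug arises independently."""
    return population * RESISTANCE_RATE ** n_drugs

def population_for_one_mutant(n_drugs):
    """Population size at which one fully resistant cell is expected."""
    return int(1 / RESISTANCE_RATE ** n_drugs)

def mass_in_metric_tons(population):
    """Approximate mass of a bacterial population under the 1 pg/cell assumption."""
    return population * CELL_MASS_GRAMS / GRAMS_PER_METRIC_TON

print(expected_resistant(10**9, 1))      # single-drug therapy, 10^9-cell infection
print(population_for_one_mutant(2))      # cells needed for one doubly resistant mutant
print(population_for_one_mutant(3))      # cells needed for one triply resistant mutant
print(mass_in_metric_tons(population_for_one_mutant(3)))
```

Under these assumptions the calculation reproduces the figures in the text: 100 resistant cells under monotherapy, 10^14 and 10^21 cells for double and triple resistance, and about 1000 metric tons for the latter. As the text goes on to note, the independence assumption fails once resistance genes are linked on a single transferable element.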
Recently, genome sequencing and supragenomic characterization of microorganisms have enabled the collection of detailed information regarding the physiological repertoire of entire microbial species (vide supra). This has led to a shift in the discovery of novel anti-microbials from an empirical approach to a knowledge-based approach based on specified targets. In these 'target-based' approaches, potential drugs are identified or, preferably, designed using in silico modeling algorithms [105, 106] that take advantage of structural information on a target molecule that has been predefined by its role in a key metabolic process. Comparative metabolomics [107] combined with comparative genomics [7] can be used to identify genes essential for pathogen survival and pathogenicity, which are then tested as targets of specific compounds derived from large chemical libraries [105, 106].

Aside from the antibiotic resistance crisis, research has revealed the harmful effects of broad-spectrum antibiotic therapy on the community structure of beneficial host microbiota, which in turn can have negative effects on long-term host health [102]. To combat these threats, the microbiome-sparing approach aims to modify or replace broad-spectrum antibiotics with precision anti-microbials that selectively target and remove pathogenic strains while leaving the community structure of the surrounding microbiota unchanged [102]. These new therapeutic strategies include the development of anti-virulence compounds that inhibit specific pathogenesis and persistence traits encoded by distributed genes in the targeted strains, and which therefore spare other strains of the same species that do not encode such virulence genes.
As such, these strategies are designed to identify compounds that are bactericidal or bacteriostatic to a minimal number of bacterial strains [102], and thus provide precision treatments, based on druggable small molecules and natural compounds, that are bacterial strain specific (Table 2). By preserving patients' microbiota, this microbiome-sparing strategy of identifying pathogen-specific targets has the potential to improve patient health during and after bacterial infections [48, 65, 102].

Table 2 (rows recovered here; columns: body site | host | agent | reference)
Nasopharynx | Mouse | Natural compound produced by S. lugdunensis (lugdunin) | [178]
Skin | Human | Natural compound produced by Staphylococcus epidermidis (succinic acid) | [179]

Identification and characterization of infection-causing microorganisms are crucial for successful treatment, recovery, and safety of patients. Culture fails to detect an organism in 80% of cases in which a patient has signs and/or symptoms of infection. This under-detection has multiple causes, including: antibiotic treatment; bacteria growing as biofilms; and slow-growing, fastidious organisms that cannot be cultured or literally take weeks to culture [2, 64, . Conventional clinical diagnostic culture methods are biased toward the 2% of microorganisms able to grow rapidly in standard culture media. Particularly for chronic or biofilm-related infections, these cultured bacteria are usually presumed to be relevant. However, we now know that in many chronic infections the organisms that grow out rapidly are often not representative of the species that are driving the infection [120-122, 130-141]. In part, this understanding is based on the reduction in time to closure of chronic wounds when patients are treated based on the results of molecular diagnostics and WGS as opposed to culture [130-133].
A meta-analysis currently ongoing in the authors' laboratories, covering multiple studies of multiple infectious diseases over more than a decade in which multiple molecular diagnostic methods were used to diagnose over 7000 infections across a wide range of clinical presentations and anatomic sites, has demonstrated that under the best conditions culture would detect S. aureus ∼50% of the times it was actually present, and could detect other staphylococcal and most streptococcal species ∼20% of the time. For nearly all other pathogens the detection rate was less than 20%, and for anaerobes such as Cutibacterium acnes (formerly Propionibacterium acnes) and multiple Treponema spp. and Prevotella spp. it was far less. Culture is also inadequate to rapidly and accurately distinguish among multiple strains and sub-strains of even the most common pathogens, in part because they often evolve rapidly through HGT mechanisms during the infectious process [7, 16, 20, 25, 28, 68, 142]. Despite its shortcomings, culture is entrenched as the primary diagnostic technique for the identification and characterization of bacterial pathogens. Nucleic acid amplification techniques (principally PCR) overcame many of these shortcomings, and DNA sequencing methodologies, such as multilocus sequence typing (MLST), were a step in the right direction, but they lack the resolution needed for strain-specific diagnostics [63, 143-145]. Thus, there was a need for better molecular diagnostic tools to improve the accuracy and efficiency of diagnoses. WGS promises to provide the ultimate in resolution for strain, and even sub-strain, identification, as it takes into account the genomic plasticity and supragenomes of bacterial species [63, 142] and can be used to identify novel strains as they appear. The superior resolution of WGS in the identification and characterization of pathogens has great potential for routine use in diagnostic laboratories (e.g. [143, 144]).
However, despite this promise, WGS has not completely superseded current diagnostic methods in most clinical microbiology laboratories. There are several obstacles to its routine implementation, including the high cost of WGS, a lack of training in bioinformatics among clinical microbiologists, a lack of the necessary computational infrastructure in most hospitals, and the difficulty of establishing the proper bioinformatic protocols [144]. Much of this can be overcome through the use of standardized or centralized 'cloud-based' computational systems. The European Society of Clinical Microbiology and Infectious Diseases (ESCMID) has recently published a review covering the need for such systems and a framework for their implementation [143, 145]. As with any system of analysis, appropriate technical requirements need to be instituted to avoid the pitfalls associated with poor-quality sequencing, which can lead to low coverage and lost sequences. Doyle et al. [146] have highlighted the need for high-quality sequence data in clinical diagnostics, showing that WGS-based antibiotic resistance calls were missed because of short-read, low-coverage data. This has led to the push to adopt long-read, and even circular consensus, sequencing, which obviate these issues [37, 147, 148].

For over a century, infectious diseases have been controlled by vaccination and the administration of antibiotics [101]. Nevertheless, pathogenic microorganisms remain a leading threat to health worldwide [149]. Conventional vaccinology approaches were successful in conferring protection against some but not all infectious diseases [149]. The vaccine counterpart to antibiotic resistance, 'vaccine escape', is partly responsible for vaccine failure. Vaccine escape implies that the target pathogen has mutated such that it no longer expresses the same form of the antigen as used to prepare the vaccine.
This type of vaccine failure is completely different from those associated with host-based 'immunological windows', wherein antigen presentation does not induce a memory immune response to the vaccine antigen. Until very recently, essentially all vaccines were based on killed or live-attenuated microorganisms, or their chemically deactivated toxins [101, 149]. With the genomic era, we and others developed the concept of the supragenome/pan-genome [8-12, 14, 18], and from this came the recognition that distributed/accessory genes associated with virulence could be used to target specific pathogenic strains in a microbiome-sparing approach that would not result in the elimination of commensal strains [65, 71, 149]. Conversely, the pan-genome can be used to identify core genes that are universally present among all strains if the wish is to eliminate a bacterial species in its entirety. These potential vaccine antigens, whether core or distributed, are often identified in a reverse manner, starting from an analysis of the supragenome/pan-genome of a species as opposed to the use of a single whole organism, a process called reverse vaccinology [150, 151]. Reverse vaccinology makes use of bioinformatics and, for microbiome-sparing approaches, statistical methods that use strain meta-data on clinical provenance (i.e. commensal or pathogen) [65, 71], to predict potential vaccine candidates in silico from the supragenome/pan-genome of the target bacterial species. The first application of reverse vaccinology was for a vaccine against serogroup B Neisseria meningitidis [152]. For a vaccine against the more complex Streptococcus agalactiae, reverse vaccinology using the supragenome/pan-genome was employed for the first time [153].
Somewhat later, reverse vaccinology was used to compare pathogenic and nonpathogenic strains of the same species to find antigens that truly affect pathogenesis. These successes led to the application of reverse vaccinology to other pathogens, including N. gonorrhoeae and Mycoplasma pneumoniae [154, 155]. It is important to point out that reverse vaccinology and microbiome-sparing vaccine approaches are still susceptible to the evolution of vaccine escape mutants. Thus, it will always be necessary to constantly surveil the circulating pathogen population for the evolution of mutants.

The vast majority (>99%) of microbial species in the biosphere cannot be cultured in the laboratory with current culturing methods and thus are contained within what has been referred to as the microbial 'dark matter' [156-158]. This inability to culture them has limited our ability to understand the biology of these organisms. Microbiologists have traditionally studied populations of bacterial cells, typically using millions to billions of cells for analysis in bulk, rather than individual cells, as it has been assumed that individual cells are representative of the population. However, this assumption neglects any heterogeneity present in the population [159-161]. The behavior of single cells, particularly in spatially and taxonomically complex assemblages, can be substantially different from that of the whole population; thus, conclusions based on average molecular or phenotypic measurements of a population can be biased, as the patterns of subpopulations would not be revealed [162]. The recent development of single-cell meta-omics has greatly enhanced our understanding of the individuality and heterogeneity of microbes in multiple biological systems [162]. Single-cell omic technologies (genomics, transcriptomics, metabolomics) help reveal this hidden information from both unculturable organisms and low-abundance organisms in complex microbial communities [161].
These technologies are providing new perspectives with regard to our understanding of population diversity by bringing the power of meta-omics [163] to the single-cell level for studies of taxonomically and metabolically complex biofilms, microbiomes, and holobionts [60-62]. The ability to comprehensively characterize single cells or small populations of cells within a more complex system is revolutionizing our understanding of bacterial metabolic differentiation and how this contributes to the robustness of the biology of microbiomes and their holobionts [60-62, 160]. Single-cell sequencing (SCS) is one of the tools used in single-cell omic studies and complements metagenomic deep-sequencing methods [161]. The three main applications of single-cell omics in relation to bacterial populations are to: (1) investigate the genomes of unculturable microorganisms; (2) delineate cell-to-cell diversity within diverse populations [159-162]; and (3) compare the transcriptional activities of genomically identical cells based on their spatial orientations with respect to nutrient availability and access to the substrate. SCS has been applied in widely diverse biological and environmental contexts, including human microbiomes (e.g. [156, 164]), seawater and marine sediments (e.g. [165-167]), and even a hospital sink [168, 169].

The advent of the concepts of the supragenome/pan-genome and the DGH has not only revolutionized our understanding of bacterial genomics, evolution and adaptability, but has also provided the framework for novel approaches to diagnosis, precision medicine, and vaccinology. As the scientific community continues to expand upon the past 20 years' accomplishments in this field, humanity stands to reap substantial rewards with regard to personalized medicine and public health.
• The DGH and its prediction of the bacterial supragenome/pan-genome, together with the biofilm paradigm, have resulted in the formation of a new rubric, bacterial plurality. Bacterial plurality encompasses the concepts that persistent bacterial infections require both genotypic and metabolic heterogeneity, as well as evolution in situ, to explain what had previously been paradoxical findings associated with chronic infections.
• Horizontal gene transfer (HGT) mechanisms provide the engine for robust and continuous recombination among bacteria, which enables continuous strain evolution during polyclonal colonizations and chronic infections as a means to adapt to changing environmental and host immune pressures.
• The realization that there are differences in gene content and gene expression between commensal and pathogenic strains of the same species provides for specific targeting of the pathogenic strains in the design of drugs and vaccines, resulting in microbiome-sparing approaches.

The authors declare no competing interests.

[1] Bacterial plurality as a general mechanism driving persistence in chronic infections
[2] What role do periodontal pathogens play in osteoarthritis and periprosthetic joint infections of the knee?
[3] Population-level virulence factors amongst pathogenic bacteria: relation to infectious outcome
[4] Carriage of multiple ribotypes of non-encapsulated Haemophilus influenzae in aboriginal infants with otitis media
[5] Nonencapsulated Haemophilus influenzae in Aboriginal infants with otitis media: prolonged carriage of P2 porin variants and evidence for horizontal P2 gene transfer
[6] Simultaneous respiratory tract colonization by multiple strains of nontypeable Haemophilus influenzae in chronic obstructive pulmonary disease: implications for antibiotic therapy
[7] The distributed gene hypothesis as a rubric for understanding evolution in situ during chronic bacterial biofilm infectious processes
[8] Role for biofilms in infectious disease
[9] Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome"
[10] Identification, distribution, and expression of novel genes in 10 clinical isolates of nontypeable Haemophilus influenzae
[11] Characterization, distribution, and expression of novel genes among eight clinical isolates of Streptococcus pneumoniae
[12] Extensive genomic plasticity in Pseudomonas aeruginosa revealed by identification and distribution studies of novel genes among clinical isolates
[13] The microbial pan-genome
[14] Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains
[15] The bacterial pan-genome: a new paradigm in microbiology
[16] Comparative genomic analyses of 17 clinical isolates of Gardnerella vaginalis provide evidence of multiple genetically isolated clades consistent with subspeciation into genovars
[17] Molecular and translational research approaches for the study of bacterial pathogenesis in otitis media
[18] Structure and dynamics of the pan-genome of Streptococcus pneumoniae and closely related species
[19] Comparative supragenomic analyses among the pathogens Staphylococcus aureus, Streptococcus pneumoniae, and Haemophilus influenzae using a modification of the finite supragenome model
[20] Generation of genic diversity among Streptococcus pneumoniae strains via horizontal gene transfer during a chronic polyclonal pediatric infection
[21] Pneumococcal genome sequencing tracks a vaccine escape variant formed through a multi-fragment recombination event
[22] Comprehensive identification of single nucleotide polymorphisms associated with β-lactam resistance within pneumococcal mosaic genes
[23] Lateral gene transfer, bacterial genome evolution, and the Anthropocene
[24] Dynamics and impact of homologous recombination on the evolution of Legionella pneumophila
[25] Antagonistic pleiotropy in the bifunctional surface protein fadL (OmpP1) during adaptation of Haemophilus influenzae to chronic lung infection associated with chronic obstructive pulmonary disease
[26] Haemophilus influenzae genome evolution during persistence in the human airways in chronic obstructive pulmonary disease
[27] Ehrlich GD. Construction and characterization of a highly redundant Pseudomonas aeruginosa genomic library prepared from 12 clinical isolates: application to studies of gene distribution among populations
[28] Deletion and acquisition of genomic content during early stage adaptation of Pseudomonas aeruginosa to a human host environment
[29] Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: insights into the pneumococcal supragenome
[30] Comparative analysis and supragenome modeling of twelve Moraxella catarrhalis clinical isolates
[31] Comparative genomic analyses of the Moraxella catarrhalis serosensitive and seroresistant lineages demonstrate their independent evolution
[32] In situ detection of high levels of horizontal plasmid transfer in marine bacterial communities
[33] Interspecies bacterial conjugation by plasmids from marine environments visualized by gfp expression
[34] Sexual isolation in bacteria
[35] Recombination and the nature of bacterial speciation
[36] Sexual isolation in Acinetobacter baylyi is locus-specific and varies 10,000-Fold over the genome
[37] Natural competence and the evolution of DNA uptake specificity
[38] Large-scale genomics reveals the genetic characteristics of seven species and importance of phylogenetic distance for estimating pan-genome size
[39] Gene recombination in E. coli
[40] Transduction in Escherichia coli K-12
[41] Transductional heterogenotes in Escherichia coli
[42] Haemophilus influenzae: genetic variability and natural selection to identify virulence factors
[43] Sequence and functional analyses of Haemophilus spp. genomic islands
[44] Novel type IV secretion system involved in propagation of genomic islands
[45] The Selfish Gene
[46] Transformation of natural genetic variation into Haemophilus influenzae genomes
[47] Defining the DNA uptake specificity of naturally competent Haemophilus influenzae cells
[48] The bacterial guide to designing a diversified portfolio
[49] Competence-programmed predation of noncompetent cells in the human pathogen Streptococcus pneumoniae: genetic requirements
[50] New insights into the pneumococcal fratricide: relationship to clumping and identification of a novel immunity factor
[51] Competence-induced fratricide in streptococci
[52] Interbacterial predation as a strategy for DNA acquisition in naturally competent bacteria
[53] Population genomics of early events in the ecological differentiation of bacteria
[54] Structure of the bacterial sex F pilus reveals an assembly of a stoichiometric protein-phospholipid complex
[55] Genome expansion in early eukaryotes drove the transition from lateral gene transfer to meiotic sex
[56] Horizontal gene transfer: essentiality and evolvability in prokaryotes, and roles in evolutionary transitions. F1000Res 5, F1000. Faculty Rev-1805
[57] Evolution: the end of an ancient asexual scandal
[58] Genetic exchange among bdelloid rotifers is more likely due to horizontal gene transfer than to meiotic sex
[59] Developing insights into the mechanisms of evolution of bacterial pathogens from whole-genome sequences
[60] Beiträge zur Theorie der Evolution der Organismen. I. Das typologische Grundgesetz und seine Folgerungen für Phylogenie und Entwicklungsphysiologie [Contributions to the evolutionary theory of organisms: I. The basic typological law and its implications for
[61] Symbiogenesis and symbionticism
[62] What makes pathogens pathogenic?
[63] The time is now for gene- and genome-based bacterial diagnostics: "you say you want a revolution"
[64] Codon usage comparison of novel genes in clinical isolates of Haemophilus influenzae
[65] Identification and characterization of msf, a novel virulence factor in Haemophilus influenzae
[66] Selfish operons and speciation by gene transfer
[67] Bacterial toxin-antitoxin systems: more than selfish entities?
[68] Phenotypic diversity and genotypic flexibility of Burkholderia cenocepacia during long-term chronic infection of cystic fibrosis lungs
[69] New human gene tally reignites debate
[70] Development and validation of an Haemophilus influenzae supragenome hybridization (SGH) array for transcriptomic analyses
[71] Identification and Characterization of a Novel Bacterial Virulence Factor in Haemophilus Influenzae (Doctoral dissertation)
[72] Design and validation of a supragenome array for determination of the genomic content of Haemophilus influenzae isolates
[73] Complete genome sequence of Haemophilus influenzae strain 375 from the middle ear of a pediatric patient with otitis media
[74] Differences in genotype and virulence among four multidrug-resistant Streptococcus pneumoniae isolates belonging to the PMEN1 clone
[75] In vivo capsular switch in Streptococcus pneumoniae - analysis by whole genome sequencing
[76] Streptococcus pneumoniae supragenome hybridization arrays for profiling of genetic content and gene expression
[77] Genetic stabilization of the drug-resistant PMEN1 pneumococcus lineage by its distinctive DpnIII restriction-modification system
[78] Microbial genome-wide association studies: lessons from human GWAS
[79] The advent of genome-wide association studies for bacteria
[80] Host and microbiome genome-wide association studies: current state and challenges
[81] Host-bacterial symbiosis in health and disease
[82] Pathobionts of the gastrointestinal microbiota and inflammatory disease
[83] The new era of treatment for obesity and metabolic disorders: evidence and expectations for gut microbiome transplantation
[84] Metagenome-wide association studies: fine-mining the microbiome
[85] The mammalian intestinal microbiome: composition, interaction with the immune system, significance for vaccine efficacy, and potential for disease therapy
[86] Fusobacterium nucleatum induces premature and term stillbirths in pregnant mice: implication of oral bacteria in preterm birth
[87] Term stillbirth caused by oral Fusobacterium nucleatum
[88] Fecal transplants as a microbiome-based therapeutic
[89] Effectiveness of fecal-derived microbiota transfer using orally administered capsules for recurrent Clostridium difficile infection
[90] Effect of fecal microbiota transplantation on 8-week remission in patients with ulcerative colitis: a randomized clinical trial
[91] Fecal microbial transplantation and its potential application in cardiometabolic syndrome
[92] Weight gain after fecal microbiota transplantation
[93] Alert: Fecal microbiota for transplantation: safety communication - risk of serious adverse reactions due to transmission of multidrug-resistant organisms
[94] Sequencing and beyond: integrating molecular 'omics' for microbial community profiling
[95] Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle
[96] The skin microbiome
[97] The human skin microbiome
[98] Pilot study on novel skin care method by augmentation with Staphylococcus epidermidis, an autologous skin microbe - a blinded randomized clinical trial
[99] Transplantation of human skin microbiota in models of atopic dermatitis
[100] Skin microbiome modulation induced by probiotic solutions
[101] The pan-genome: towards a knowledge-based discovery of novel targets for vaccines and antibacterials
[102] Precision antimicrobial therapeutics: the path of least resistance?
[103] Broad-host-range IncP-1 plasmids and their resistance potential. Front. Microbiol. 4, 44
[104] Mobile gene cassettes and integrons: moving antibiotic resistance genes in gram-negative bacteria
[105] A search for medications to treat COVID-19 via in silico molecular docking models of the SARS-CoV-2 spike glycoprotein and 3CL protease
[106] The development of a pipeline for the identification and validation of small-molecule RelA inhibitors for use as anti-biofilm drugs
[107] Active starvation responses mediate antibiotic tolerance in biofilms and nutrient-limited bacteria
[108] Detection of Streptococcus pneumoniae in whole blood by PCR
[109] Identification of a patient with Streptococcus pneumoniae bacteremia and meningitis by the polymerase chain reaction (PCR)
[110] Molecular analysis of bacterial pathogens in otitis media with effusion
[111] PCR-based detection of bacterial DNA after antimicrobial treatment is indicative of persistent, viable bacteria in the chinchilla model of otitis media
[112] Evidence of bacterial metabolic activity in culture-negative otitis media with effusion
[113] Direct detection of bacterial biofilms on the middle-ear mucosa of children with chronic otitis media
[114] Biofilms and chronic infections
[115] Chronic surgical site infection due to suture-associated polymicrobial biofilm
[116] Direct demonstration of Staphylococcus biofilm in an external ventricular drain in a patient with a history of recurrent ventriculoperitoneal shunt failure
[117] Pathogenic biofilms in adenoids: a reservoir for persistent bacteria
[118] Demonstration of Bacillus cereus in Peri-implant infection using a multi-primer PCR-mass spectrometric assay: report of two cases
[119] Characterization of a mixed MRSA/MRSE biofilm in an explanted total ankle arthroplasty
[120] Characterization of bacterial communities in venous insufficiency wounds by use of conventional culture and molecular diagnostic methods
[121] Successful identification of pathogens by polymerase chain reaction (PCR)-based electron spray ionization time-of-flight mass spectrometry (ESI-TOF-MS) in culture-negative periprosthetic joint infection
[122] Comparison of PCR/electron spray ionization-time-of-flight-mass spectrometry versus traditional clinical microbiology for detection of organisms contaminating high-use surfaces in a burn unit, an orthopaedic ward and healthcare workers
[123] Culture-Negative Orthopedic Biofilm Infections, Springer Verlag Series on Biofilms. 144 pages, 47 illustrations
[124] The microbiome of chronic rhinosinusitis: culture, molecular diagnostics and biofilm detection
[125] Detection of methicillin-resistant and methicillin-susceptible Staphylococcus aureus colonization of healthy military personnel by traditional culture, PCR, and mass spectrometry
[126] Can we trust intraoperative culture results in nonunions?
[127] And the MAPP research network. Search for microorganisms in men with urologic chronic pelvic pain syndrome: a culture-independent analysis in the MAPP research network
[128] Synovial fluid analysis using PCR-ESI-TOF-MS for detection of bacterial and fungal pathogens in native knee arthritis
[129] Bacterial diversity in surgical site infections: not just aerobic cocci any more
[130] The polymicrobial nature of biofilm infection
[131] Analysis of the chronic wound microbiota of 2,963 patients by 16S rDNA pyrosequencing
[132] Microbiota is a primary cause of pathogenesis of chronic wounds
[133] Recommendations for the management of biofilm: a consensus document
[134] The prevalence of biofilms in chronic wounds: a systematic review and meta-analysis of published data
[135] Consensus guidelines for the identification and treatment of biofilms in chronic nonhealing wounds
[136] Wound biofilm: current perspectives and strategies on biofilm disruption and treatments
[137] Temporal dynamics of relative abundances and bacterial succession in chronic wound communities
[138] Biofilms cause chronic infections
[139] Defying hard-to-heal wounds with an early antibiofilm intervention strategy: 'wound hygiene'
[140] Aerococcus urinae and Globicatella sanguinis persist in polymicrobial urethral catheter biofilms examined in longitudinal
profiles at the proteomic level Pan-genome analysis provides much higher strain typing resolution than does MLST Application of next generation sequencing in clinical microbiology and infection prevention From theory to practice: translating whole-genome sequencing (WGS) into the clinic ESCMID study group for genomic and molecular diagnostics (ESGMD). practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology Discordant bioinformatic predictions of antimicrobial resistance from whole-genome sequencing data of bacterial isolates: an inter-laboratory study Species-level bacterial community profiling of the healthy sinonasal microbiome using pacific biosciences sequencing of full-length 16S rRNA genes Interaction between the microbiome and TP53 in human lung cancer Genome-based approaches to develop vaccines against bacterial pathogens Reverse vaccinology, a genome-based approach to vaccine development Reverse vaccinology 2.0: Human immunology instructs vaccine antigen design Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing Identification of a universal group B streptococcus vaccine by multiple genome screen Integrated bioinformatic analyses and immune characterization of new Neisseria gonorrhoeae vaccine antigens expressed during natural mucosal infection Reverse vaccinology and subtractive genomics reveal new therapeutic targets against Mycoplasma pneumoniae: a causative agent of pneumonia Dissecting biological "dark matter" with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth Insights into the phylogeny and coding potential of microbial dark matter Multi-informatic Approaches to Identifying OM-specific Virulence Genes from within the Bacterial Genomic Dark Matter Advances and applications of single cell sequencing technologies Single-cell genome sequencing: current state of the science Single cell sequencing: a distinct new field Tools for 
genomic and transcriptomic analysis of microbes at the single-cell level Meta-omic characterization of the marine invertebrate microbial consortium that produces the chemotherapeutic natural product ET-743 Culturing of 'unculturable' human microbiota reveals novel taxa and extensive sporulation Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to deepwater horizon oil spill Single-cell (meta-)genomics of a dimorphic Candidatus Thiomargarita nelsonii reveals genomic plasticity Single-cell genomics based on Raman sorting reveals novel carotenoid-containing bacteria in the Red Sea Genome of the pathogen Porphyromonas gingivalis recovered from a biofilm in a hospital sink using a high-throughput single-cell genomics platform Candidate phylum TM6 genome recovered from a hospital sink biofilm provides genomic insights into this uncultivated phylum Dissecting vancomycin-intermediate resistance in Staphylococcus aureus using genome-wide association Dense genomic sampling identifies highways of pneumococcal recombination Genetic architecture of artemisinin-resistant Plasmodium falciparum Predicting the virulence of MRSA from its genome sequence Distinct phenotypic and genomic signatures underlie contrasting pathogenic potential of Staphylococcus epidermidis clonal lineages Combinatorial small-molecule therapy prevents uropathogenic Escherichia coli catheter-associated urinary tract infections in mice The Antiadhesive strategy in Crohn's disease: orally active mannosides to decolonize pathogenic Escherichia coli from the gut Glycomimetic, orally bioavailable LecB inhibitors block biofilm formation of Pseudomonas aeruginosa Human commensals producing a novel antibiotic impair pathogen colonization Propionibacterium acnes is developing gradual increase in resistance to oral tetracyclines The authors would like to thank Ms. Carol Hope, MBA, for administrative support in the preparation and submission of this paper. 
We are also grateful to Josh Earl for his principal role in the development and implementation of much of the bioinformatic software that underlies so much of the data discussed herein. Finally, we thank the entire faculty and staff of the Center for Genomic Sciences and the Center for Advanced Microbial Processing for generating much of the data that is summarized in this review.

This work was supported by Drexel University College of Medicine; the Oskar Fisher Project, a gift from Dr. James Truchard; the Bill and Marion Cook Foundation; and NIH R01 DC-02148 and NIH U01 DK-082316 to GDE.

Abbreviations: CSP, competence stimulating peptide; DGH, distributed genome hypothesis; FMT, fecal microbial transplants; HGT, horizontal gene transfer; MLST, multiple-locus sequence typing; PCR, polymerase chain reaction; SCS, single-cell sequencing; WGS, whole-genome sequencing.