key: cord-0843919-7ac3edsl authors: Castro-Nallar, Eduardo; Pérez-Losada, Marcos; Burton, Gregory F.; Crandall, Keith A. title: The evolution of HIV: Inferences using phylogenetics date: 2011-11-26 journal: Mol Phylogenet Evol DOI: 10.1016/j.ympev.2011.11.019 sha: 59bf0d7138b4f41dc29fd6c62c3f3b87f2b3f57c doc_id: 843919 cord_uid: 7ac3edsl Molecular phylogenetics has revolutionized the study of not only evolution but also disparate fields such as genomics, bioinformatics, epidemiology, ecology, microbiology, molecular biology and biochemistry. Particularly significant are its achievements in population genetics as a result of the development of coalescent theory, which have contributed to more accurate model-based parameter estimation and explicit hypothesis testing. The study of the evolution of many microorganisms, and HIV in particular, have benefited from these new methodologies. HIV is well suited for such sophisticated population analyses because of its large population sizes, short generation times, high substitution rates and relatively small genomes. All these factors make HIV an ideal and fascinating model to study molecular evolution in real time. Here we review the significant advances made in HIV evolution through the application of phylogenetic approaches. We first examine the relative roles of mutation and recombination on the molecular evolution of HIV and its adaptive response to drug therapy and tissue allocation. We then review some of the fundamental questions in HIV evolution in relation to its origin and diversification and describe some of the insights gained using phylogenies. Finally, we show how phylogenetic analysis has advanced our knowledge of HIV dynamics (i.e., phylodynamics). AIDS is one of the most serious modern diseases (Leeper and Reddi, 2010; UNAIDS, 2010) and the Human Immunodeficiency Virus (HIV) is the causative agent ( Barre-Sinoussi et al., 1983; Gallo et al., 1983; Popovic et al., 1984; Sarngadharan et al., 1984) . 
According to World Health Organization (WHO), 33.4 million people (31.1-35.8 million) were living with HIV worldwide as of 2009. In the same year, 2 million infected people died and the disease grew at a rate of 7400 new infections per day, more than 97% of which occurred in low-and middle-income countries. To date, sub-Saharan Africa is the most affected region on earth, with 67% of the world's HIV infections (UNAIDS, 2010) . Host restriction factors in the form of proteins such as TRIM5/22, APOBEC, and Tetherin have been to some extent ineffective in blocking early HIV-1 infection (Neil et al., 2008; Sakuma et al., 2007; Stopak et al., 2003; Tissot and Mechti, 1995) . In addition, Highly Active Antiretroviral Therapy (HAART) has been very effective at reducing viral loads within patients and thereby significantly prolonging life expectancy for HIV infected individuals, particularly in those countries where HAART is accessible. However, even when HAART is available, effective control remains elusive due to the number of evolved mechanisms that HIV uses to evade the host immune system (Fischer et al., 2010; Wei et al., 2003) , the evolution of drug resistance (Price et al., 2011; Shi et al., 2010) , and the isolation of viral reservoirs from drug treatments (Chomont et al., 2009; Finzi et al., 1997) . The extraordinary genetic diversity observed among circulating populations of HIV has hampered the development of a vaccine to provide immunity or control of AIDS. Despite much research, phase III trials of HIV vaccines have been few and have failed to provide full protection, which is probably due to the extensive variation of HIV isolates (McBurney and Ross, 2008) . Although some partial protection was observed in the Thai RV144 Phase III HIV vaccine trial (Rerks-Ngarm et al., 2009) , an effective vaccine against HIV remains elusive. 
While our knowledge of HIV biology is still limited, we have gained significant insights through the application of phylogenetics to HIV diversity. For example, phylogenetic analyses have elucidated the origins of the HIV-1 and -2 epidemics (Gao et al., 1994, 1999; Korber et al., 2000; Lemey et al., 2003; Salemi et al., 2001; Sharp et al., 2000), the relationships of HIV to other simian lentiviruses (Essex, 1994; Gao et al., 1992; Wertheim and Worobey, 2007), and the classification of HIV diversity within HIV-1 (Kosakovsky Pond et al., 2009). Cross-species transmissions (Beer et al., 1999; Gao et al., 1999; Hahn et al., 2000; Plantier et al., 2009; Takehisa et al., 2009; Wertheim and Worobey, 2007; Worobey et al., 2004) have been identified and characterized through the use of phylogenetic approaches. Such methods have been used to test hypotheses of transmission events of HIV between individuals (Hillis and Huelsenbeck, 1994; Leitner et al., 1996; Xin et al., 1995), including their use as evidence of transmission in legal settings (Bernard et al., 2007; Crandall, 1995; Metzker et al., 2002; Ou et al., 1992). Phylogenetics has been key to the identification of drug-resistance mutational pathways (Buendia et al., 2009; Crandall et al., 1999) and the mechanisms of drug resistance (Carvajal-Rodríguez et al., 2008; Lemey et al., 2005a; Machado et al., 2009). Moreover, phylogenetic approaches have been used to assess within- and among-host HIV diversity and population dynamics (i.e., phylodynamics) (Grenfell et al., 2004), co-divergence (Beer et al., 1999; Bibollet-Ruche et al., 2004; Chen et al., 1996; Wertheim and Worobey, 2007), and the role of recombination in the diversification process (Carvajal-Rodríguez et al., 2007; Jobes et al., 2006; Schlub et al., 2010).
This exceptional diversity has been examined to infer geographical distribution and dispersion patterns Robbins et al., 2003) and thereby test hypotheses associated with molecular epidemiology (Holmes et al., 1995; Salemi et al., 2008) . Phylogenetics has been key to identifying patterns and mechanisms of natural selection (Lemey et al., 2007; Pond et al., 2008; Poon et al., 2010; Templeton et al., 2004) , including the intra-and inter-host adaptive forces that shape the evolution of the virus (Carvajal-Rodríguez et al., 2008; Keele et al., 2008a; Salazar-González et al., 2008; Shankarappa et al., 1999) , which is essential for effective vaccine development (Frahm et al., 2008) . The first applications of phylogenetics to the study of HIV date from the early 1990s and were aimed at inferring the origins of HIV-1 and the classification of HIV into different types (1 and 2), groups (M, N, O within HIV-1), and subtypes (A-D, F-H, and J and K within Group M of HIV-1) (Huet et al., 1990; Fig. 1) . Today, phylogenetic analysis has become a common practice of many HIV/AIDS research programs, due mainly to the many insights these analyses can provide and the novel questions they can address over a variety of topics related to HIV biology. Over the past two decades, HIV data have accumulated rapidly in public and specialized databases thereby creating one of the richest datasets we have for a single entity in terms of sequence tallies and epidemiological information (e.g., sampling locality, drug resistance mutations, tissue allocation, etc.). For instance, the number of available sequences in the Los Alamos database (http://www.hiv.lanl.gov) has exploded to 339,306 sequences, a 45% increase over the preceding year, with 2576 complete genomes (Kuiken et al., 2010) . Here, we review the application of phylogenies to the study of HIV. 
Although, because of the nature of the subject, no article-size review can be comprehensive, we hope to show instructively how phylogenetic approaches have influenced our current understanding of the emergence, evolution of drug resistance, epidemiology and dynamics of HIV and assist with the problem of its eradication (Grenfell et al., 2004; Holmes and Grenfell, 2009; Stack et al., 2010). The defining feature of HIV is its exceptional genetic diversity. This high diversity stems from at least four sources: high substitution rates, a rather small genome, short generation times, and high recombination frequency. HIV-1 substitution rates [0.002 substitutions/site/year (Korber et al., 2000)] are related, among other factors, to high mutation rates, which in HIV-1 have been estimated to be 0.1-0.2 mutations/genome/generation (Mansky and Temin, 1995), that is, 33 times more than Neurospora crassa, but around 10-fold lower than that of Influenza A virus (Drake et al., 1998). The HIV genome, like that of other RNA viruses, is small, at 9.8 kb in length (exceptions are coronaviruses and roniviruses, with >25 kb; see Fig. 2 for genome structure). Genome size in HIV, as in other RNA viruses, is apparently limited by the error-prone nature of its replication machinery: the longer the genome, the more mutations produced, most of them being deleterious (Holmes, 2009). In terms of HIV diversity, such a small genome size impacts on generation time (1.2 days for HIV-1), with 10^10 virions produced daily in an infected individual. In addition to the staggering numbers above, HIV-1 recombines at a frequency of 1-3 recombination events/genome/generation (Jetzt et al., 2000; Shriner et al., 2004). Altogether, this represents a tremendous amount of raw material for evolution. However, variation is distributed unevenly across the genome.
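As a rough worked illustration of the figures quoted above, the following sketch converts the per-site substitution rate into an expected number of substitutions per genome per year, and the generation time into generations per year. The function names are hypothetical, and the calculation assumes a uniform rate across the whole genome, a simplification given that variation is distributed unevenly.

```python
# Back-of-the-envelope arithmetic using figures quoted in the text.
# Assumption: one uniform substitution rate applies to every site.

SUBS_PER_SITE_PER_YEAR = 0.002   # HIV-1 substitution rate (Korber et al., 2000)
GENOME_LENGTH_BP = 9800          # 9.8 kb genome
GENERATION_TIME_DAYS = 1.2       # HIV-1 generation time


def substitutions_per_genome_per_year(rate=SUBS_PER_SITE_PER_YEAR,
                                      length_bp=GENOME_LENGTH_BP):
    """Expected substitutions accumulated across the genome in one year."""
    return rate * length_bp


def generations_per_year(generation_days=GENERATION_TIME_DAYS,
                         days_per_year=365.0):
    """Number of viral generations that fit into one year."""
    return days_per_year / generation_days
```

Under these assumptions the genome accumulates roughly 20 substitutions per year over about 300 viral generations, which gives a sense of scale for the diversity discussed here.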
The HIV-1 genome is composed of three main genes, gag, pol and env, plus the accessory genes tat, rev, vif, vpr, vpu and nef, flanked by Long-Terminal Repeats (LTRs). All genes are coded over the three forward reading frames, including frame shifts in the case of tat and rev (Fig. 2). Variation in substitution rates is known to occur genome-wide and within specific genes, which has been interpreted as evidence of functional and structural constraints acting upon nucleotide sequences (Ngandu et al., 2008; Pond and Muse, 2005). As described below, this variation leads to divergent patterns in HIV evolution, whether we look at population or within-host data, with consequences for disease progression, natural selection, and drug resistance.
Fig. 1. Phylogenetic tree representation of HIV-1 recombinants and discrete subtypes. Note the lack of genetic structure due to the presence of recombinant sequences. Also note that the existence of discrete subtypes is questionable due to the high evolutionary rates the virus exhibits. This tree should be regarded as a snapshot of part of the observable diversity. CRF = circulating recombinant form; cpx = complex recombinant pattern. A-D, F-H, J and K denote HIV-1 subtypes.
Phylogenies allow researchers to determine patterns of the extensive genetic diversity of HIV to examine human-scale ecological and epidemiological processes. Divergent patterns of HIV evolution are observed when comparing intra- and inter-host phylogenies. Ladder-like intra-host phylogenies are evidence of continuous immune-driven selection [similar to inter-host influenza phylogenies (Buonagurio et al., 1986; Bush et al., 1999)], in which there is no high genetic diversity at any given time point; rather, there are few lineages with sequential replacement of strains over time. In contrast, inter-host phylogenies do not exhibit this pattern; instead multiple lineages coexist at any given time.
This is probably the product of major bottlenecks at transmission (Brown, 1997) . Whether drift, selection or both govern these bottlenecks is not clear (Edwards et al., 2006; Keele et al., 2008a; Salazar-González et al., 2008) , but the impact of random genetic drift on the population dynamics, genetic diversity, and clinical outcome is well studied (Tazi et al., 2011) . Thus, HIV-1 possesses intrinsic mutational properties that prompt it to exhibit different patterns of substitution depending upon whether we look at within-host or inter-host genetic data. The relationship between disease progression and substitution rates (or genetic diversity) was recognized early in HIV studies. The hypothesis is that host processes that determine HIV pathogenesis, also determine viral replication rates. Thus, by looking at substitution rates we can infer what is happening with HIV within patients. In a thorough study, Shankarappa et al. (1999) monitored nine patients over a 6-12 years period. They distinguished three phases of disease progression associated with diversity in 600 bp of env. In the first phase, a linear increase of diversity was associated with initial features of infection and X5 HIV populations. During the second phase, diversity leveled off or even decreased, which was correlated with the appearance of X4 HIV populations. Finally, the third phase was related to a decline in CD4 + T cells and with the failure of T cell homeostasis and a reduction in diversity. Although, these results might not be general for other patients, as these may have been infected with phenotypically different strains (different rates), and different genes may generate different patterns, other studies have somewhat supported this hypothesis. Using the previous dataset along with others, Lemey et al. (2007) found a positive association between disease progression and HIV-1 synonymous substitution rates. 
When analyzing HIV-2 sequences, they observed an overall low substitution rate that might reflect the reduced virulence that this viral type exhibits. On the other hand, non-synonymous rate changes have been associated with immune pressure. Consequently, a decrease in selective pressure would correlate with the breakdown of the immune system. However, with a different approach and on a different dataset, Carvajal-Rodríguez et al. (2008) reported no relationship between disease progression and substitution rates when analyzed separately into adaptive and neutral categories of variation. An opposite pattern of synonymous and non-synonymous substitutions were observed through time when dividing the dataset in rapid (RP) and non-rapid progressors (NRP), with RP showing a slow increase in non-synonymous substitutions and NRP showing a fast increase. These results could be a result of the different methodologies used; Lemey et al. were comparing absolute rates of substitution while Carvajal-Rodríguez et al. were comparing relative rates. Moreover, the former study accounted for deleterious mutations, but not for recombination and the latter did the opposite. It remains unknown if both datasets would converge on similar conclusions if the same methodologies were used. It is worth mentioning that animal models can provide an optimal environment to account for subject variability, sample size, and HIV variation (Berges et al., 2010) , thus, providing a well-defined and structured opportunity to test the hypothesis above. Substitution rates are logically tied to the identification of natural selection and site-based molecular adaptation. Development and refinement of methods have opened a plethora of questions that biologists can address with these sophisticated methods (Mens et al., 2007; Pond and Muse, 2005) . Several studies have addressed questions regarding natural selection in HIV at the molecular level. 
It is generally accepted that nucleotide changes that do not change the amino acid composition (synonymous; dS) are more likely to be neutral than changes that affect amino acid composition (non-synonymous; dN) (Sharp, 1997). Therefore, the rate ratio between the two types of substitutions (dN/dS) could indicate whether purifying (dN/dS < 1), positive (dN/dS > 1) or neutral (dN/dS = 1) selection is present at the gene and/or codon level. Under this paradigm and using phylogenies to make sense of nucleotide changes, several studies have tried to identify molecular determinants of selection in HIV. For instance, these studies have been instrumental in demonstrating that the switch from X5 to X4 tropic populations is highly positively selective, and so is the switch from non-syncytium forming HIV strains to the ones able to form syncytia (Templeton et al., 2004). Using these approaches, others have attempted to study selection dynamics within and among hosts (even between populations) to ascertain the extent to which HIV variation is maintained and passed between individuals (Pond et al., 2006; Poon et al., 2007). At the inter-population level, several within-host adaptations seem to be transient while persistent substitutions are subject to stronger selective pressures, which ultimately fix these variants in different populations (Pond et al., 2006). More ambitious studies have analyzed large quantities of data in order to identify novel drug-resistant and high-fitness mutations that exhibit signatures of positive selection (Chen et al., 2004). Recently, a method well suited for analyzing adaptive rates (adaptations/codon/year) in large datasets has been published (Bhatt et al., 2011); some of its virtues are computational tractability and robustness to biases introduced by synonymous mutations and RNA secondary structures.
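The dN/dS decision rule described above can be illustrated with a minimal sketch. The function name and the tolerance band around 1 are assumptions introduced here for illustration; in practice, dN and dS are estimated under codon substitution models and the ratio is evaluated with formal statistical tests rather than a fixed cutoff.

```python
def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Label a gene or codon by its dN/dS ratio.

    dn, ds : non-synonymous and synonymous substitution rates.
    tol    : hypothetical tolerance around 1 treated as "neutral";
             it stands in for a proper significance test.
    """
    if ds <= 0:
        # The ratio is undefined when no synonymous change is observed.
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "positive" if omega > 1.0 else "purifying"
```

For example, `classify_selection(0.02, 0.10)` gives a ratio of 0.2 and is labelled purifying, whereas `classify_selection(0.30, 0.10)` gives a ratio of 3 and is labelled positive.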
When using serially sampled data, a previous time point can be used as a homologous outgroup to the more contemporaneous large ingroup dataset. In doing so, the algorithm determines which sites are ancestral or derived. Sites are classified as silent or replacement and/or high-, medium- and low-frequency polymorphisms. These values are then combined so that the output reflects the proportion of fixed sites that have undergone adaptive change. For additional details on detecting selection, see the following (McDonald and Kreitman, 1991; Pérez-Losada et al., 2007c; Smith and Eyre-Walker, 2002). Drug therapy was once envisioned as a potential cure for HIV infected patients (Perelson et al., 1997; Wain-Hobson, 1997). With the emergence of the first drugs in the mid-1980s, antivirals seemed to control viral infection readily (Mitsuya et al., 1985). However, it took just 4 years following the introduction of Zidovudine (AZT) before the first mono-resistant HIV-1 strains were found and officially reported (Larder and Kemp, 1989). HIV drug resistance has risen considerably in resource-rich countries, perhaps due to widespread treatment access, although resistance is also present in low- and middle-income countries. Mathematical predictions failed to account for certain biological characteristics that impact the population dynamics of HIV, including (i) extremely fast evolutionary rates and within-host population structure by specific cell types (Perelson et al., 1997); (ii) high within-host population sizes, reaching 10^7-10^8 productively infected cells in lymphoid tissue; (iii) high substitution rates due to an error-prone reverse transcriptase (RT) (see Mansky and Temin, 1995); and (iv) the once-neglected recombination process that seems to play a major role in HIV evolution and, consequently, in drug resistance (Carvajal-Rodríguez et al., 2006).
In fact, recombination is of prime importance in HIV, accounting for much of the observed diversity, and at times exceeding the mutation rate by 5.5 times (Shriner et al., 2004) . Furthermore, cells can harbor different proviruses, (Keele et al., 2008a) and multiple infections can occur simultaneously, (Jobes et al., 2006; Xin et al., 1995) . Several gene products have been targeted for drug treatment as these are involved in key stages of the viral replication cycle. Those include RT, protease, the envelope glycoprotein complex (gp120-gp41) and lately the virion infectivity factor (vif) (see Coffin, 1999) (Fig. 2) . Inhibitors include nucleoside and nucleotide analogs as well as non-nucleoside analogs that are able to impair some parts of reverse transcription (analog incorporation, analog removal), inhibitors of protease activity, and inhibitors of fusion to plasma membrane (Clavel and Hance, 2004) . There is an extensive database (http://hivdb.stanford.edu/) of mutations conferring drug resistance spread throughout the HIV genome, some of them conferring cross-resistance, i.e., resistance to drugs that the patient has never been exposed to and mutations that have compensatory effects on fitness lost by the primary mutations (Rhee et al., 2003) . Without doubt, antiviral treatments such as HAART have greatly improved the quality of life and life expectancy of those infected. However, it is far from being a cure as, in part, evidenced by the appearance of drug resistance mutations in treated and untreated patients (Lataillade et al., 2010) . Recombination in retroviruses was described a few decades ago with mechanistic details (Coffin, 1979; Goodrich and Duesberg, 1990; Temin, 1991) . However, through the mid 1990s, recombination in HIV-1 was regarded as almost non-existent mainly because it was thought that multiple infections within the same individual were rather unlikely. 
This led to the general belief that recombination could not contribute to HIV-1 evolution. Retroviral recombination was demonstrated experimentally in feline and murine species, for which recombinant viruses possessed altered tropism, host ranges or virulence (Golovkina et al., 1994; Tumas et al., 1993); but again evidence for recombination in HIV was rare. However, in 1995, recombination was detected in HIV using a phylogenetic incongruence method, and it was suggested to be underestimated (Robertson et al., 1995a). Initially, recombination was detected readily in Africa, probably due to the high genetic divergence of HIV-1 strains co-circulating on that continent. This allowed for the opportunity for co-infection with different subtypes, which made recombination detection easier. Soon more evidence of recombination in HIV-1 (Diaz et al., 1995; Robertson et al., 1995b; Zhu et al., 1995) and in HIV-2 was detected using phylogenies (Gao et al., 1994). Evidence of intra-subtype recombination within-host was also detected through phylogenetic and substitution rate analyses (Jobes et al., 2006). There are now a variety of methods available for detecting recombination in HIV sequences and estimating recombination rates, and it is apparent that recombination plays a significant role in HIV evolution. HIV recombination has some unique characteristics that resemble sexual reproduction in multicellular organisms (Temin, 1991). HIV is essentially diploid in that it possesses two full-length replication-capable genome copies within the protein capsid that can have different evolutionary histories and thus be viewed as a heterozygous virion. It is not diploid though in the sense that just one genome copy gets replicated and finally segregated when the virus infects another cell; instead just one allele is passed onto the progeny (Onafuwa-Nuga and Telesnitsky, 2009).
However, different HIV genomes within the same cell can recombine and yield offspring carrying genetic material from both ''parents.'' The costs of sexual reproduction have been reviewed extensively (Fox et al., 2001) , though it is widely accepted that sex can put together ''good mutations'' and it can purge bad mutations out of the gene pool. Moreover, when a super-infection occurs, i.e., an infection by a second strain in a patient already infected, the likelihood of recombination or recombination detection should increase. Thus, this process could accelerate the emergence of multi-drug resistance recombinant forms. However, this view has been challenged by research groups that have found negative correlations between the appearance of drug resistance mutations and super-infection under computational genetic models (Bretscher et al., 2004) . Regardless of these models, inter-subtype recombinants, i.e., circulating recombinant forms (CRFs), have been described since 1996 and now occur worldwide encompassing at least 49 variants ( Fig. 1 and http://www.hiv.lanl.gov/). These recombinant forms reflect successful recombination events that have been fixed in populations and may represent higher fitness forms; nevertheless this is debatable because of a lack of fitness measurements in vivo (Holmes, 2009) . Therefore, evidence of recombination has been now found at every virus level, i.e., inter-and intra-subtypes, among HIV groups and among primate lentiviruses. In fact, recombination is so pervasive and has such an impact on genetic structure, that it is questionable whether HIV occurs as discrete subtypes (Holmes, 2009) . The effect of recombination on drug resistance evolution seems to be dependent on the intensity of selection pressure. By using simulated data, Carvajal-Rodríguez et al. (2007)painted a more detailed picture of this process. They found that, under high selection pressures, recombination would favor the appearance of drug resistant HIV variants. 
This would be dependent on population size; the larger the population size, the more likely drug resistance recombinants will appear and become fixed in the population. We expect this prediction to be met in, for instance, patients under drug treatment and/or experiencing a strong adaptive immune response. Given that recombination is such an important factor in the evolution of HIV, it is important to test for recombination in DNA sequence data prior to phylogenetic analyses . Phylogenetic recombination detection methods are the most popular if the goal is to analyze considerable amounts of data, e.g., bootscanning algorithm (Lole et al., 1999) , although it has been shown that they do not perform as well as others, e.g., Runs test (Posada and Crandall, 2001) . On the other hand, experimental detection of recombination relies on laborious and time-consuming assays based on single-round replication cycles. Typically, these use pairs of vectors which reconstitute a selectable marker when recombination occurs, or they use single vectors when the goal is to assess intra-strand recombination (Onafuwa-Nuga and Telesnitsky, 2009). The information you can draw from experimental studies allows for detecting average recombination frequencies or hotspots. However, the estimation of recombination rates, breakpoints, and the identification of parental sequences at the population level is not likely achievable in this framework. On the other hand, statistical methods are well suited for HIV inter-, intra-, and host population studies. The literature contains numerous methodologies for detecting recombination breakpoints (reviewed in . Based on relevant evidence for recombination, they have been tentatively classified as distance methods, phylogenetic methods, compatibility methods and substitution distribution methods (Posada and Crandall, 2001) . By far, phylogenetic methods are the most commonly used (again, not always the best choice). 
Despite the plethora of methodological alternatives at hand, recombination detection is not an easy task. One reason for this is that it depends on several factors, including the amount of divergence among sequences, and where and how frequently the event is occurring (Lewis-Rogers et al., 2004) . In addition, modern recombination rate estimation involves the use of coalescent-based methods that account for evolutionary history and uncertainty in the estimates (Kuhner, 2006) . Although efforts have been made to implement more complex and realistic models, caution should be exercised when using them because of their sensitivity to assumption violations (e.g., deviations from neutrality and population stability, which are likely to occur in natural populations, thus frequently violated (Carvajal-Rodríguez et al., 2006; Kuhner, 2009) . In HIV research, recombination methods have been used to explore many different aspects of HIV biology. For instance, intrahost recombination has been studied in post mortem tissues exhibiting normal and abnormal histopathology in patients who received HAART. Tissues with abnormal histopathology show higher numbers of recombinant sequences. Likewise, these tissues display increased macrophage proliferation and it is well known that this cell type is involved in hiding HIV from HAART. Thus, macrophages may contribute to elusive recombinant forms evidenced by extensive recombination in non-lymphoid populations . Nora et al. (2007) provided additional evidence supporting the role of recombination in drug resistance evolution. Based on phylogenetic incongruence of samples taken before and after patient treatment change, they showed that resistant HIV strains, after the treatment change, likely originated through recombination of strains carrying previously existing resistance mutations in a novel combination. 
It is worth noting that the observed patterns were apparently not consistent with convergent evolution because potential donors could be identified due to the extensive sampling performed. Moreover, the pattern of substitutions observed in the multidrug resistant variants present after treatment change suggests that this variation arises by recombination and not likely by the accumulation of mutations. However, it would be interesting to see whether observed recombination patterns are congruent when applying more stringent statistical and phylogenetic methods of detection (Martin et al., 2005) . A viral reservoir refers to a specific cell type or anatomical compartment where (i) HIV is protected from antiviral drugs and the immune system, (ii) shows greater stability than virus in the active replicating pool, (iii) possesses greater genetic diversity than nonreservoir virus due to the presence of archival strains, and (iv) remains replication-competent. HAART effectively reduces viral loads to <50 genome-copies/mL, the detection limit of most approved assays. The importance of virus in reservoirs is underscored by the observation that once therapy is withdrawn, viral loads increase within a few weeks to the levels of drug-naïve individuals (e.g., Imamichi et al., 2001) . Taking into account the average halflife of long-term, latently infected cells, Siliciano et al. (2003) estimated that it would take around 60 years to deplete the main viral reservoir. This estimate is based on current antiretroviral therapy and does not consider drug resistance, drug toxicity and tolerance, or treatment costs. Thus, although novel methods for inducing proliferation of latently infected cells and subsequent elimination have been proposed (Marsden and Zack, 2009) , the problem of eradication still persists and is not likely to be eliminated under existing therapies. Phylogenetic methods are especially helpful in characterizing HIV reservoirs. 
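The order of magnitude of a depletion estimate such as the roughly 60 years quoted above can be reproduced with simple exponential-decay arithmetic. The sketch below is not the calculation of Siliciano et al. (2003): the half-life of 44 months and the starting pool of 10^5 latently infected cells are illustrative assumptions that do not appear in this review, and the function name is hypothetical.

```python
import math


def years_to_deplete(initial_cells: float,
                     half_life_months: float,
                     target_cells: float = 1.0) -> float:
    """Years for a pool decaying exponentially to shrink to target_cells.

    Assumes pure first-order decay with no replenishment of the pool,
    i.e. no ongoing replication or homeostatic proliferation.
    """
    if initial_cells <= 0 or target_cells <= 0 or half_life_months <= 0:
        raise ValueError("all arguments must be positive")
    # Number of halvings needed to go from initial_cells to target_cells.
    halvings = math.log2(initial_cells / target_cells)
    return halvings * half_life_months / 12.0


# Illustrative values only (assumptions, not taken from the text).
ASSUMED_POOL_SIZE = 1e5
ASSUMED_HALF_LIFE_MONTHS = 44.0
estimate_years = years_to_deplete(ASSUMED_POOL_SIZE, ASSUMED_HALF_LIFE_MONTHS)
```

With these assumed inputs the estimate comes out at about 61 years. Because the result grows only with the logarithm of the pool size, even a tenfold larger pool would add little more than a decade, which is why estimates of this kind are dominated by the half-life.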
By sampling a suspected reservoir over time and inferring evolutionary relationships, questions such as whether HIV replicates in a particular compartment can be answered by examining the branch lengths (i.e., genetic changes over time) of specific phylogenies. Depending on the amount of change and data richness present in the collected dataset, more specific inferences can be drawn, such as divergence times and changes in population size over time. A reservoir is then expected to have greater genetic diversity than other compartments (e.g., blood). Genotype networks are particularly suited to this aim since ancestral genotypes are located ''center-wise'' in the network from which ''founder viruses'' branch off (Crandall and Templeton, 1993), migrating into other compartments (Fig. 3). This is also true if the aim is to test whether archival strains are present in the bloodstream. To a great extent, phylogenies are also instrumental in estimating parameters that describe diversity in within-host HIV populations. Modern estimates of genetic diversity, expressed as substitution rate-scaled effective population size (θ = 4N_eμ), rely on coalescent simulations for which genealogical reconstructions and phylogenetic models are essential. Several cell types have been identified that play a role in hosting HIV at different time points during the course of drug therapy. The main cell type that plays host to HIV after initiation of HAART is activated CD4+ T cells (Chun et al., 1997; Delobel et al., 2005), which rapidly die within 2-3 days. Then, dendritic cells (DC), partially activated CD4+ T cells, and macrophages are thought to contribute to persistence due to their susceptibility to HIV infection, lower vulnerability to cytopathic effects, and half-life of up to several weeks (Dahl et al., 2010). However, long-lived, memory CD4+ T cells bearing latent integrated provirus contribute the most to the problem of persistence.
Populations of this cell type are maintained by the intrinsic long-term survival and homeostatic proliferation of infected cells (Chomont et al., 2009). Whether HAART impairs viral replication, and thus HIV evolution, within CD4+ cell reservoirs is not entirely clear, yet replication in reservoirs is thought to be low (Hermankova et al., 2001; Kieffer et al., 2004; Parera et al., 2004; Persaud et al., 2007; Tobin et al., 2005). For instance, some studies have examined different CD4+ memory cells and found almost no drug resistance mutations in this reservoir, and short genetic distances in the inferred phylogenetic trees, after 8.3 years of uninterrupted HAART (Nottet et al., 2009). Although the authors chose a less reliable method of phylogenetic analysis and substitution model (Sullivan and Joyce, 2005; Susko et al., 2004), their inferences seem robust since other studies have reached similar conclusions (Mens et al., 2007). Nottet et al. (2009) hypothesized that viral replication in reservoirs would be indicated by the appearance of drug resistance mutations, a sign of evolution within the reservoir. They concluded that, given the lack of drug resistance mutations and the short branch lengths observed for HIV stored in reservoirs, replication and evolution had been halted in the blood compartment. Bailey et al. (2006) also examined CD4+ T cells with a thorough sampling strategy and found limited evolution in the CD4+ reservoir. This conclusion was reached even though a greater diversity of pol genes was found in the reservoir than in plasma samples and some sequences isolated from plasma were also found in the reservoir. Of course, these results do not preclude the possibility that other reservoirs are the source of residual viremia.
Although a lack of evolution within an organism or entity is conceptually difficult to imagine, particularly since evolution can be defined in its simplest form as genetic change over time, irrespective of the observed magnitude, it seems that HIV hiding in reservoirs evolves at a slow rate and/or that the viruses released from the reservoir are subjected to strong purifying selection in the plasma.

Diversity between reservoirs and peripheral blood cells, or within specific tissues, has been studied in other reservoirs as well. For example, follicular dendritic cells (FDCs) also act as a viral reservoir, although less is known about this reservoir than about latent CD4+ cells and macrophages. In the mouse, FDC-trapped HIV has a half-life of about 2 months and remains replication-competent for at least 9 months (Burton et al., 2002). In contrast to the other common reservoirs of HIV, the FDC is not infected but contains only trapped extracellular HIV. Because the studies cited above were performed in mice, it was unclear how FDC-trapped virus could contribute to HIV persistence (Smith et al., 2001). However, using experimental and phylogenetic approaches with human tissues and cells, Keele et al. (2008b) confirmed that HIV trapped on FDCs was replication-competent. More importantly, HIV on FDCs had greater genetic diversity than viruses in the other tissues and cells examined, including CD4+ T cells. In addition, drug resistance variants were found among the FDC-trapped viruses that were not identified at other sites (Fig. 3). Moreover, with an elegant network approach, they showed the existence of archival viral variants from various time points of infection that were trapped on FDCs. Altogether, these findings indicate that FDCs can act as a reservoir and hold viruses for years (Fig. 3).

Phylogenetic analysis has been used to explore the contribution of HIV to tumorigenesis in reservoir cell types. Recently, Salemi et al.
(2009) explored the dynamics of HIV-infected macrophages using p24 staining and prediction of co-receptor usage in tumor and non-tumor postmortem tissues from patients who died of AIDS-related lymphoma (ARL). They observed a high degree of compartmentalization between HIV from macrophages found in tumor and non-tumor tissues and an intermixing of HIV strains obtained from auxiliary lymph nodes. Viral effective population size was 100-fold greater in tumor tissues than in non-tumor tissues and, strikingly, the onset of lymphoma correlated with viral expansion. Moreover, evidence of gene flow to and from lymph nodes and tumor tissues indicates that lymph nodes might facilitate the movement of metastatic cells to different parts of the body.

Poor penetration of HAART, or properties such as immune privilege, can turn anatomical compartments into "sanctuary sites", places where HIV keeps replicating. Some suggested sanctuaries include the central nervous system (CNS), gut-associated lymphoid tissue (GALT) and the genitourinary tract (Dahl et al., 2010). Compartmentalization of HIV sequences from different tissue types has typically been inferred from monophyletic assemblages of those sequences in phylogenetic trees (Wang et al., 2001; Wong et al., 1997). Further sub-compartmentalization has been found within GALT throughout the gastrointestinal tract (van Marle et al., 2007). When coupled with differential HIV gene expression, this can indicate that GALT has the capacity to host different replicating HIV strains. Hence, GALT should also be considered when screening for drug-resistant variants.
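The monophyly criterion for compartmentalization mentioned above can be sketched in a few lines. The example assumes a rooted tree written as nested tuples whose tips are strings, and tip names that carry a tissue prefix; the tree, the labels and the helper names are all hypothetical and serve only to illustrate the test.

```python
def tips(node):
    """Return the set of tip labels below a node (a tip is a string)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result

def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its tips."""
    group = set(group)
    if not group:
        raise ValueError("group must not be empty")
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

# Toy rooted tree; tip names are hypothetical (tissue prefix + sequence id).
tree = ((("gut_1", "gut_2"), "gut_3"), (("blood_1", "brain_1"), "blood_2"))
gut = [t for t in tips(tree) if t.startswith("gut_")]
blood = [t for t in tips(tree) if t.startswith("blood_")]
print(is_monophyletic(tree, gut))    # True: gut sequences form one clade
print(is_monophyletic(tree, blood))  # False: a brain sequence nests among them
```

A check of this kind says nothing about statistical support. On real data the result depends on the rooting, the inference method and the uncertainty in the tree, so it would normally be evaluated across a set of bootstrap or posterior trees rather than on a single topology.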
While these conclusions are insightful for HIV genetic structure in GALT, overlooking nucleotide substitution model selection, not using an optimality criterion for tree inference that accounts for phylogenetic uncertainty, and lacking powerful statistical methods such as the coalescent that account for historical patterns of divergence (Wakeley, 2004) (all issues with these studies) can bias conclusions.

Phylogenetic analyses can bring to light dimensions of HIV evolution, such as "where and when" and even "how" infections are spreading across the globe, that are impossible to assess with other approaches (Grenfell et al., 2004; Holmes, 2009; Holmes and Drummond, 2007; Moya et al., 2004; Welch et al., 2005). In the following sections, we discuss these dimensions and how phylogenetics has led to our current understanding of HIV origin and geographic spread.

The origins of HIV have been controversial since the beginning of the epidemic. HIV-1 and HIV-2, both species belonging to the genus Lentivirus (Retroviridae), are distinguished on the basis of their genome organization and phylogenetic relationships, clinical characteristics, virulence, infectivity and geographic distribution. At the beginning of the HIV-1 epidemic, serological evidence pointed to African green monkeys (Chlorocebus spp.; agm) as carriers of an HIV-1-like virus, simian T-cell lymphotropic virus 3 [STLV-3, now Simian Immunodeficiency Virus (SIV) agm; Kanki et al. (1987)]. Serum from HIV-1 infected patients cross-reacted with STLV-3 proteins (Hirsch et al., 1986; Kanki et al., 1985a, 1985b) and STLV-3-infected African green monkeys had a geographic distribution that overlapped with the HIV-1 epidemic, which led researchers to believe that HIV-1 jumped to humans from African green monkeys (Kanki et al., 1987).
In contrast to the serological evidence, phylogenetics argued for a chimpanzee origin; however, scientists treated the results with caution, concluding that the evidence was not enough to prove a chimpanzee cross-species infection (Huet et al., 1990). The justifications stated for discounting the phylogenetic evidence included the observations that vpu gene divergence between HIV-1 and SIVcpz was high, few lentiviruses had been isolated from simian hosts, and SIV had low prevalence in chimpanzees (Huet et al., 1990). Evidence of a simian origin is now clear, as similar lentiviruses have been found in more than 40 species of African primates (Bibollet-Ruche et al., 2004; Hahn et al., 2000; Van Heuverswyn and Peeters, 2007) and a geographical correlation exists between SIVs hosted in different primate species and HIV (Peeters et al., 2008). Phylogenetic evidence has shed light on this subject, showing that HIV-1 and HIV-2 are the product of several cross-species transmissions to humans of chimpanzee (Pan troglodytes troglodytes) SIV (SIVcpz) and sooty mangabey (Cercocebus atys) SIV (SIVsm), respectively (Gao et al., 1992, 1999; Hahn et al., 2000; Huet et al., 1990; Plantier et al., 2009; Van Heuverswyn and Peeters, 2007) (Fig. 4). Moreover, SIVcpz and SIVsm geographic ranges correlate well with the African regions where HIV-1 and HIV-2 show great endemicity; e.g., sooty mangabeys are most abundant in the regions of West Africa where HIV-2 is highly prevalent and diverse, so HIV-2 likely emerged there (Chen et al., 1996, 1997; Santiago et al., 2005). The simian origin of HIV-2 was rapidly established, since the only species naturally infected with a closely related virus is C. atys (Chen et al., 1996). Furthermore, phylogenetic analyses have demonstrated that SIVsm cross-species transmission has occurred on multiple occasions (Chen et al., 1997; Gao et al., 1992, 1994; Lemey et al., 2003).
Indications of natural chimpanzee infections have been increasingly corroborated by phylogenetic analyses. Support for "the chimpanzee hypothesis" began to accumulate when SIV was found in natural populations in West Africa, initially at extremely low prevalence (Santiago et al., 2002). Subsequently, scientists sequenced the entire genome of SIVcpz from a fecal sample of a wild chimpanzee in Tanzania (Santiago et al., 2003) and confirmed previous phylogenetic inferences based on the gag, pol and env genes. Moreover, epidemiological evidence has shown that SIVcpz can reach prevalence rates of up to 29-35% in some African communities (Keele et al., 2006). Recently, also in Tanzania, Keele et al. (2009) followed populations of chimpanzees over 9 years and found typical AIDS-like features, e.g., an increased death hazard for animals carrying SIVcpz, CD4+ T-cell depletion with high viral replication, and histopathological findings consistent with end-stage AIDS. Altogether, epidemiological, physiological and clinical evidence supports the early phylogenetic predictions of chimpanzee cross-species transmission to humans as the origin of HIV-1 and of chimpanzees as a natural reservoir for the virus.

The timescale of the evolution of HIV-1 and HIV-2 has been estimated, which has allowed us to understand the circumstances surrounding the emergence of HIV and to test hypotheses regarding natural or artificial means of cross-species transmission. Initially, several hypotheses of "when" HIV came into human populations could not be tested in a reductionist framework. Among these, the so-called Oral Polio Vaccine (OPV) hypothesis (Hooper, 2003) stated that HIV was introduced into human populations through the use of inadvertently infected monkeys (advocates claimed chimpanzees) as a means of polio vaccine production. In fact, around 1960 African green monkeys were used to produce an attenuated polio vaccine (Plotkin, 2001).
Thus, if the molecular timing of the HIV "jump" into humans matched this date, the result would be consistent with this hypothesis. On the other hand, if the timing of the HIV "jump" significantly predated 1960, then the OPV hypothesis would seem less likely. As increasing amounts of data and more powerful computational and statistical approaches became available, the time to the most recent common ancestor (TMRCA) of HIV-1, whether it was in a human or a chimpanzee host, was estimated with increasing confidence (Korber et al., 2000; Salemi et al., 2001; Sharp et al., 2000). These studies used different approaches (strict and relaxed molecular clock analyses), yet obtained similar estimates for the HIV-1 group M radiation. Applications of these methods have led scientists to estimate that the M group originated near 1930, with a range, depending on the study, of 10-20 years on either side. It is worth noting that, although most estimates are consistent, they can be biased by recombination and by the scarcity of historical samples. Recombination, apart from violating the single-ancestry assumption, may increase apparent variation in rates among nucleotide sites and also tends to decrease genetic distances between sequences (Posada, 2001). The scarcity of historical samples, in turn, makes the calibration of such methods difficult, resulting in estimates with wide confidence intervals. The only archival samples available, DRC60 and ZR59, suggest extensive genetic diversity of HIV-1 in West Africa by 1960. In turn, divergence time estimates date to the 1920s, depending upon the coalescent tree model chosen (Worobey et al., 2008). Since divergence time estimates represent the TMRCA of just the isolates included in the analysis, it is likely that new, diverse archival sequences will yield even older divergence time estimates for the HIV-1 group M radiation.
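The logic of dating a common ancestor from time-stamped sequences can be illustrated with a root-to-tip regression, a much simpler stand-in for the strict and relaxed clock analyses cited above and not the method those studies used. Under a strict clock, genetic distance from the root grows linearly with sampling date, so the slope estimates the substitution rate and the x-intercept estimates the TMRCA. The data points and the function name below are invented for illustration.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.
    Returns (rate, tmrca): the slope in substitutions/site/year and the
    x-intercept, i.e. the date at which the fitted distance is zero."""
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    if sxx == 0:
        raise ValueError("sampling dates must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("no positive temporal signal in the data")
    intercept = mean_y - rate * mean_x
    return rate, -intercept / rate

# Invented, perfectly clock-like data generated with
# rate = 0.002 substitutions/site/year and a root dated to 1930.
dates = [1960.0, 1985.0, 1995.0, 2005.0]
distances = [0.002 * (d - 1930.0) for d in dates]
rate, tmrca = root_to_tip_regression(dates, distances)
print(round(rate, 4), round(tmrca, 1))
```

With real sequences the points scatter around the line, and the two biases discussed above show up directly: recombination distorts the root-to-tip distances, and a narrow span of sampling dates makes the extrapolated intercept very uncertain.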
Hence, the OPV hypothesis can be ruled out because the molecular data suggest that HIV-1 group M isolates originated roughly 30 years prior to the use of primates in OPV preparation (Korber et al., 2000; Worobey et al., 2008). In addition, it has not been possible to detect chimpanzee DNA in archival stocks of OPV (Berry et al., 2001; Blancou et al., 2001). Most likely, cross-species transmission can be explained by invoking socio-cultural factors during the postcolonial period in Africa (Chitnis et al., 2000). By similar means, the origin of HIV-2 has been dated to 1940 ± 16 for subtype A and to 1945 ± 14 for subtype B. Moreover, phylogenetic inference has dated the introduction of HIV-1 clade B into North America to 1968 (1966-1970) (Gilbert et al., 2007; Pérez-Losada et al., 2010), which is consistent with the earliest known retrospective studies (Robbins et al., 2003). It is worth mentioning that molecular clock estimates for deep (old) viral divergences can be biased towards the present if no external calibrations are used. This could arise from extreme saturation and from constraints that are hard to account for with substitution models, as has been shown using an island biogeography approach to SIV dating (Worobey et al., 2010).

The origin of HIV is far from being a resolved issue; rather, phylogenetics is providing more insights as new isolates are analyzed, especially isolates from other non-human primates. The study of HIV relatives shows that similar processes shape their natural history. For instance, SIVcpz itself is the product of recombination between SIVrcm (red-capped mangabeys, Cercocebus torquatus) and SIVgsn (greater spot-nosed monkeys, Cercopithecus nictitans). This was revealed by strong discordance between tree topologies, which suggested a hybrid origin for SIVcpz.
Likewise, western gorilla SIV (SIVgor, from Gorilla gorilla gorilla) also seems to be the product of cross-species transmission (Fig. 4). Despite the small number of samples available, phylogenetic analyses suggest that SIVgor is closely related to SIVcpz from Pan troglodytes troglodytes and is also the sister taxon of HIV-1 group O (Takehisa et al., 2009). However, because SIVgor is still poorly sampled, it is yet to be determined whether chimpanzees infected both gorillas and humans, whether humans were infected first and then infected gorillas, or whether gorillas infected humans (Fig. 4). Interestingly, a new HIV-1 group, P, has been proposed based on one isolate found in a Cameroonian woman. The phylogenetic placement of group P as the sister taxon to all SIVgor, but distinct from HIV-1 group O, could play a key role in testing hypotheses of human-gorilla transmission (Fig. 4) (Plantier et al., 2009). Phylogenetic analysis can be very informative, but the accuracy of phylogenetic conclusions is highly dependent on the method chosen and the sampling strategy. As more lentivirus sequences from different locations and more archival sequences become available, the issue of the origin of HIV should converge on more reliable conclusions.

Before we explore different aspects of HIV dynamics and their applications, we should add a cautionary note on the choice of genetic marker to capture transmission and other desired signals. Extensive debate exists concerning the choice of gene(s) in HIV phylogenetics (Hué et al., 2005a). The env gene is sometimes preferred on the basis of its high genetic variability; however, indications of convergent evolution in this region would preclude its use, since convergence violates the assumption of a unique evolutionary history made by phylogenetic methods. The pol gene has been suggested as a candidate as well; however, some researchers have been reluctant to use it given the number of drug resistance mutations associated with this region. Lemey et al.
(2005b) showed that phylogenetic trees based on pol sequences, after excluding codons associated with resistance, were congruent with independent data on epidemiology and with trees based on env sequences. Similarly, criminal cases of HIV transmission that rely solely on phylogenetic evidence are precarious. Besides the inherent issues of model selection and phylogenetic inference, data availability also plays a major role. Some of the concerns relate to the direction of transmission (who infected whom), the availability of samples from all sexual contacts involved, and the interpretation of the phylogeny given that certain individuals could be infected with more than one strain. Finally, convergent evolution can erroneously link individuals in the absence of any other independent source of evidence (Pillay et al., 2007).

The term phylodynamics was coined in reference to "the melding of immunodynamics, epidemiology, and evolutionary biology [. . .]", applied in particular to pathogens such as viruses and bacteria across the whole breadth from within-host variation and immunity through transmission events, bottlenecks and global epidemiological dynamics (Grenfell et al., 2004). This is one of the most exciting and insightful active fields in which phylogenetics is contributing to our understanding of virus evolution, and of HIV in particular. Viral population dynamics can be explored using phylogenetic, coalescent and other statistical methods to make historical inferences about temporal and spatial distributions. This is possible, basically, by taking advantage of certain attributes, such as high mutation rates, large population sizes and short generation times, and of the realization that genetic changes occur so fast that ecological and epidemiological processes leave marks on viral genomes.
The basic idea behind this new approach is that phylogenies are modulated by immune selection, viral population sizes and spatial dynamics and thus, together with experimental data, it should be possible to tease apart individual contributions and identify the forces dominating pathogen evolution and behavior. Although mainly focused on RNA viruses (Amore et al., 2010; Bennett et al., 2010; Holmes and Grenfell, 2009; Kerr et al., 2009; Mondini et al., 2010; Pérez-Losada et al., 2010; Rambaut et al., 2008; Siebenga et al., 2010), phylodynamic approaches have also been applied to DNA viruses (Zehender et al., 2010a) and bacteria (Conlan et al., 2007; Pérez-Losada et al., 2005, 2007a, 2007b; Tazi et al., 2010).

Most of the methods available (see Kuhner, 2009; Pérez-Losada et al., 2007c) take advantage of the coalescent theory developed by Kingman, based on previous work by Wright and others (Kingman, 2000). Although the idea of a coalescent was used several times in population genetics [reviewed in Tavaré (1984)], its development is credited to Kingman (1982) and, independently, to Hudson (1983) and Tajima (1983) [see Wakeley (2008) for a thorough discussion of coalescent theory]. The basic idea behind coalescent theory, as opposed to summary statistics or classic population genetics, is to explain the present state of a population by looking back into its past. It is a realization of the Wright-Fisher neutral model of evolution, recording the genealogical relationships among a random sample of population genetic data. The model has been further generalized to account for varying population size, different time scales, structure, recombination and selection (Nordborg, 2004).
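The backward-in-time view of the coalescent can be made concrete with a small simulation. The sketch below assumes the basic Kingman coalescent for a haploid Wright-Fisher population of constant size N, with time measured in units of N generations; it does not include any of the generalizations listed above (varying size, structure, recombination or selection), and the function names are hypothetical.

```python
import random

def simulate_tmrca(n, rng):
    """Simulate one Kingman coalescent for n sampled lineages and return
    the time to the most recent common ancestor, in units of N generations
    (haploid Wright-Fisher population of constant size N)."""
    if n < 2:
        raise ValueError("need at least two lineages")
    t = 0.0
    k = n
    while k > 1:
        pair_rate = k * (k - 1) / 2.0    # rate at which any pair coalesces
        t += rng.expovariate(pair_rate)  # waiting time while k lineages remain
        k -= 1                           # one coalescence merges two lineages
    return t

def expected_tmrca(n):
    """Theoretical expectation E[TMRCA] = 2 * (1 - 1/n) in the same units."""
    return 2.0 * (1.0 - 1.0 / n)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 10
replicates = 20000
mean_sim = sum(simulate_tmrca(n, rng) for _ in range(replicates)) / replicates
print(f"simulated mean: {mean_sim:.3f}, expected: {expected_tmrca(n):.3f}")
```

The simulated mean should sit close to the theoretical value of 1.8 for ten lineages. Most of that time is spent waiting for the last two lineages to merge, which is one reason why estimates of deep divergence times carry wide uncertainty.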
Within a coalescent framework, statistical advances in Bayesian inference regarding the use of time-stamped data (Drummond et al., 2002), models of population dynamics and relaxation of molecular clock assumptions (Drummond et al., 2005; Drummond and Rambaut, 2007) have greatly helped us better understand HIV patterns and processes on a temporal scale. Recent advances in sequence dating provide tools to estimate unknown sequence ages, as these can be jointly or individually estimated under a fully probabilistic framework (Shapiro et al., 2011). In addition, this temporal framework can be enhanced by explicitly modeling spatial dispersion rates in a phylogeographic context. Using phylogenetic diffusion models, one can infer ancestral locations for the sampled sequences in a discrete or continuous setting. Moreover, the most parsimonious explanation for the diffusion process is obtained under this Bayesian framework through Bayesian Stochastic Search Variable Selection (BSSVS). This methodology has several advantages over previous maximum likelihood and maximum parsimony methods, such as fitting a diffusion model simultaneously with a substitution model, incorporating branch lengths in ancestral state reconstruction, and accommodating uncertainty in both the phylogeny and the diffusion process. Applications of this method are rapidly increasing, as recent studies of dengue (Raghwani et al., 2011), influenza (Nelson et al., 2011), Staphylococcus aureus (Gray et al., 2011), and, of course, HIV-1 (Esbjörnsson et al., 2011; Skar et al., 2011) indicate. More recently, new methods have been adapted from systematic studies to estimate the basic reproductive number ($R_0$) (Stadler, 2010; Stadler et al., 2011).
The $R_0$ parameter has traditionally been used in epidemiology to determine whether or not an infectious agent can spread in a population, i.e., if $R_0 > 1$ the infectious agent will spread in the population and if $R_0 < 1$ the infection will die out. Given the amount of viral sequence data and the ease of data acquisition, estimating $R_0$ from genetic data can become a novel tool for molecular epidemiologists. The model uses a birth-death process (as in species phylogenetics) instead of a coalescent model, in which a birth event is equivalent to a new infection and a death event to various phenomena such as death, treatment, eradication, etc. In order to exploit most of these methods, sampling strategy is paramount, and arbitrary sequence collection from GenBank or the Los Alamos database is probably not adequate (Stack et al., 2010). Next-generation sequencing approaches (e.g., Bybee et al., 2011) allow for more comprehensive sampling at efficient costs in future studies of HIV diversity, in place of haphazard sampling of sequences available from other studies in public databases. Clearly, for greatest utility in studying HIV, sequence data submitted to public databases should include geographic, clinical, and, especially, temporal information (see http://datadryad.org for storage options for such data).

Transmission dynamics have been studied thoroughly, in particular regarding transmission network reconstruction and the loss of diversity at the transmission event. Well-known examples of HIV transmission are cases involving legal issues, such as the Florida dentist case (Ou et al., 1992) and the Louisiana attempted murder trial (Metzker et al., 2002). In both of these cases, phylogenetic evidence was concordant with the hypothesis of transmission from the defendant to the victims. In fact, the Louisiana case was the first in the US in which phylogenetic analyses were used in a criminal court case.
In this case, different substitution models, genes (env and pol), and optimality criteria were used to link the defendant's and the victim's HIV-1 variants. Additionally, drug resistance genetic signatures were also used as indications of transmission events. Other examples of transmission reconstruction include a Swedish rape case (Albert et al., 1994) and a healthcare-related case in Baltimore (Holmes et al., 1993). It is also interesting to highlight a recent report of highly divergent HIV variants transmitted by a donor to two other individuals on the same evening (English et al., 2011).

Transmission events themselves have also been studied, because the reduction in genetic diversity at transmission offers an opportunity for treatment (Edwards et al., 2006; Fischer et al., 2010). Most of the work has focused on monitoring discordant couples (i.e., couples in which one partner is HIV positive and the other is not) and on testing whether genetic diversity is reduced to a single virus at the transmission event. Edwards et al. (2006) showed, using phylodynamic methods, that the reduction in genetic diversity (<1%) is no different between horizontal (homo- or heterosexual) and vertical (mother-to-child) transmission. Understanding genetic diversity at transmission events has therapeutic implications, as less diverse populations of small size are strongly influenced by genetic drift, decreasing the chance of transmission of high-fitness variants. Indications that single virus variants were transmitted horizontally came first from studies using Sanger sequencing and single-genome amplification coupled with phylogenetic and mathematical modeling (Keele et al., 2008a; Salazar-González et al., 2008), and were further supported by the enhanced capabilities of ultra-deep sequencing (UDS). UDS revealed that early HIV variants explored extensive sequence space within epitope regions.
Interestingly, as the infection proceeded, reversion to the canonical subtype sequence occurred in positions under immune pressure, but not in positions that were not under pressure, even in the earliest samples, suggesting that immune pressure is present earlier than previously known (Fischer et al., 2010). Thus, phylogenetic analyses can be very insightful in practical situations such as court trials, and also in situations of medical importance such as characterizing genetic diversity for potential therapy development.

While the population dynamics of HIV have been well characterized (Coffin, 1995), phylogenetic studies have added greatly to our understanding, especially of the dynamic nature of genetic diversity over the course of infection within a host individual and across transmission events. Much of the phylodynamic research has focused on associations between clinical/epidemiological aspects and genetic diversity, such as transmission and spread dynamics among men who have sex with men [MSM; Lewis et al. (2008)], which revealed episodic clusters of transmission, and among heterosexual patients, in whom transmission dynamics appeared to be slower than in MSM. Note that these studies used large sample sizes to draw their conclusions, a desirable feature for capturing the phylodynamic signature in the data. Phylodynamic studies have also focused on the temporal dynamics of transmission and its frequency. In Italy, a recent study reached conclusions similar to those of Hué et al. (2005b) in the UK, in that currently circulating subtype B HIV was introduced multiple times into MSM populations. However, when inter-node intervals between transmission clades were compared between Lewis et al. (2008) and Zehender et al. (2010b), the time between transmission events differed, with medians of 14 months and 30 months for the UK and Italian studies, respectively.
These results could reflect actual differences in transmission dynamics or could be an artifact of the small sample size and the restricted area sampled. Other studies have explored dissemination patterns and the possible transmission of particular HIV strains between risk groups. Recently, Liao et al. (2009) looked at spatial and temporal patterns to explain the distribution of the CRF01_AE variant in Vietnam. Their results suggested that CRF01_AE came from Thailand and that, within the Vietnamese population, it has been transmitted from heterosexual patients to intravenous drug users (IDUs). Similar work was done with other subtypes and recombinant forms in Asia (Tee et al., 2008), South America (Bello et al., 2010), Africa (Lemey et al., 2003) and North America (Gilbert et al., 2007), among others.

Patterns of diversity among populations, and how they compare to epidemiological data, have also been studied. Pérez-Losada et al. (2010) studied HIV-1 envelope gene sequence variation in cohorts of vaccinated and placebo-treated patients in North America. Their phylodynamic analysis showed that genetic diversity remained nearly constant from approximately the 1970s to date, suggesting that viral populations had already expanded around ten years before HIV was detected in the US (Fig. 5). In addition, despite a drop in the number of cases since the 1990s, genetic diversity has remained high over time. Previously, Robbins et al. (2003) showed similar results in a different cohort of HIV-1 positive US subjects. Although they used a smaller dataset and parametric and non-parametric methods that did not account for phylogenetic uncertainty, they reached similar demographic conclusions. In a similar study, phylodynamic analyses revealed that CRF01_AE was cryptically circulating in the Thai HIV-1 virus population for 3-10 years before it was detected in 1989 (Fig. 5).
In both the North American and Thai studies, historical estimates of genetic diversity correlate well with known epidemiological data.

Fig. 5. HIV-1 past population dynamics in North America (green) and Thailand (gray). Plots were built using the env gene and the Bayesian Skyline Plot model. The analyses primarily revealed that genetic diversity (thick lines) has remained high through time despite the number of AIDS cases (thin lines) dropping considerably and that HIV was circulating years before the first AIDS cases were detected. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Within-host evolution in HIV has proven to be important in understanding clinical processes associated with disease progression. Within a host, the genetic diversity of HIV plasma isolates is limited at any given time point but increases over the course of infection, similar to what is observed in the population phylogenies of influenza virus (Grenfell et al., 2004; Rambaut et al., 2008; Shankarappa et al., 1999). HIV evolution could also differ among specific tissues. For example, phylodynamic analyses of post mortem brain tissues have revealed that HIV evolves at different rates in different brain compartments. However, this is apparently not related to selective pressure, but rather to inherent drift associated with macrophage-tropic viral expansion after immune failure (Salemi et al., 2005). Within-host variation, commonly misnamed quasispecies (Holmes, 2009), has been addressed under a phylodynamic framework for co-receptor usage. For example, Salemi et al. (2007) explored co-receptor usage dynamics in tissue and peripheral blood mononuclear cells (PBMC) and found a temporal structure between CCR5-tropic (R5) virus and the appearance of CXCR4 (X4) variants, i.e., the majority of X4 virus found in thymus tissue seemed to come from PBMC viruses.
Conversely, substitution rates of R5 and X4 sequences were not significantly different, supporting the idea that X4 amplification could be due to the availability of target cells. As recognized by the authors, phylodynamic studies of HIV subpopulations can be greatly enlightening, but they have some limitations, as human tissue samples are not easy to obtain and could involve ethical issues. As new animal models (e.g., new humanized mice) for HIV infection become available (Denton and Garcia, 2009), viral subpopulations can be studied over the course of 1-2 years of infection (Berges et al., 2010). Nonetheless, despite the potential of these models, little has been done in experimental HIV evolution, owing to limited interaction between evolutionary and molecular biologists.

Although the field of phylodynamics is young, its statistical tools are key to linking epidemiological and evolutionary information (Tibayrenc, 2005). Surveillance programs would greatly benefit from the implementation of these approaches, as it would be possible to study the impact of vaccines and chemotherapeutic treatments on population genetic diversity. Likewise, such studies would allow for the identification of novel risk groups and indicate changes (or not) in population dynamics as a result of intervention strategies. At the same time, the creation of specialized databases to collect phylodynamically informative data (randomly sampled HIV sequences across a broad target area of HIV infection to monitor HIV diversity and associated changes) would greatly aid the implementation of these approaches.

Phylogenetics has a new vigor. The development of new robust statistical frameworks such as Bayesian inference (Huelsenbeck et al., 2001) has allowed the testing of complex hypotheses while accounting for the inherent uncertainties of historical and unrepeatable processes.
Currently, the implementation of these methods in biologist-accessible software has empowered scientists to test biologically meaningful scenarios. To a certain extent, the great phylogenetic questions in HIV evolution have been addressed, e.g., HIV origin, evolutionary driving forces, and within- and among-host variation. However, phylogenetics opens the door to new questions and new insights in HIV. For example, are recombination/substitution rates impacted by antiviral treatments? Do transmission routes influence the outcome of infection? How does compartmentalization of HIV strains evolve within an individual? These are all questions motivated by phylogenetic approaches. Similarly, large collaborative efforts such as the UK HIV Drug Resistance Collaboration provide opportunities for addressing key questions in phylodynamics and drug resistance through the study of longitudinal and/or retrospective data from large cohorts. In summary, phylogenetics is an ever-evolving field that promises to give more insights into pathogen evolution, particularly in pathogens such as HIV that form measurably evolving populations (Drummond et al., 2003).

Analysis of a rape case by direct sequencing of the human immunodeficiency virus type 1 pol and gag genes
Multi-year evolutionary dynamics of West Nile virus in suburban Chicago
Hybrid origin of SIV in chimpanzees
Residual human immunodeficiency virus type 1 viremia in some patients on antiretroviral therapy is dominated by a small number of invariant clones rarely found in circulating CD4(+) T cells
Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immune deficiency syndrome (AIDS)
Simian immunodeficiency virus (SIV) from sun-tailed monkeys (Cercopithecus solatus): evidence for host-dependent evolution of SIV within the C. lhoesti superspecies
Phylodynamics of HIV-1 circulating recombinant forms 12_BF and 38_BF in Argentina and Uruguay
Epidemic dynamics revealed in dengue evolution
Humanized Rag2-/-[gamma]c-/- (RAG-hu) mice can sustain long-term chronic HIV-1 infection lasting more than a year
HIV forensics: pitfalls and acceptable standards in the use of phylogenetic analysis as evidence in criminal investigations of HIV transmission
Vaccine safety: analysis of oral polio vaccine CHAT stocks
The genomic rate of molecular adaptation of the human influenza A virus
New simian immunodeficiency virus infecting De Brazza's monkeys (Cercopithecus neglectus): evidence for a Cercopithecus monkey virus clade
Polio vaccine samples not linked to AIDS
Recombination in HIV and the evolution of drug resistance: for better or for worse?
Analysis of HIV-1 env gene sequences reveals evidence for a low effective number in the viral population
A phylogenetic and Markov model approach for the reconstruction of mutational pathways of drug resistance
Evolution of human influenza A viruses over 50 years: rapid, uniform rate of change in NS gene
Follicular dendritic cell contributions to HIV pathogenesis
Predicting the evolution of human influenza A
Targeted amplicon sequencing (TAS): a scalable next-gen approach to multi-locus, multi-taxa phylogenetics
Recombination estimation under complex evolutionary models with the coalescent composite-likelihood method
Recombination favors the evolution of drug resistance in HIV-1 during antiretroviral therapy
Disease progression and evolution of the HIV-1 env gene in 24 infected infants
Genetic characterization of new West African simian immunodeficiency virus SIVsm: geographic clustering of household-derived SIV strains with human immunodeficiency virus type 2 subtypes and genetically diverse viruses from a single feral sooty mangabey troop
Human immunodeficiency virus type 2 (HIV-2) seroprevalence and characterization of a distinct HIV-2 genetic subtype from the natural range of simian immunodeficiency virus-infected sooty mangabeys
Positive selection detection in 40,000 human immunodeficiency virus (HIV) type 1 sequences automatically identifies drug resistance and positive fitness mutations in HIV protease and reverse transcriptase
Origin of HIV Type 1 in Colonial French Equatorial Africa?
HIV reservoir size and persistence are driven by T cell survival and homeostatic proliferation
Quantification of latent tissue reservoirs and total body viral load in HIV-1 infection
HIV drug resistance. New Engl
Structure, replication, and recombination of retrovirus genomes: some unifying hypotheses
HIV population dynamics in vivo: implications for genetic variation, pathogenesis, and therapy
Molecular biology of HIV
Campylobacter jejuni colonization and transmission in broiler chickens: a modelling perspective
Intraspecific phylogenetics - support for dental transmission of human-immunodeficiency-virus
Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction
Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection
HIV reservoirs, latency, and reactivation: prospects for eradication
Persistence of distinct HIV-1 populations in blood monocytes and naive and memory CD4 T cells during prolonged suppressive HAART
Novel humanized murine models for HIV research
Dual human immunodeficiency virus type 1 infection and recombination in a dually exposed transfusion recipient. The Transfusion Safety Study Group
Rates of spontaneous mutation
BEAST: Bayesian evolutionary analysis by sampling trees
Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data
Measurably evolving populations
Bayesian coalescent inference of past population dynamics from molecular sequences
Relaxed phylogenetics and dating with confidence
Population genetic estimation of the loss of genetic diversity during horizontal transmission of HIV-1
Phylogenetic analysis consistent with a clinical history of sexual transmission of HIV-1 from a single donor reveals transmission of highly distinct variants
HIV-1 molecular epidemiology in Guinea-Bissau, West Africa: origin, demography and migrations
Simian immunodeficiency virus in people
Identification of a reservoir for HIV-1 in patients on highly active antiretroviral therapy
Transmission of single HIV-1 genomes and dynamics of early immune escape revealed by ultra-deep sequencing
Evolutionary Ecology: Concepts and Case Studies
Increased detection of HIV-specific T cell responses by combination of central sequences with comparable immunogenicity
Isolation of human T-cell leukemia virus in acquired immune deficiency syndrome (AIDS)
Human infection by genetically diverse SIVSM-related HIV-2 in West Africa
Genetic diversity of human immunodeficiency virus type 2: evidence for distinct sequence subtypes with differences in virus biology
Origin of HIV-1 in the chimpanzee Pan troglodytes troglodytes
The emergence of HIV/AIDS in the Americas and beyond
Coexpression of exogenous and endogenous mouse mammary tumor virus RNA in vivo results in viral recombination and broadens the virus host range
Retroviral recombination during reverse transcription
Spatial phylodynamics of HIV-1 epidemic emergence in east Africa
Testing spatiotemporal hypothesis of bacterial evolution using methicillin-resistant Staphylococcus aureus ST239 genome-wide data within a Bayesian framework
Unifying the epidemiological and evolutionary dynamics of pathogens
AIDS as a zoonosis: scientific and public health implications
HIV-1 drug resistance profiles in children and adults with viral load of <50 copies/mL receiving combination therapy
Support for dental HIV transmission
Crossreactivity to human T-lymphotropic virus type III/lymphadenopathy-associated virus and molecular cloning of simian T-cell lymphotropic virus type III from African green monkeys
The phylogeography of human viruses
The Evolution and Emergence of RNA Viruses
The evolutionary genetics of viral emergence
Discovering the phylodynamics of RNA viruses
Molecular investigation of human-immunodeficiency-virus (HIV) infection in a patient of an HIV-infected surgeon
The molecular epidemiology of human immunodeficiency virus type 1 in Edinburgh
The River: a Journey Back to the Source of HIV and AIDS
Properties of a neutral allele model with intragenic recombination. Theor
Investigation of HIV-1 transmission events by phylogenetic methods: requirement for scientific rigour
Genetic analysis reveals the complex structure of HIV-1 transmission within defined risk groups
Bayesian inference of phylogeny and its impact on evolutionary biology
Genetic organization of a chimpanzee lentivirus related to HIV-1
Molecular phylodynamics of the heterosexual HIV epidemic in the United Kingdom
Human immunodeficiency virus type 1 quasi species that rebound after discontinuation of highly active antiretroviral therapy are similar to the viral quasi species present before initiation of therapy
High rate of recombination throughout the human immunodeficiency virus Type 1 genome
Longitudinal population analysis of dual infection with recombination in two strains of HIV type 1 subtype B in an individual from a phase 3 HIV vaccine efficacy trial
Antibodies to simian T-lymphotropic retrovirus type III in African green monkeys and recognition of STLV-III viral proteins by AIDS and related sera
Serologic identification and characterization of a macaque T-lymphotropic retrovirus closely related to HTLV-III
The Origins of HIV-1 and HTLV-4/HIV-2
Chimpanzee reservoirs of pandemic and nonpandemic HIV-1
Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection
Characterization of the follicular dendritic cell reservoir of human immunodeficiency virus type 1
Increased mortality and AIDS-like immunopathology in wild chimpanzees infected with SIVcpz
Origin and phylodynamics of rabbit hemorrhagic disease virus
Genotypic analysis of HIV-1 drug resistance at the limit of detection: virus production without evolution in treated adults with undetectable HIV loads
On the genealogy of large populations
Origins of the coalescent: 1974-1982
Timing the ancestor of the HIV-1 pandemic strains
An evolutionary model-based algorithm for accurate phylogenetic breakpoint mapping and subtype prediction in HIV-1
LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters
Coalescent genealogy samplers: windows into population history
HIV Sequence Compendium
Multiple mutations in HIV-1 reverse transcriptase confer high-level resistance to zidovudine (AZT)
Prevalence and clinical significance of HIV drug resistance mutations by ultradeep sequencing in antiretroviral-naïve subjects in the CASTLE study
United States global health policy: HIV/AIDS, maternal and child health, and The President's Emergency Plan for AIDS Relief (PEPFAR)
Accurate reconstruction of a known HIV-1 transmission history by phylogenetic tree analysis
Tracing the origin and history of the HIV-2 epidemic
Molecular footprint of drug-selective pressure in a human immunodeficiency virus transmission chain
Molecular testing of multiple HIV-1 transmissions in a criminal case
Synonymous substitution rates predict HIV disease progression as a result of underlying replication dynamics
Bayesian phylogeography finds its roots
Phylogeography takes a relaxed random walk in continuous space and time
Episodic sexual transmission of HIV revealed by molecular phylodynamics
Evolutionary analyses of genetic recombination
Phylodynamic analysis of the dissemination of HIV-1 CRF01_AE in Vietnam
Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination
Emergence of primary NNRTI resistance mutations without antiretroviral selective pressure in a HAART-treated child
Lower in-vivo mutation-rate of human-immunodeficiency-virus type-1 than that predicted from the fidelity of purified reverse-transcriptase
Eradication of HIV: current challenges and new directions
RDP2: recombination detection and analysis from sequence alignments
Viral sequence diversity: challenges for AIDS vaccine designs
Adaptive protein evolution at the Adh locus in Drosophila
Investigating signs of recent evolution in the pool of proviral HIV type 1 DNA during years of successful HAART
Molecular evidence of HIV-1 transmission in a criminal case
3'-Azido-3'-deoxythymidine (BW A509U): an antiviral agent that inhibits the infectivity and cytopathic effect of human T-lymphotropic virus type III/lymphadenopathy-associated virus in vitro
Spatio-temporal tracking and phylodynamics of a DENV-3 outbreak in a city from Brazil
The population genetics and evolutionary epidemiology of RNA viruses
Tetherin inhibits retrovirus release and is antagonized by HIV-1 Vpu
Spatial dynamics of human-origin H1 influenza A virus in North American swine
Extensive purifying selection acting on synonymous sites in HIV-1 Group M sequences
Contribution of recombination to the evolution of human immunodeficiency viruses expressing resistance to antiretroviral treatment
Coalescent Theory. Handbook of Statistical Genetics
HIV-1 can persist in aged memory CD4(+) T lymphocytes with minimal signs of evolution after 8.3 years of effective highly active antiretroviral therapy
The remarkable frequency of human immunodeficiency virus type 1 genetic recombination. Microbiol
Molecular epidemiology of HIV
Lack of evidence for protease evolution in HIV-1-infected patients after 2 years of successful highly active antiretroviral therapy
Genetic diversity and phylogeographic distribution of SIV: how to understand the origin of HIV
Decay characteristics of HIV-1-infected compartments during combination therapy
Population genetics of Neisseria gonorrhoeae in a high-prevalence community using a hypervariable outer membrane porB and 13 slowly evolving housekeeping genes
Distinguishing importation from diversification of quinolone-resistant Neisseria gonorrhoeae by molecular evolutionary analysis
Temporal trends in gonococcal population genetics in a high prevalence urban community
New methods for inferring population dynamics from microbial sequences
Phylodynamics of HIV-1 from a phase-III AIDS vaccine trial in North America
Phylodynamics of HIV-1 from a phase III AIDS vaccine trial in
Slow human immunodeficiency virus type 1 evolution in viral reservoirs in infants treated with effective antiretroviral therapy
HIV phylogenetics
A new human immunodeficiency virus derived from gorillas
Untruths and consequences: the false hypothesis linking CHAT type 1 polio vaccination to the origin of human immunodeficiency virus
Site-to-site variation of synonymous substitution rates
Adaptation to different human populations by HIV-1 revealed by codon-based analyses
Estimating selection pressures on HIV-1 using phylogenetic likelihood models
Adaptation to human populations is revealed by within-host polymorphisms in HIV-1 and hepatitis C virus
Phylogenetic analysis of population-based and deep sequencing data to identify coevolving sites in the nef gene of HIV-1
Detection, isolation, and continuous production of cytopathic retroviruses (HTLV-III) from patients with AIDS and pre-AIDS
Unveiling the molecular clock in the presence of recombination
Evaluation of methods for detecting recombination from DNA sequences: computer simulations
The effect of recombination on the accuracy of phylogeny estimation
Recombination in evolutionary genomics
Transmitted HIV Type 1 drug resistance among individuals with recent HIV infection in east and Southern Africa
Endemic dengue associated with the co-circulation of multiple viral lineages and localized density-dependent transmission
The causes and consequences of HIV evolution
The genomic and epidemiological dynamics of human influenza A virus
Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand
Human immunodeficiency virus reverse transcriptase and protease sequence database
US human immunodeficiency virus type 1 epidemic: date of origin, population history, and characterization of early strains
Recombination in AIDS viruses
Recombination in HIV-1
Coalescent estimates of HIV-1 generation time in vivo
Rhesus monkey TRIM5[alpha] restricts HIV-1 production through rapid degradation of viral Gag polyproteins
Deciphering human immunodeficiency virus type 1 transmission and early envelope diversification by single-genome amplification and sequencing
Dating the common ancestor of SIVcpz and HIV-1 group M and the origin of HIV-1 subtypes using a new method to uncover clocklike molecular evolution
Phylodynamic analysis of human immunodeficiency virus type 1 in distinct brain compartments provides a model for the neuropathogenesis of AIDS
Phylodynamics of HIV-1 in lymphoid and non-lymphoid tissues reveals a central role for the thymus in emergence of CXCR4-using quasispecies
High-resolution molecular epidemiology and evolutionary history of HIV-1 subtypes in Albania
Distinct patterns of HIV-1 evolution within metastatic tissues in patients with non-Hodgkins lymphoma
Amplification of a complete simian immunodeficiency virus genome from fecal RNA of a wild chimpanzee
Simian immunodeficiency virus infection in free-ranging sooty mangabeys (Cercocebus atys atys) from the Tai Forest
Antibodies reactive with human T-lymphotropic retroviruses (HTLV-III) in the serum of patients with AIDS
Accurately measuring recombination between closely related HIV-1 genomes
Consistent viral evolutionary changes associated with the progression of human immunodeficiency virus type 1 infection
A Bayesian phylogenetic method to estimate unknown sequence ages
In search of molecular darwinism
Origins and evolution of AIDS viruses: estimating the time-scale
Evolution and recombination of genes encoding HIV-1 drug resistance and tropism during antiretroviral therapy
Pervasive genomic recombination of HIV-1 in vivo
Phylodynamic reconstruction reveals norovirus GII.4 epidemic expansions and their molecular determinants
Long-term follow-up studies confirm the stability of the latent reservoir for HIV-1 in resting CD4(+) T cells
Dynamics of two separate but linked HIV-1 CRF01_AE outbreaks among injection drug users in
Adaptive protein evolution in Drosophila
Persistence of infectious HIV on follicular dendritic cells
Protocols for sampling viral sequences to study epidemic dynamics
Sampling-through-time in birth-death trees
Estimating the basic reproductive number from viral sequence data
HIV-1 Vif blocks the antiviral activity of APOBEC3G by impairing both its translation and intracellular stability
Model selection in phylogenetics
On inconsistency of the neighbor-joining, least squares, and minimum evolution estimation when substitution processes are incorrectly modeled
Evolutionary relationship of DNA sequences in finite populations
Origin and biology of simian immunodeficiency virus in wild-living western gorillas
Line-of-descent and genealogical processes, and their applications in population genetics models. Theor
Population dynamics of Neisseria gonorrhoeae in Shanghai, China: a comparative study
HIV-1 infected monozygotic twins: a tale of two outcomes
Temporal and spatial dynamics of human immunodeficiency virus type 1 circulating recombinant forms 08_BC and 07_BC in Asia
Sex and recombination in retroviruses
Selection in context: patterns of natural selection in the glycoprotein 120 region of human immunodeficiency virus 1 within infected individuals
Bridging the gap between molecular epidemiologists and evolutionists
Molecular cloning of a new interferon-induced factor that represses human immunodeficiency virus type 1 long terminal repeat expression
Evidence that low-level viremias during effective highly active antiretroviral therapy result from two processes: expression of archival virus and replication of virus
Loss of antigenic epitopes as the result of Env gene recombination in retrovirus-induced leukemia in immunocompetent mice
Global Report: UNAIDS Report on the Global AIDS Epidemic 2010. Joint United Nations Programme on HIV/AIDS (UNAIDS)
The origins of HIV and implications for the global epidemic
Compartmentalization of the gut viral reservoir in HIV-1 infected patients
Down or out in blood and lymph?
Recent trends in population genetics: more data! More math! Simple models?
Coalescent Theory: An Introduction
Identification of shared populations of human immunodeficiency virus type 1 infecting microglia and tissue macrophages outside the central nervous system
Antibody neutralization and escape by HIV-1
Integrating genealogy and epidemiology: the ancestral infection and selection graph as a model for reconstructing host virus histories. Theor
A challenge to the ancient origin of SIVagm based on African green monkey mitochondrial genomes
In vivo compartmentalization of human immunodeficiency virus: evidence from the examination of pol sequences from autopsy tissues
Origin of AIDS: contaminated polio vaccine theory refuted
Direct evidence of extensive diversity of HIV-1 in Kinshasa by 1960
Island biogeography reveals the deep history of SIV
Dual infection with HIV-1 Thai subtype-B and subtype-E
Rapid molecular evolution of human bocavirus revealed by Bayesian coalescent inference
Population dynamics of HIV-1 subtype B in a cohort of men-having-sex-with
Evidence for coinfection by multiple strains of human immunodeficiency virus type 1 subtype B in an acute seroconvertor