key: cord-0971724-vqhyyp88
authors: Zhang, ChiYu; Ding, Na; Chen, KePing; Yang, RongGe
title: Complex positive selection pressures drive the evolution of HIV-1 with different co-receptor tropisms
date: 2010-10-17
journal: Sci China Life Sci
DOI: 10.1007/s11427-010-4066-5
sha: 24a98f5c08b96b0a50ca97edc2626155041bf15b
doc_id: 971724
cord_uid: vqhyyp88

HIV-1 co-receptor tropism is central for understanding the transmission and pathogenesis of HIV-1 infection. We performed a genome-wide comparison between the adaptive evolution of R5 and X4 variants from HIV-1 subtypes B and C. The results showed that R5 and X4 variants experienced different evolutionary patterns and that different HIV-1 genes encountered different positive selection pressures, suggesting that complex selection pressures are driving HIV-1 evolution. Compared with other hypervariable regions of Gp120, significantly more positively selected sites were detected in the V3 region of subtype B X4 variants, the V2 region of subtype B R5 variants, and the V1 and V4 regions of subtype C X4 variants, indicating an association of positive selection with co-receptor recognition/binding. Intriguingly, a significantly higher proportion (33.3%-55.6%, P<0.05) of positively selected sites was identified in the C3 region than in other conserved regions of Gp120 in all the analyzed HIV-1 variants, indicating that the C3 region might be more important to HIV-1 adaptation than previously thought. Approximately half of the positively selected sites identified in the env gene were identical between R5 and X4 variants. Three common positively selected sites (96, 113 and 281) were identified in Gp41 of all X4 and R5 variants from subtypes B and C. These sites might not only suggest a functional importance in viral survival and adaptation, but also imply a potential cross-immunogenicity between HIV-1 R5 and X4 variants, which has important implications for AIDS vaccine development.
The usage of different kinds of chemokine receptors (referred to as co-receptors) in the cellular entry of HIV-1 strains confers their co-receptor tropisms [1] , which can further influence viral phenotypes. HIV-1 co-receptor tropism is central for understanding the transmission and pathogenesis of HIV-1 infection [2, 3] . The two major HIV-1 co-receptor tropisms are R5 and X4, using CCR5 and CXCR4 as co-receptors, respectively [4] . R5 and X4 have different viral characteristics, with R5 variants dominating the viral quasispecies early in and even throughout infection [1] , whereas X4 variants classically emerge late in infection. X4 exhibits rapid replication and higher virulence in peripheral blood mononuclear cells (PBMCs) than R5 [5] . The emergence of X4 strains is usually accompanied by an accelerated decrease in CD4 + T cell counts, which leads to an accelerated disease progression [6] . HIV-1 co-receptor switch from R5 to X4 occurs in approximately 50% of HIV-1 subtype B infected individuals during progression to AIDS. This co-receptor switch occurs to a lower extent in HIV-1 subtype C infected individuals [2] . Therefore, an evolutionary comparison between subtypes B and C will help to understand the mechanism of HIV-1 co-receptor switch. HIV-1 has high mutation rates and is often subject to strong selective pressures from human immune responses [7] . Positive (diversifying) selection has been widely detected in whole genomes, but especially in the env gene of HIV-1 group M, HIV-1 group O, HIV-2 and SIV [8] [9] [10] [11] [12] [13] [14] . Furthermore, an association between positive selection and AIDS disease progression was observed in pediatric and adult HIV-1 infections [15, 16] . Although only the C2V3C3 region of the env gene of R5 variants was used in these studies, increasing evidence showed that positive selection was more prevalent in individuals with slow HIV-1 disease progression than those with rapid disease progression [7, 17] . 
This was likely due, in part, to a stronger immune response in slow progressors and a destroyed immune system in rapid progressors. However, the reverse was observed in one study based on HIV-1 subtype B V3 sequences, in which syncytium-inducing (SI) variants appeared to evolve faster than non-syncytium-inducing (NSI) variants [18], implying that SI variants were subject to stronger positive selection than NSI variants. The disagreement was likely due to the use of only part of the env gene in these studies. To reveal the intricate nature of the selection pressures driving the evolution of HIV-1 R5 and X4 variants, adaptive evolutionary analyses based on whole env genes, or even whole genomic sequences, are required. Therefore, a genome-wide comparison between the adaptive evolution of R5 and X4 variants from HIV-1 subtypes B and C was performed in the present study. We found that R5 and X4 variants underwent clearly different evolutionary patterns and that different HIV-1 genes were subject to different selection pressures. We also found that a significantly higher proportion of positively selected sites was identified in the C3 region than in other conserved regions of Gp120 in all analyzed HIV-1 variants, indicating that the C3 region might be more important to HIV-1 adaptation than previously believed. In addition, approximately half of the positively selected sites identified in env genes were identical between R5 and X4 variants. These common positively selected sites might not only imply functional importance in viral survival and adaptation, but also have important implications for AIDS vaccine development.

All of the sequences used in this study were collected from the Los Alamos National Laboratory (LANL) HIV Sequence Database (http://www.hiv.lanl.gov/content/sequence/HIV/mainpage.html).
Because full-length HIV-1 sequences with known co-receptor tropism are very limited, we first downloaded all 624 subtype B and 466 subtype C full-length sequences from this database. To determine the co-receptor tropism of each full-length sequence, three robust online prediction tools, WebPSSM (http://indra.mullins.microbiol.washington.edu/webpssm) [19], Geno2pheno (http://coreceptor.bioinf.mpi-inf.mpg.de/index.php) [20], and HIV-1PhenoPred (http://yjxy.ujs.edu.cn/R5-X4%20pred.rar) [21], were applied to hypervariable region 3 (V3) of Gp120. Only sequences for which the three tools gave consistent predictions were selected and divided into CCR5 and CXCR4 datasets. Because very short evolutionary distances reduce the discriminatory power of selection analyses [22], closely related sequences were removed. As a consequence, 37 R5 and 33 X4 sequences from subtype B, and 28 R5 and 13 X4 sequences from subtype C were kept. These sequences were divided into four HIV-1 sub-populations for adaptive evolutionary analyses, and their GenBank accession numbers are provided in Appendix Table 1. To compare HIV-1 genes that serve different functions in the HIV-1 life cycle, each complete genome was divided into five genes: gag, reverse transcriptase (RT), integrase (IN), gp120 and gp41. For X4 variants of subtype C, only the 13 env (gp120 and gp41) sequences were kept for adaptive evolutionary analyses; the other gene subsets were excluded because too few sequences were available. The sequences of each data subset were aligned using Clustal W implemented in MEGA4 [23] and manually adjusted. The phylogenetic tree of each data subset was obtained using the maximum likelihood (ML) method (PHYML v2.4.4) [24]. Positive selection is measured by comparing the rate of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) with that of synonymous substitutions per synonymous site (dS).
The dN/dS ratio (ω) is traditionally used as an index to assess positive selection. An ω greater than 1 is taken as evidence of positive selection, ω equal to 1 indicates neutral evolution, and ω less than 1 reflects negative (purifying) selection. The analyses of adaptive molecular evolution on all datasets were performed using six codon substitution models: M0 (one-ratio), M1a (nearly neutral), M2a (positive selection), M3 (discrete), M7 (beta), and M8 (beta and ω), as implemented in PAML 4.0 [25]. The details of these models were described in a previous study [26]. The likelihood ratio test (LRT) comparing three pairs of nested codon evolutionary models (M0 vs. M3, M1a vs. M2a, and M7 vs. M8) was used to test the null hypothesis of no positive selection [27]. For each pair of nested models, the null hypothesis was rejected and positive selection was inferred when the LRT statistic was significant under a χ² distribution with degrees of freedom equal to the difference in the number of parameters between the two models. The datasets were then screened for positively selected sites using the three selection models (M2a, M3 and M8): sites assigned with a high posterior probability to the class with ω greater than 1 were accepted as positively selected [26]. Because model M3 tends to overestimate the number of positively selected sites [27, 28], its results were not used to identify them. To reduce or avoid possible false-positive results, only sites identified by both M2a and M8 in the Codeml program were defined as positively selected [28]. To confirm the results obtained using the codon substitution models in PAML, similar analyses were also performed using the online DataMonkey package [29].

HIV-1 R5 and X4 variants exhibit different phenotypic characteristics in cellular tropism and replication ability [5].
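The likelihood ratio test described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the log-likelihood values are hypothetical placeholders standing in for Codeml output, and the function names are illustrative. The degrees of freedom assumed here are the standard ones for these comparisons (2 for M1a vs. M2a and for M7 vs. M8, and 4 for M0 vs. M3 with three site classes); because these are all even, the χ² upper-tail probability has a closed form and no external library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2k the survival function is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its chi-square p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(max(stat, 0.0), df)


# Hypothetical log-likelihoods for a null model (e.g. M7) and its
# alternative (e.g. M8); these are not values from the study.
stat, p_value = likelihood_ratio_test(lnl_null=-10250.0, lnl_alt=-10230.0, df=2)
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.3g}")
```

A small p-value from such a comparison supports the model that allows a class of sites with ω greater than 1; it does not by itself say which sites are under selection, which is why the posterior-probability step follows.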
Two viral proteins, Gp120 and RT, are closely associated with HIV-1 phenotypes: the former determines cellular tropism by specifically recognizing co-receptors, and the latter determines viral replication ability. Here, gp120, gp41, RT, IN and gag of R5 and X4 variants from subtypes B and C were subjected to adaptive evolutionary analyses. The results of the codon-based maximum likelihood analyses are shown in Tables 1 and 2 for HIV-1 subtype B gp120 and gp41 genes, and in Tables 3 and 4 for subtype C gp120 and gp41 genes, respectively. The results for the other genes are shown in Appendix Tables 2-7. Positive selection was detected in all analyzed genes by the three positive selection models (M2a, M3 and M8), except for the IN gene of R5 variants of subtype B. Comparison between genes showed that stronger positive selection acted on env and gag genes than on RT and IN genes. The env genes appeared to be under the strongest positive selection pressure [8] [9] [10] [11]. The numbers of positively selected sites identified in the different genes further supported stronger positive selection acting on env and gag genes. These results suggested that different HIV-1 genes experienced different patterns of positive selection. In the HIV-1 subtype B datasets, the numbers of positively selected sites in gp120 were 29 and 31 for the R5 and X4 sub-populations, respectively. In gp120 of the HIV-1 subtype C datasets, the numbers of sites were 16 and 24 for the R5 and X4 sub-populations, respectively (Table 5). The number of positively selected sites identified in the X4 sub-populations was greater than in the R5 sub-populations, suggesting that the gp120 gene of X4 variants was subject to stronger positive selection pressure.
An opposite pattern was observed in the gp41 genes, where 27 and 17 positively selected sites were identified in the R5 sub-populations of subtypes B and C, respectively, more than the 24 and 11 sites in the corresponding X4 sub-populations (Table 5). This implied that the gp41 gene of R5 variants underwent stronger positive selection than that of X4 variants. Therefore, although as a whole there was no obvious difference in the number of positively selected sites in env genes between R5 and X4 sub-populations, the gp120 and gp41 genes underwent different evolutionary pathways in R5 and X4 variants.

Table notes: a) The numbers in boldface represent the statistically significant differences (P<0.05) between one variable region and all other variable regions or between variable (V1-V5) and conserved regions (C1-C5) using Fisher's exact test. b) The numbers in boldface represent the statistically significant differences (P<0.05) between C3 and all other conserved regions (C1, C2, C4, and C5) using Fisher's exact test.

When taking subtypes into account, the number of positively selected sites in the env genes of subtype B was greater than in subtype C. For the gp120 genes, 29 and 31 sites were identified in the R5 and X4 sub-populations of subtype B, respectively, compared with 16 and 24 sites in the R5 and X4 sub-populations of subtype C. For the gp41 genes, 27 and 24 sites were identified in the R5 and X4 sub-populations of subtype B, respectively, more than the 17 and 11 sites in the R5 and X4 sub-populations of subtype C. These results suggest that the env genes of subtype B underwent stronger positive selection than those of subtype C. The HIV-1 gag gene encodes four structural proteins. The gag products are crucial targets recognized by the human immune system.
In gag genes, 9 and 10 positively selected sites were detected in the R5 sub-populations of subtypes B and C, respectively, clearly more than the six sites in the X4 sub-population of subtype B (Appendix Tables 2 and 3). However, four sites (91, 138, 280 and 374) were detected in both the R5 and X4 sub-populations of subtype B. Two of these four common sites (91 and 138) were also identified in the R5 sub-population of subtype C, possibly implying importance for HIV-1 adaptation. HIV-1 RT and IN are key enzymes in the HIV life cycle. A comparison of the selection pressures on these two genes in the R5 and X4 sub-populations could help to explain the difference in replication rates between R5 and X4 variants. In the HIV-1 subtype B datasets (Appendix Table 4), two sites (162 and 376) in the R5 sub-population and one site (211) in the X4 sub-population were detected under positive selection in RT genes. Two sites (118 and 123) were detected under positive selection in the IN gene of the X4 sub-population, whereas no positive selection was detected in the R5 sub-population (Appendix Table 6). In the R5 sub-population of the subtype C dataset, three sites (123, 344, and 377) in RT and three sites (50, 72, and 125) in IN were identified under positive selection (Appendix Tables 5 and 7). Gp120 is the HIV-1 surface glycoprotein that not only determines viral tropism, but is also the most important target for the host immune response. Gp120 contains five conserved (C1-C5) and five hypervariable regions (V1-V5). Comparing the location of positively selected sites in conserved and hypervariable regions showed that, in the X4 sub-populations of subtypes B (41.9%, P=0.041) and C (62.5%, P=0.0002), significantly more positively selected sites occurred in hypervariable regions than expected from the proportion (28.8%) of whole Gp120 that these regions occupy (Table 5).
This result implied that Gp120 hypervariable regions of X4 variants were subject to stronger positive selection pressures. We further analyzed the distribution of positively selected sites in five hypervariable regions. In X4 sub-population of subtype B, significantly more sites (61.5%, P=0.0011) appeared in the V3 region that was a critical determinant of co-receptor tropism and the main epitope for eliciting neutralizing antibody [30, 31] compared with other hypervariable regions relative to the proportion (24.5%) of V3 in whole hypervariable regions. This implied a stronger positive selection pressure on V3, consistent with previous observations [7, 18] . In X4 sub-population of subtype C, however, higher proportions (66.7%, P=0.0316) of positively selected sites were located in the V1 and V4 regions, both of which account for 34% of hypervariable regions (Table 5) . Additionally, in R5 sub-population of subtype B, significantly higher proportions (83.3%, P=0.0013) of positively selected sites were located in the V2 region that accounts for only 26.5% of the hypervariable region (Table 5) . When taking conserved regions into account, all four sub-populations exhibited significantly higher proportions of positively selected sites in the C3 region than in other conserved regions (R5 sub-population of subtype B: 43.5%, P<0.0001; X4 sub-population of subtype B: 33.3%, P= 0.0206; R5 sub-population of subtype C: 45.5%, P=0.0032; X4 sub-population of subtype C: 55.6%, P=0.0004; Table 5 ). These results suggested that C3 might be more important in Gp120 evolution than previously thought. With regard to Gp41, we found that relatively few positively selected sites (9.1%-22.2%) occurred in the two heptad repeat (HR) regions when compared with the proportion (24.3%) of HR1 and HR2 in whole Gp41 (Table 5 and Figure 1 ). 
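The region-level comparisons above rest on Fisher's exact test applied to 2x2 tables of counts. A minimal Python sketch of such a test is given here, assuming a table whose rows are two groups of residues (for example V3 versus the other hypervariable regions) and whose columns are positively selected versus not selected. The exact table layout used in the study is not stated, so this layout is an assumption. The counts of 8 and 5 selected sites follow from the reported 61.5% of the 13 hypervariable-region sites in the subtype B X4 sub-population, whereas the region lengths (35 and 108 residues) are hypothetical placeholders, so the resulting p-value is illustrative only. The function name is also illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: V3 vs. other hypervariable regions; columns: selected vs. not selected.
# Selected-site counts (8, 5) are derived from the reported percentages;
# the non-selected counts (27, 103) assume hypothetical region lengths.
p_v3 = fisher_exact_two_sided(8, 27, 5, 103)
print(f"two-sided p = {p_v3:.4f}")
```

The same function could be reused for the C3 versus other conserved regions comparison by substituting the corresponding counts.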
We observed that approximately half of the positively selected sites identified in the env genes were the same between R5 and X4 sub-populations regardless of whether they were subtypes B or C. As an example, in subtype B, 30 of 56 positively selected sites identified in the R5 subpopulation were also detected in X4. In subtype C, 16 positively selected sites were common between the R5 and X4 sub-populations (Table 5 and Figure 1 ). Of additional note were three positively selected sites (96, 113 and 281) in gp41 genes that were common among all four sub-populations ( Figure 1 ). By comparing selection pressures acting on several key genes of HIV-1 from subtypes B and C and from R5 and X4 sub-populations, we found that env (gp120 and gp41) and gag genes underwent higher selection pressures than other genes. These results suggested that certain HIV-1 genes were subject to different selection pressures [8] . Comparison of positively selected sites identified in env of R5 and X4 sub-populations showed that both variants experienced obviously different evolutionary patterns. For the gp120 genes, more positively selected sites were identified in the X4 than in the R5 sub-population (Table 5) , similar to previous observations of highly positive selection in the V3 region of SI compared with NSI variants [18] . However, this pattern was reversed for Gp41, which underwent stronger positive selection in R5 compared with X4 variants (Table 5 ). These results suggested R5 and X4 had different evolutionary patterns. At least two kinds of potentially positive selection pressures, the host immune system and the target cell range, can drive HIV-1 evolution in treatment-naïve HIV-1-infected individuals [2] . The host immune responses including humoral and cellular immune responses were generally assumed to be the most important evolutionary pressures for adaptive evolution observed in the HIV-1 genome [7] . 
The gag gene encodes viral structural proteins that are less involved in viral replication and co-receptor recognition/binding. A total of 19 positively selected sites were identified in the gag genes of the three analyzed HIV-1 sub-populations (R5 and X4 sub-populations from subtype B and the R5 sub-population from subtype C) (Appendix Tables 2 and 3). All these sites were found to be associated with at least one of three kinds of epitopes (Ab, CTL and T-helper). In particular, 83.3% and 66.7% of these sites were associated with CTL and T-helper epitopes, respectively. These results indicated that the positive selection pressures on gag genes were primarily imposed by the host immune response. Among these positively selected sites, sites 91 and 138 were detected in all three sub-populations, possibly implying an additional importance for HIV-1 adaptation. A previous study demonstrated that a residue change at site 30 of Gag was able to confer a species-specific replication advantage that allows HIV or SIV to adapt to its host [32]. The potential roles of sites 91 and 138 in HIV-1 adaptation need to be assessed by site-directed mutagenesis analyses. RT is a key enzyme responsible for HIV-1 replication. HIV-1 X4 viruses usually exhibit higher replication rates than R5 viruses [2]. A total of six positively selected sites were identified in RT of the three sub-populations. All these sites were located in the DNA polymerase domain of RT [33], and the sites identified in R5 and X4 variants were different, implying that these positively selected sites might be associated with specific replication characteristics of R5 or X4 variants. All these sites were associated with CTL-specific epitopes. Therefore, cellular immune responses were also likely to drive the evolution of RT. The envelope (Env) glycoprotein of HIV-1 is exposed on the surface of the virus particle and of HIV-1-infected cells, playing an important role in viral survival.
It not only determines the co-receptor tropism of HIV-1, but is also the major determinant of immunogenicity for humoral and cellular immune responses. Moreover, the highest mutation rate of the env gene in HIV-1 genome confers a potential ability to escape host immune responses [34, 35] . Therefore, the adaptive evolution of env genes was thought to be complex and might involve multiple selection factors such as cell source, host immune responses and the virus itself [2, 8] . The hypervariable rather than conserved regions were demonstrated to determine HIV-1 co-receptor tropism. The V3 region plays the most important role in the determination of HIV-1 co-receptor tropism [30, 36, 37] and other hypervariable regions, such as V1V2 and V4 regions, affect the co-receptor usage [38] [39] [40] [41] [42] [43] [44] . Comparing the location of positively selected sites in Gp120 showed that in X4 sub-populations significantly more sites were located in the hypervariable regions (P<0.05). However, a similar pattern was not observed in R5 sub-populations. Further analyses showed that significantly more positively selected sites were in the V3 region (61.5%, P=0.0011) of subtype B X4 variants, the V2 region (83.3%, P=0.0013) of subtype B R5 variants, and V1 and V4 regions (66.7%, P=0.0316) of subtype C X4 variants compared with other hypervariable regions of Gp120 (Table 5 ). These results distinctly indicated that positive selection acting on the Gp120 hypervariable regions was closely associated with the function of co-receptor recognition/binding. The V1V2, V4 and V5 regions have been demonstrated to contribute to autologous neutralization [45] . This implied that humoral immune response-imposed positive selection also contributed to the evolution of Gp120. A higher proportion (33.3%-55.6%, P<0.05) of positively selected sites were located in the C3 region than in other conserved regions of Gp120 in all four HIV-1 sub-populations (Table 5 ). 
This result indicated that the C3 region might be more important to the function of HIV-1 Gp120 than previously believed. Moreover, the C3 region of the subtype C virus was able to elicit early autologous neutralizing response to HIV-1 infection by forming an important structural motif together with the V4 region [45] . The results observed in the C3 region also supported that humoral immune response-imposed positive selection might play a role in the evolution of Gp120. Like the S2 domain of SARS-CoV spike (S) protein, HIV-1 Gp41 contains two HR regions, which have been shown to be important in virus membrane fusion [46] . We found that low proportions (9.1%-22.2%) of positively selected sites occurred in two HR regions of Gp41 (Figure 1 ), possibly arguing against membrane fusion as a major selection factor for the evolution of HIV-1 Gp41 [47] . However, three sites (96, 113 and 281) were detected in Gp41 of all four HIV-1 sub-populations, and two of these sites (96 and 113) were located in the middle region between two HR regions. This finding suggested that the three mutual sites might play some role in the membrane fusion function of Gp41, supporting membrane fusion as a minor selection factor for the evolution of HIV-1 Gp41. Furthermore, we found that approximately half of the positively selected sites identified in env genes were identical between R5 and X4 variants (Figure 1 ). These common positively selected sites not only indicated that they were functionally important for the survival of both HIV-1 R5 and X4 variants, but also suggested that immune responses might be targeting the same viral region in both variants. These positively selected sites shared by all X4 and R5 variants might have important implications for AIDS vaccine development. This work was supported by the National Natural Science Foundation of China (Grant No. 
30600352), the "Top-notch Personnel

[1] Chemokine receptors as HIV-1 coreceptors: roles in viral entry, tropism, and disease
[2] The HIV coreceptor switch: a population dynamical perspective
[3] The CCR5 and CXCR4 coreceptors--central to understanding the transmission and pathogenesis of human immunodeficiency virus type 1 infection
[4] A new classification for HIV-1
[5] Coreceptor usage of primary human immunodeficiency virus type 1 isolates varies according to biological phenotype
[6] Change in coreceptor use correlates with disease progression in HIV-1-infected individuals
[7] Immune-mediated positive selection drives human immunodeficiency virus type 1 molecular variation and predicts disease duration
[8] Comparative study of adaptive molecular evolution in different human immunodeficiency virus groups and subtypes
[9] Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene
[10] Widespread adaptive evolution in the human immunodeficiency virus type 1 genome
[11] Evidence for heterogeneous selective pressures in the evolution of the env gene in different human immunodeficiency virus type 1 subtypes
[12] Genealogical evidence for positive selection in the nef gene of HIV-1
[13] Mapping sites of positive selection and amino acid diversification in the HIV genome: an alternative approach to vaccine design?
[14] Positive selection on HIV accessory proteins and the analysis of molecular adaptation after interspecies transmission
[15] Disease progression and evolution of the HIV-1 env gene in 24 infected infants
[16] Selective pressures of human immunodeficiency virus type 1 (HIV-1) during pediatric infection
[17] Adaptation in the env gene of HIV-1 and evolutionary theories of disease progression
[18] Evolutionary mechanisms and population dynamics of the third variable envelope region of HIV within single hosts
[19] Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of human immunodeficiency virus type 1 env V3 loop sequences
[20] Predicting HIV coreceptor usage on the basis of genetic and clinical covariates
[21] Improved prediction of coreceptor usage and phenotype of HIV-1 based on combined features of V3 loop sequence using random forest
[22] Reliabilities of identifying positive selection by the branch-site and the site-prediction methods
[23] MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0
[24] PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference
[25] PAML 4: phylogenetic analysis by maximum likelihood
[26] Codon-substitution models for heterogeneous selection pressure at amino acid sites
[27] Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution
[28] Accuracy and power of bayes prediction of amino acid sites under positive selection
[29] Datamonkey: rapid detection of selective pressure on individual sites of codon alignments
[30] Identification of the envelope V3 loop as the primary determinant of cell tropism in HIV-1
[31] Broadly neutralizing antibodies elicited by the hypervariable neutralizing determinant of HIV-1
[32] Adaptation of HIV-1 to its human host
[33] Crystal structure of HIV-1 reverse transcriptase in complex with a polypurine tract RNA:DNA
[34] The neutralizing antibody response to HIV-1: viral evasion and escape from humoral immunity
[35] HIV: current opinion in escapology
[36] Phenotype-associated sequence variation in the third variable domain of the human immunodeficiency virus type 1 gp120 molecule
[37] CCR5 coreceptor usage of non-syncytium-inducing primary HIV-1 is independent of phylogenetically distinct global HIV-1 isolates: delineation of consensus motif in the V3 domain that predicts CCR-5 usage
[38] A single amino acid substitution in the V1 loop of human immunodeficiency virus type 1 gp120 alters cellular tropism
[39] Determinants of entry cofactor utilization and tropism in a dualtropic human immunodeficiency virus type 1 primary isolate
[40] Relation of phenotype evolution of HIV-1 to envelope V2 configuration
[41] Human immunodeficiency virus type 1 coreceptor switching: V1/V2 gain-of-fitness mutations compensate for V3 loss-of-fitness mutations
[42] Effect of amino acid changes in the V1/V2 region of the human immunodeficiency virus type 1 gp120 glycoprotein on subunit association, syncytium formation, and recognition by a neutralizing antibody
[43] Complex determinants in human immunodeficiency virus type 1 envelope gp120 mediate CXCR4-dependent infection of macrophages
[44] Identification of determinants on a dualtropic human immunodeficiency virus type 1 envelope glycoprotein that confer usage of CXCR4
[45] Autologous neutralizing humoral immunity and evolution of the viral envelope in the course of subtype B human immunodeficiency virus type 1 infection
[46] Heptad repeat sequences are located adjacent to hydrophobic regions in several types of virus fusion glycoproteins
[47] Adaptive evolution of the spike gene of SARS coronavirus: changes in positively selected sites in different epidemic groups

Appendix Table 1 List of GenBank accession numbers for HIV-1 genomic sequences analyzed in the text

Subtype | Co-receptor tropism | Sequence number | GenBank accession numbers
HIV-1 B | R5 | 37 | AB286956, AB253432, AF003888, AF042101, AF224507, AY173952, AY037282, EU576191, AY586543, AY713412, AY835748, AY713411, EU574998, AY561236, AY970946, AY839827, AY857022, D10112, DQ854714, DQ837381, DQ886031, FJ469746, EF363124, EF637046, EF514699, EF637049, EU786675, FJ460501, FJ469770, FJ495937, FJ469703, FJ469731, M93258, U23487, U63632, FJ496085, FJ496150, K02007
HIV-1 B | X4 | 33 | EU281726, K02013, AB287363, AB287365, AF049494, AF086817, AF146728, AY037268, AY736821, AY173956, AY180905, AY560108, AY835767, AY835768, D86068, DQ127534, DQ396398, DQ823363, EF514712, FJ469686, FJ469692, FJ469736, FJ469737, FJ469739, FJ469748, FJ469753, FJ469759, L02317, L31963, M17449, M26727, FJ496166
HIV-1 C | R5 | 28 | AB254141, DQ369991, AY734550, DQ275642, EU786673, AY878054, AF286227, AY945738, FJ496185, U46016, AY713414, AF110978, AF110981, AF286224, AF286231, AY444800, AY463217, AY563170, AF286233, AF286234, AF290027, AY043176, AF391231, AY118165, AY253303, AY228556, AY228557, AY253321
HIV-1 C | X4 | 13 | FJ846637, FJ846642, AY878064, DQ093600, AY529666, FJ846647, AY529678, DQ382362, DQ382372, DQ382378, AY529677, AY529673, AF411966

Appendix a) Positively selected sites were identified with posterior probability P≥95%; in boldface, P≥99%.