key: cord-1003341-ne9d0siv authors: Banks, Jonathan C.; Whitfield, James B. title: Dissecting the ancient rapid radiation of microgastrine wasp genera using additional nuclear genes date: 2006-06-07 journal: Mol Phylogenet Evol DOI: 10.1016/j.ympev.2006.06.001 sha: 094439867383c6b40f9e81e1b67e6524da5dadab doc_id: 1003341 cord_uid: ne9d0siv
Previous estimates of a generic level phylogeny for the ubiquitous parasitoid wasp subfamily Microgastrinae (Hymenoptera) have been problematic due to short internal branches deep in the phylogeny. These short branches might be attributed to a rapid radiation among the taxa, the use of genes that are unsuitable for the levels of divergence being examined, or insufficient quantity of data. We added over 1200 nucleotides from four nuclear genes to a dataset derived from three genes to produce a dataset of over 3000 nucleotides per taxon. While the number of well-supported short branches in the phylogeny increased, we still did not obtain strong bootstrap support for every node. Parametric and nonparametric bootstrap simulations projected that an enormous, and likely unobtainable, amount of data would be required to get bootstrap support greater than 50% for every node. However, a marked increase in the number of well-supported nodes was seen when we conducted a Bayesian analysis of a combined dataset generated from morphological characters added to the seven gene dataset. Our results suggest that, in some cases, combining morphological and genetic characters may be the most practical way to increase support for short branches deep in a phylogeny.
Uncertainty in phylogenetic estimation at higher taxonomic levels is inevitable, due to the confounding effects of factors that may indicate alternative patterns. These factors include the convergence of morphological characters from similar ecological forces, and multiple substitutions in genetic data ("saturation").
Convergence and saturation often result in low bootstrap support values, poor Bremer decay indices or low Bayesian posterior probability values for some branches (Swofford et al., 1996). However, poor branch support can also be caused by failure to use a sufficient quantity of data (Fishbein et al., 2001), use of data that are inappropriate for the level of divergence that is being analysed (de Queiroz et al., 1995), or rapid evolutionary radiations among taxa (Fishbein et al., 2001). Often, it is difficult to know which factors are operating in any particular case. Although phylogenies without strong support for all branches are sometimes well accepted, there are situations, such as the study of cophylogenetic relationships between hosts and associates, in which well-supported phylogenies are important. For example, reconciliation analysis (Page, 1995), the method most commonly used to examine cophylogenetic relationships (Brooks and McLennan, 2003), infers cophylogenetic history from the topology of the host and associate phylogenies and thus requires robust phylogenies to reconstruct the evolutionary history of the relationship between hosts and associates. Other situations requiring robust phylogenies include the forensic use of phylogenies to identify the source of infections such as human immunodeficiency virus (Korber et al., 2000; Rambaut et al., 2001; Worobey et al., 2004) and severe acute respiratory syndrome (SARS) (Guan et al., 2003). One example of poor support possibly caused by several factors occurs in the phylogenies estimated for microgastrine wasps. Microgastrinae, a subfamily of Braconidae (Hymenoptera), is a speciose group with approximately 1400 described species in over 55 genera, and it has been estimated that there may actually be 5000 to 10,000 species worldwide (Whitfield, 1997b; Whitfield et al., 2002).
Microgastrine wasps lay their eggs on lepidopteran larvae, and the wasp larvae develop while consuming the tissues of the lepidopteran larvae (Whitfield, 1997b; Whitfield et al., 2002). Many microgastrine wasp species have been transferred around the world to aid in the control of crop pests (Whitfield, 1997b; Whitfield et al., 2002). All microgastrine wasps have inherited an association with polydnaviruses, which are incorporated into the wasp genomes and help the wasp larvae evade lepidopteran immune systems (Whitfield and Asgari, 2003). It has therefore been of considerable coevolutionary interest to compare the phylogenetic histories of the wasps and those of the viruses. A robust phylogenetic framework is essential for producing a useful and informative classification for this large, economically and ecologically important insect group. Previous work that estimated a phylogeny for the microgastrines from 2300 nucleotides from three genes (16S, 28S and COI) and 53 morphological characters found a tree with low bootstrap support for many branches (Mardulyn and Whitfield, 1999; Whitfield et al., 2002). The poorly supported branches in the microgastrine phylogenies are mainly short internal branches (Mardulyn and Whitfield, 1999; Whitfield et al., 2002). It was proposed that the short branches might have arisen from a rapid radiation as the microgastrines colonised new lepidopteran host species (Mardulyn and Whitfield, 1999), which themselves may have been diversifying in the early Tertiary (Grimaldi, 1999; Whitfield, 2002). Support for the rapid radiation of microgastrines was bolstered by the fact that the same branches were estimated to be short from multiple data sources. However, it was also acknowledged that the poorly supported short branches may have been due to insufficient data or the use of genes with rates of divergence that are inappropriate for the levels of divergence between the taxa (Mardulyn and Whitfield, 1999).
Here we present analyses of data from two mitochondrial and five nuclear genes, including the genetic data (16S, 28S and COI) and 53 morphological characters analysed by Whitfield et al. (2002). These analyses show that completely robustly supported phylogenies for Microgastrinae are unlikely to be estimated from genetic data alone. We use parametric and nonparametric bootstrapping of simulated datasets to estimate how much data would be required to resolve the phylogeny with every branch having nonparametric bootstrap values greater than 50%. The simulations show that unless an impractically large amount of molecular data is obtained, the use of morphological characters may be necessary to produce a completely robustly supported phylogeny that can be used to examine cophylogenetic relationships between microgastrine wasps and polydnaviruses.
Wasps were stored in 100% ethanol at 4°C until genomic DNA could be extracted. Specimens were identified by JBW to genus, and to species where possible, using morphological characters and often also host data. Taxa from which sequences were obtained are listed in Table 1. Because we had few sequences from Apanteles canarsiae we pooled sequences for A. canarsiae with A. galleriae and the resulting "chimera" is labelled Apanteles sp. in the phylogenies. Whole wasps were macerated using mini mortars and pestles and the DNA extracted using Qiagen DNeasy tissue extraction kits. Polymerase chain reactions (PCR) were carried out with an Eppendorf Mastercycler thermocycler. Each PCR contained 2.5 µL of HotMaster buffer (Eppendorf), 1.2 µL of dNTPs (8 mM), 2.5 µL of each primer (2.5 µM), 0.125 µL HotMaster Taq (5 units/µL, Eppendorf), 0.8 µL of DNA and 15.375 µL water.
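The reaction recipe above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that "each primer" means exactly two primers (forward and reverse) and that the listed components make up the whole reaction; the component labels and the helper `final_concentration` are our own names, not part of the published protocol.

```python
from decimal import Decimal

# Volumes in microlitres, as listed in the text. Treating "each primer" as
# exactly two primers (forward and reverse) is an assumption.
REACTION_UL = {
    "HotMaster buffer": Decimal("2.5"),
    "dNTPs (8 mM stock)": Decimal("1.2"),
    "forward primer (2.5 uM stock)": Decimal("2.5"),
    "reverse primer (2.5 uM stock)": Decimal("2.5"),
    "HotMaster Taq (5 units/uL)": Decimal("0.125"),
    "template DNA": Decimal("0.8"),
    "water": Decimal("15.375"),
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule C_final = C_stock * V_added / V_total (same units as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(REACTION_UL.values())

# Final concentrations implied by the stock concentrations given in the text.
primer_um = final_concentration(Decimal("2.5"), Decimal("2.5"), total_ul)
dntp_mm = final_concentration(Decimal("8"), Decimal("1.2"), total_ul)
taq_units = Decimal("0.125") * Decimal("5")  # volume (uL) x activity (units/uL)

print(f"total reaction volume: {total_ul} uL")
print(f"each primer: {primer_um} uM, dNTPs: {dntp_mm} mM, Taq: {taq_units} units")
```

Under these assumptions the components sum to a 25 µL reaction, which is a useful sanity check that the micro prefixes lost in the extracted text have been restored to the right quantities.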
PCRs consisted of an initial denaturing step of 94°C for 2 min, followed by 35 cycles of 94°C for 20 s, 20 s at the temperatures listed in Table 2, 65°C for 40 to 60 s depending on the size of the target region, and a final step of 65°C for 5 min. Primer sequences are listed in Table 2. A negative control was incorporated in each amplification round using water rather than DNA. PCR products were purified using Qiagen QIAquick kits. Sequencing was carried out on an ABI 3730 capillary sequencer.
The three genes, 16S, COI and 28S, originally used for this group were selected for their broad use among different groups of insects, ease of amplification across all taxa and because they provide resolution at several phylogenetic levels. 16S has been used to resolve intra-family relationships in Hymenoptera (Whitfield and Cameron, 1998). The nuclear gene 28S (including the D2 and D3 expansion loops) has provided a strong signal for intermediate and moderately deep levels in the phylogeny (Belshaw et al., 1998; Belshaw and Quicke, 1997; Cameron and Mardulyn, 2001; Cameron and Williams, 2003; Mardulyn and Whitfield, 1999; Michel-Salzat and Whitfield, 2004; Whitfield, 2002; Whitfield et al., 2002; Wiegmann et al., 2003) and retains at least some signal at the species level. COI has been found to saturate quickly at the third position while remaining quite "conserved" at the first two positions due to a small number of sites free to vary (Mardulyn and Whitfield, 1999). Thus, it has proven highly useful at lower levels to detect species boundaries (Hebert et al., 2003a, 2003b, 2004) but has tended to fail at higher levels, especially in divergence time estimation studies (e.g., Whitfield, 2002). We added sequences from four nuclear genes to the dataset of Whitfield et al. (2002). Arginine kinase has been used to resolve bee relationships at species and tribal level (Danforth et al., 2005; Kawakita et al., 2003).
The nuclear gene EF1α has been used extensively to resolve lepidopteran relationships at intermediate phylogenetic levels (Cho et al., 1995; Friedlander et al., 1998; Mitchell et al., 1997, 2000, 2006; Wiegmann et al., 2000) and has been used recently in a number of studies on Hymenoptera (Cameron, 2003; Danforth et al., 2004; Danforth and Ji, 1998; Kawakita et al., 2003; Leys et al., 2002). The gene occurs in at least two divergent copies in most Hymenoptera (originally reported by Danforth and Ji, 1998), but these copies are easily separated by PCR once taxon-specific primers are developed. We used primers that amplify the F2 copy in bumble bees (Kawakita et al., 2003). Long wavelength rhodopsin (opsin) has been used to resolve relationships among bees (Mardulyn and Cameron, 1999), and is especially useful for intermediate levels of phylogeny (from species up to intergeneric and tribal levels; Cameron and Mardulyn, 2001; Cameron and Mardulyn, 2003; Cameron and Williams, 2003; Danforth et al., 2004; Kawakita et al., 2003; Michel-Salzat and Whitfield, 2004), despite early reservations (Ascher et al., 2001). Wingless is less variable than many mtDNA genes, but more variable than most of the other nuclear protein-coding genes we sequenced in this study. Thus wingless tends to be useful at the generic level rather than at higher hierarchical levels (Brower and DeSalle, 1998).
Sequences were aligned with Clustal X (Thompson et al., 1997). Alignment of COI, EF1α, and wingless sequences was straightforward as there were few insertions or deletions. The alignment of arginine kinase and opsin sequences was straightforward once an intron in each gene had been removed. There were several variable-length regions of 16S and 28S where it was difficult to assign homology. Regions of 16S and 28S that could not be aligned unambiguously were omitted from the analysis.
We tested for incongruence between genes using the incongruence length difference test (Farris et al., 1994, 1995) implemented in PAUP*4.0b10 (Swofford, 2002) as the partition homogeneity test. We conducted 100 replicates and compared genes in both a pairwise manner and each gene to the rest of the combined sequence data with that gene excluded. Parsimony uninformative characters were removed before each test. We also tested the data for stationarity (equal nucleotide proportions between taxa) using the χ² test in PAUP*. We used PAUP* to conduct maximum parsimony (MP), maximum likelihood (ML) and LogDet phylogenetic analyses. LogDet is a distance based method that is less affected than MP when taxa differ in their base frequencies (Lockhart et al., 1994). The Akaike Information Criterion as implemented in ModelTest 3.06 (Posada, 2000; Posada and Crandall, 1998) was used to select the model and estimate model parameters (GTR + gamma + proportion of invariable sites (Rodríguez et al., 1990; Tavaré, 1986; Yang et al., 1994); base frequencies A = 0.3166, C = 0.1589, G = 0.1819; rate matrix AC = 1.7463, AG = 11.0123, AT = 8.9781, CG = 2.1939, CT = 14.8349; gamma shape parameter α = 0.6963; proportion of invariable sites = 0.3614) from all seven genes combined for the ML analysis. MrBayes 3.1 (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003) was used to generate Bayesian estimates of microgastrine phylogeny. We used a mixed model approach with eight partitions corresponding to the morphological characters and the seven gene regions. The models used for each of the seven genes were GTR (Tavaré, 1986) plus a proportion of invariable sites plus gamma (Rodríguez et al., 1990; Yang et al., 1994). MrBayes estimated the model parameters from the data using one cold and three heated Markov chains. The Markov chain Monte Carlo run length was 2,000,000 generations and we sampled the chain every 100 generations.
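To make the ModelTest estimates reported for the ML analysis more concrete, the following sketch assembles a GTR instantaneous rate matrix from them. It is an illustration rather than the authors' procedure. Two values are assumptions not stated in the text: the frequency of T is taken as the remainder after A, C and G, and the G-T exchangeability is fixed at 1.0 as the reference rate. Among-site rate variation (the gamma shape and the proportion of invariable sites) is deliberately left out.

```python
BASES = "ACGT"

# Base frequencies from the text; T is ASSUMED to be the remainder.
FREQS = {"A": 0.3166, "C": 0.1589, "G": 0.1819}
FREQS["T"] = 1.0 - sum(FREQS.values())

# Exchangeabilities from the text; G-T = 1.0 is ASSUMED to be the reference rate.
RATES = {
    ("A", "C"): 1.7463, ("A", "G"): 11.0123, ("A", "T"): 8.9781,
    ("C", "G"): 2.1939, ("C", "T"): 14.8349, ("G", "T"): 1.0,
}


def build_q(freqs, rates):
    """Return the GTR rate matrix Q with q_ij = r_ij * pi_j for i != j,
    rows summing to zero, scaled to one expected substitution per unit time."""
    q = {i: {j: 0.0 for j in BASES} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i != j:
                r = rates[(i, j)] if (i, j) in rates else rates[(j, i)]
                q[i][j] = r * freqs[j]
        q[i][i] = -sum(q[i][j] for j in BASES if j != i)
    mean_rate = -sum(freqs[i] * q[i][i] for i in BASES)
    return {i: {j: q[i][j] / mean_rate for j in BASES} for i in BASES}


Q = build_q(FREQS, RATES)

for i in BASES:
    print(i, "  ".join(f"{Q[i][j]: .4f}" for j in BASES))
```

The large A-G and C-T exchangeabilities relative to the transversions show up directly as the largest off-diagonal entries, which is the usual transition bias.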
We discarded the first 5000 samples as burn-in and thus estimated our phylogeny and posterior probabilities from a consensus of the last 15,000 sampled trees.
To compare the branch lengths of branches with bootstrap support greater than 50% to branches with less than 50% support, we reduced our data set to the 27 taxa for which we had data for all seven genes and estimated the phylogeny for the 27 taxa under the MP criterion. The MP analysis found three most parsimonious trees. We then loaded one of the three most parsimonious trees into PAUP* as a constraint tree and used MP to estimate the branch lengths of the constraint tree for individual genes. Branches of the constraint tree were categorised as having bootstrap support of either >50% or <50%, and the branch lengths for each gene were recorded and compared using a Student's t test in Systat 9 (SPSS, 1998).
Pseudoreplicate datasets one and a half, two, three, four, five and 10 times the size of our dataset were constructed from the aligned data by altering the number of characters re-sampled in the nonparametric bootstrap command of PAUP*. These pseudoreplicate datasets were then analysed under the MP criterion and bootstrapped to estimate the amount of data that would be required to estimate phylogenies with all nodes having bootstrap support greater than 50%. Pseudoreplicate datasets one and a half, two, three, four, five and 10 times the size of our dataset were also generated using a parametric approach with Seq-Gen (Rambaut and Grassly, 1997) from the ML model estimated from the original data by ModelTest 3.06. Nonparametric bootstrap values were then obtained with PAUP* from the MP trees estimated from the datasets produced by Seq-Gen. This approach assumes that the data added will have similar properties to the data already obtained. This seems a valid assumption given that the genes we sequenced cover a range of evolutionary rates.
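The nonparametric enlargement step described above amounts to sampling alignment columns with replacement until the pseudoreplicate is the desired multiple of the original length. The sketch below shows that idea on a toy alignment. It is not the PAUP* implementation: the function name `resample_characters`, the dictionary representation of the alignment and the toy sequences are all hypothetical.

```python
import random


def resample_characters(alignment, factor, rng):
    """Build a pseudoreplicate alignment `factor` times the original length
    by drawing columns (characters) with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    n_chars = len(next(iter(alignment.values())))
    n_draw = round(factor * n_chars)
    # The same column indices are used for every taxon, so each sampled
    # column stays intact (gaps and missing data included).
    columns = [rng.randrange(n_chars) for _ in range(n_draw)]
    return {
        taxon: "".join(seq[c] for c in columns)
        for taxon, seq in alignment.items()
    }


# Toy alignment (hypothetical data) with a gap and a missing base.
toy = {
    "taxon_A": "ACGTAC-T",
    "taxon_B": "ACGTTCGT",
    "taxon_C": "ATGTACG?",
}

enlarged = resample_characters(toy, 1.5, random.Random(1))
for taxon, seq in enlarged.items():
    print(taxon, seq)
```

Because whole columns are copied, gaps and missing characters in the real data are carried into the enlarged datasets, which is why this approach yields "messier" pseudoreplicates than sequences simulated from a fitted model.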
To assess the effect of altering the number of taxa, we randomly deleted taxa from our actual dataset to give datasets containing 15, 20, 25, 30, 35 and 40 taxa. We then obtained nonparametric bootstrap values for the branches from 100 replicates using MP.
We added 1248 nucleotides to the previously published dataset and analysed a total of 3031 nucleotides (including gaps). We used primers that bound to a more conserved region of COI and thus reduced the previously published COI sequences (Mardulyn and Whitfield, 1999) from 1235 nucleotides to 419 nucleotides that were homologous with our sequences. Levels of variation between species and genera for the seven genes differed, with 28S, arginine kinase, EF1α, opsin and wingless diverging more slowly than 16S and COI, which quickly saturated at the generic level (Fig. 1). The ILD test found 17 of the 21 pairwise comparisons of the genes were significantly incongruent (P ≤ 0.01). The four comparisons that were not significantly incongruent were 28S to arginine kinase, EF1α and wingless, and EF1α to 16S. Six of the seven comparisons of individual genes to the rest of the combined molecular data (with each individual gene excluded) revealed significant differences (P = 0.01). The exception was arginine kinase, which was not significantly heterogeneous with the combined data (P = 0.6). The χ² test of sequence stationarity in PAUP* found there was significant heterogeneity in nucleotide proportions. A nonsignificant result was obtained when the third positions of codons in protein coding genes were excluded. A maximum parsimony analysis of all seven genes for all taxa found three equally parsimonious trees of length 6861 (consistency index, excluding uninformative characters = 0.30; retention index = 0.44) from 3031 characters, of which 1494 were constant and 1207 were parsimony informative. The strict consensus tree of the three most parsimonious trees is shown in Fig. 2.
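The counts reported for the ILD comparisons and the character matrix can be reproduced with simple bookkeeping. The sketch below enumerates the pairwise gene comparisons and derives the number of variable but parsimony-uninformative characters; the short gene labels are our own abbreviations, and the derived count of 330 is implied by the reported figures rather than stated in the text.

```python
from itertools import combinations

GENES = ["16S", "28S", "COI", "ArgK", "EF1a", "opsin", "wingless"]

# Every unordered pair of the seven genes is one pairwise ILD comparison.
pairs = list(combinations(GENES, 2))

# The four comparisons reported as not significantly incongruent.
congruent = {
    frozenset(("28S", "ArgK")),
    frozenset(("28S", "EF1a")),
    frozenset(("28S", "wingless")),
    frozenset(("EF1a", "16S")),
}
incongruent = [p for p in pairs if frozenset(p) not in congruent]

# Character breakdown of the combined matrix, from the reported totals.
TOTAL_CHARS = 3031
CONSTANT = 1494
INFORMATIVE = 1207
variable_uninformative = TOTAL_CHARS - CONSTANT - INFORMATIVE

print(f"pairwise comparisons: {len(pairs)}")
print(f"significantly incongruent: {len(incongruent)}")
print(f"variable but uninformative characters: {variable_uninformative}")
```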
An MP analysis found a single most parsimonious tree when third positions were excluded (result not shown) that was broadly congruent with the MP tree estimated from all positions (Fig. 2). Excluding third positions did not appreciably alter bootstrap support values obtained with MP, as 15 branches still had bootstrap support of less than 50%. Deleting third positions altered relationships found by MP within more recently diverged clades, but most deep and mid level relationships were not altered. The exception was that Choeras was placed as sister to Sendaphne and Promicrogaster, rather than with Fornicia and Deuterixys, when third positions were excluded, a more reasonable result based on morphology.
Fig. 1. Average uncorrected pairwise distances between microgastrine species and genera, and braconid subfamilies for the seven genes sequenced. Arginine kinase (Argk), elongation factor 1 alpha (EF1α), opsin and wingless are the genes added to the original dataset of Mardulyn and Whitfield (1999) and Whitfield (2002).
Both MP and Bayesian methods (Figs. 2 and 3) supported Microgastrinae as a monophyletic group. Both methods found broadly similar relationships. However, the placement of Fornicia differed greatly depending on the method used. We were unable to get sequences from three genes for Fornicia, and it is possible that missing data are causing the two methods to differ in their placement of Fornicia. The numbers of branches with high or low levels of support differed slightly for the phylogenies estimated using MP and Bayes. Maximum parsimony had 14 branches with bootstrap support of less than 50%, whereas the Bayesian tree had 10 branches with posterior probability values of less than 0.9. There was no obvious trend for branches to have higher Bayesian posterior probabilities than MP bootstrap support values.
Some branches were supported with posterior probabilities higher than 0.9 but had low bootstrap support, while some branches that had posterior probabilities much less than 0.9 received high bootstrap support values. For example, grouping Hypomicrogaster ecdytolophae and Parapanteles sp. as sisters received a bootstrap value of 79% but had a posterior probability of 0.56. It must of course be kept in mind that in these comparisons, both the branch support measure and the optimality criterion for tree estimation differ. The LogDet analysis found a single tree (result not shown) that was broadly congruent with both the MP and Bayesian trees. Nine nodes had bootstrap support of less than 50% (similar to the other methods). Nodes in the LogDet tree that differed from the MP and Bayesian trees did not have high levels of bootstrap support. Branches with bootstrap support of <50% in phylogenies obtained from the individual genes were significantly shorter than branches with bootstrap support >50% (mean bootstrap <50% = 5.5, SE = 0.7; mean bootstrap >50% = 10.5, SE = 0.7; Student's t = 5.21, df = 127.3, P < 0.001) in the phylogeny (Fig. 4) estimated for the 27 taxa for which we had sequences for all seven genes. The nonparametric approach using PAUP* to resample more characters from the original dataset found that the number of branches in the phylogeny with less than 50% bootstrap support reduced reasonably quickly until the dataset contained approximately 12,000 nucleotides (Fig. 5). After 12,000 nucleotides the rate of reduction of poorly supported branches decreased, and even with ten times more data than we have obtained five nodes still had bootstrap support of less than 50%. The parametric approach using Seq-Gen found that all branches of the phylogeny would have greater than 50% bootstrap support at around 12,000 nucleotides (Fig. 5). However, the nonparametric approach is more likely to be realistic, simulating normal "messy" data.
Reducing the number of taxa in the dataset had little effect on the proportion of branches in the phylogeny with less than 50% bootstrap support (Fig. 6). Between 50% and 71% of the branches had bootstrap support of less than 50% depending on the number of taxa in the data set. The Bayesian analysis of 53 morphological and 3031 molecular characters reduced the number of nodes with posterior probabilities of less than 0.9 from nine branches to three (Fig. 7). The phylogeny estimated from the Bayesian analysis of the seven genes alone differed from the phylogeny estimated from seven genes and 53 morphological characters in only two places. Fornicia moved from being part of a clade with Hypomicrogaster and Parapanteles (Fig. 3) in the phylogeny estimated from the genetic data alone, to being a sister to Venanides, Glyptapanteles and Cotesia (Fig. 7). Dolichogenidea and Pholetesor moved to being a sister to Hypomicrogaster, Promicrogaster, Parapanteles and Sendaphne in the combined genetic and morphological dataset. The placements of Fornicia and Pholetesor both had posterior probabilities of less than 0.9 in the phylogeny estimated from genes alone and only the branch placing Fornicia improved markedly in support (from 0.63 to 0.98).
We identified genes that diverge more slowly than those already sequenced for Microgastrinae and their addition resulted in a more robust phylogeny for Microgastrinae, as assessed by higher nonparametric bootstrap proportions and posterior probabilities. However, despite substantially increasing the size of the dataset, we still did not obtain a completely robustly supported phylogeny using several methods of phylogeny estimation. Indeed, our nonparametric bootstrap simulations suggest we are unlikely to get a completely supported phylogeny from DNA alone.
Although bootstrap support is not a direct measure of phylogenetic accuracy, most authors at least implicitly interpret the figures as rough measures of statistical support for a node (Buckley and Cunningham, 2002). Support for the branches in our phylogeny could not be increased markedly by the method of analysis alone. The LogDet method of analysis is less affected by nonstationary data than is MP (Lockhart et al., 1994). However, using the LogDet transformation also did not produce a totally robustly supported tree. Excluding character sets did not produce a totally supported tree. For example, when an MP analysis of all data was compared to an MP analysis of the data with the third codon positions excluded, similar numbers of nodes had bootstrap support greater than 50%. A marked improvement in support for our phylogeny was seen, however, when we added morphological characters and used a mixed model Bayesian analysis. Several other studies on diverse groups such as weevils (Marvaldi et al., 2002), molluscs (Collin, 2003) and feather mites (Dabert et al., 2001) have noted improvements in resolution and statistical support when analyses are conducted on combined morphological and DNA data. Dabert et al. (2001) also noted that molecular data alone tended to produce trees with better resolution and support at the terminal tips, and poor resolution and poor support deeper in the phylogeny, whereas the opposite (i.e., better resolution and support deeper in the phylogeny and poor resolution and support at the tips) occurred for phylogenies estimated from morphological data alone. Short branches deep in a phylogeny are notoriously difficult to resolve. These branches will invariably have poor support as they are short due to a paucity of shared derived characters (Mardulyn and Whitfield, 1999) and we found that branches with less than 50% bootstrap support were significantly shorter than branches with greater than 50% support.
In the case of the microgastrines, the short branches deeper in the phylogeny may be associated with the radiation of the Ditrysia (which contains 98% of present day lepidopteran species) that occurred from approximately 60 to 70 mya in the late Cretaceous or early Cenozoic (Grimaldi, 1999). This radiation coincides approximately with the origin of the microgastrine group calculated by Whitfield (2002). The ideal gene to resolve short deep branches should have a fast rate of divergence at the time of the radiation but then the gene's rate of divergence needs to slow so that informative changes are not obscured by multiple substitutions at each site (Donoghue and Sanderson, 1992; Fishbein et al., 2001). It has been suggested that morphological characters may be more likely than nucleotide substitutions to undergo rapid changes followed by a slowing in the rate of change due to stabilising selection (Fishbein et al., 2001). Thus morphological characters may be a practical method to resolve short deep branches. Also, phenotypic variation is likely influenced by variation in many genes and morphological characters may be a cost effective way to indirectly increase the size of genetic datasets and improve levels of support for phylogenetic estimates. The increase in support for our phylogeny when morphological characters were added to the analysis was not due solely to changing from using bootstrap support of an MP analysis to using the posterior probabilities under a Bayesian approach. The Bayesian phylogeny estimated from the genetic data alone had lower levels of support than the Bayesian tree estimated from both genetic and morphological characters. It has been suggested that Bayesian posterior probabilities tend to be higher than MP bootstrap proportions for the same groups (Erixon et al., 2003).
However, it has also been suggested that bootstrap values may be a conservative estimate of support for clades when support for clades is strong (Huelsenbeck and Rannala, 2004; Rannala and Yang, 1996) and it is likely that posterior probabilities give a better estimate of support than bootstrap proportions, especially when the most complex Bayesian models are used (Huelsenbeck and Rannala, 2004). We would have liked to compare bootstrap values of phylogenies generated by ML, rather than MP, to the posterior probabilities of the Bayesian trees but this was impractical due to computational constraints. The simulations using the bootstrap function in PAUP* to produce larger character sets from our data suggested that a phylogeny with every branch having more than 50% bootstrap support in an MP analysis was unlikely to be obtained without considerably more data. The approach using Seq-Gen was more optimistic, suggesting complete support was possible with around 12,000 nucleotides. However, the parametric method of simulating datasets produces data without gaps or missing data and thus produces a "perfect" dataset that is almost certainly unobtainable in reality. For example, we estimated a phylogeny with only one branch with less than 50% bootstrap support from a dataset simulated with Seq-Gen of the same size as our actual dataset. This compares to 14 branches with less than 50% bootstrap support in the phylogeny estimated from the actual data. The PAUP* based approach produces datasets that are more realistic, as the simulated datasets have gaps in the sequences and missing data, and is therefore more likely to give a better estimate of the data required to estimate a totally bootstrap supported phylogeny. The P-values of <0.05 obtained for many of the ILD tests do not necessarily mean that conflict between individual genes has reduced bootstrap support for nodes in our phylogeny.
There is significant disagreement as to what level of significance should be used to reject partition homogeneity. For example, Cunningham (1997) suggested a critical value of less than 0.01 should be used. There is also controversy over whether significant heterogeneity should preclude combining data derived from different genes for a phylogenetic analysis. Yoder et al. (2001) examined the effect of changing character weighting and/or data combinations on the phylogenetics of slow lorises and found that correct results were poorly supported and even incorrect results were obtained when character weighting and/or data combinations were altered to reduce incongruence (as assessed by the ILD). Likewise, Sullivan (1996) found that combining two heterogeneous datasets produced a better estimate of deer mouse and grasshopper mouse phylogenies than did either gene alone. Barker and Lutzoni (2002) also found from simulations that the ILD test was a relatively poor predictor of the effect of combining datasets on phylogenetic accuracy. Dolphin et al. (2000) found that when rate differences between the two matrices being assessed reach a certain level, the ILD test could suggest significant heterogeneity despite the two matrices having similar underlying topologies. It seems likely that the marked differences in the divergence rates of the genes we analysed have generated the significant ILD test results. We suggest that we reduced the adverse effects of data heterogeneity by using complex evolutionary models for each partition of our data in a Bayesian analysis. Inappropriate gene choice has been suggested as a reason why it has been difficult to obtain a robust microgastrine phylogeny (Mardulyn and Whitfield, 1999).
Several of the genes we used have been used in other studies of braconid phylogenetics (for example, Belshaw et al., 1998; Mardulyn and Whitfield, 1999; Michel-Salzat and Whitfield, 2004; Min et al., 2005; Whitfield et al., 2002), and in other studies of hymenopteran phylogenetics (Cameron and Mardulyn, 2001; Danforth et al., 2004; Kawakita et al., 2003; Sanchis et al., 2001; Weiblen, 2004). Given that our choice of genes covered deep, medium and shallow divergences and that these genes have been used successfully to estimate robust phylogenies for an enormous variety of taxa, we do not conclude that inappropriate gene choice has caused the poor support for some of the nodes. The contribution to the phylogenies from all the COI data is likely subject to long branch attraction (Felsenstein, 1978; Hendy and Penny, 1989) as this gene has the highest uncorrected pairwise divergences of the seven genes between microgastrine species and yet it has only the fifth highest levels of divergence between the braconid subfamilies. However, using model-based methods for estimating the phylogenies has probably lessened the effect of long branch attraction. Missing data are also unlikely to have resulted in poor bootstrap support. A phylogeny estimated from those taxa for which we obtained sequences for all seven genes had six branches (of 51 total) with bootstrap support <50% in an MP analysis. The poorly supported branches were significantly shorter than branches with >50% bootstrap support. An examination of the contribution of individual genes to the branch length of the poorly supported branches found that there were few changes in all seven genes for these short branches, suggesting that a rapid radiation (i.e., truly short branches) has indeed occurred. Insufficient data has also been suggested as a reason for poorly supported phylogenies. Rokas et al.
(2003) suggested that 20 genes would be required to obtain mean bootstrap values of 95% with a 95% confidence interval for seven species of Saccharomyces yeasts. However, as few as eight genes would give mean bootstrap values of 95% with a 95% confidence interval if nonstationary genes (genes that have markedly shifted nucleotide frequencies for some taxa) were excluded from the Saccharomyces analysis (Collins et al., 2005). Deleting the third positions of codons from our data resulted in the genes becoming stationary. However, while deletion of third positions did not markedly alter the relationships estimated, it also did not markedly increase bootstrap support for our MP trees.

There has been debate over whether it is better to add taxa or characters to an analysis to improve accuracy, given that resources are always finite. Rosenberg and Kumar (2001) suggested that longer sequences, rather than more taxa, will improve the accuracy of the phylogeny estimated. However, it was also argued that increasing the number of taxa equally reduces error in phylogenetic estimations (Pollock et al., 2002). The improvement in phylogenetic accuracy is in part determined by the length of sequences already obtained and by the levels of divergence between the taxa (Hillis et al., 2003). For example, if there are several long branches in the phylogeny, effort may be better expended on sequencing taxa that break up the long branches rather than adding more characters (Graybeal, 1998). In the microgastrine case, our simulations showed that neither adding taxa nor genetic data would increase bootstrap support for the short branches in the phylogeny.

We found strong support (100% of bootstrap replicates from the MP analysis of the seven genes, and posterior probabilities of 0.99 for both the molecular data and the molecular data plus morphology) for monophyly of the microgastroid complex, sensu Wharton (1993), Mason (1994), and Whitfield (1997a).
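As a rough illustration of the third-position deletion and stationarity check mentioned above, the following Python sketch removes third codon positions from an in-frame alignment and computes a chi-square statistic for homogeneity of base composition across taxa. The toy sequences, the function names, and the choice of a chi-square homogeneity statistic are assumptions made for illustration; they are not the data or necessarily the procedure used in this study.

```python
from collections import Counter

BASES = "ACGT"


def drop_third_positions(seq):
    """Return seq without every third base (assumes the sequence starts in frame)."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def base_counts(seq):
    """Count A, C, G and T in a sequence, ignoring gaps and ambiguity codes."""
    counts = Counter(seq.upper())
    return [counts[b] for b in BASES]


def composition_chi_square(alignment):
    """Chi-square statistic and degrees of freedom for homogeneity of
    base composition across taxa (rows = taxa, columns = bases)."""
    table = [base_counts(seq) for seq in alignment.values()]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(BASES) - 1)
    return stat, dof


# Hypothetical in-frame toy alignment (not data from this study).
toy = {
    "taxon_A": "ATGGCAACTTTA",
    "taxon_B": "ATGGCTACATTG",
    "taxon_C": "ATGGCGACGCTC",
}

# Same alignment with third codon positions removed.
toy_no_third = {name: drop_third_positions(seq) for name, seq in toy.items()}

stat_all, dof_all = composition_chi_square(toy)
stat_12, dof_12 = composition_chi_square(toy_no_third)
print(f"all positions:     chi2 = {stat_all:.3f}, df = {dof_all}")
print(f"positions 1 and 2: chi2 = {stat_12:.3f}, df = {dof_12}")
```

A smaller statistic after third positions are removed would be consistent with those positions carrying most of the compositional heterogeneity; a p-value would require comparing the statistic with a chi-square distribution on the reported degrees of freedom.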
Our finding of monophyly for Microgastrinae agrees with an earlier analysis of 16S data that also found the microgastroid complex to be monophyletic, although with equivocal bootstrap support (Whitfield, 1997b). An analysis of portions of 16S and 28S rDNA also found strong bootstrap support for monophyly of the microgastroids. Our Bayesian analysis of the combined molecular and morphological characters found a relationship for the braconid subfamilies of (Cheloninae, (Mendesellinae, (Microgastrinae, (Cardiochilinae, Miracinae)))). Belshaw et al. (1998) found a similar relationship for the microgastrine subfamilies, excluding Mendesellinae, from an analysis of a portion of the 28S region. These results, however, conflict with a MP analysis of portions of the 16S and 28S genes and 11 morphological characters, and a Bayesian analysis of portions of the 16S, 18S and 28S regions and 96 morphological characters (Min et al., 2005), that found a relationship of (Adelinae + Cheloninae, (Miracinae, (Microgastrinae, Cardiochilinae))). We intend a more extensive examination of subfamily relationships within the microgastroid complex in the near future.

It is difficult to compare our estimate of relationships within Microgastrinae to other studies, as generally different or fewer microgastrine species were sampled in those studies (e.g., Belshaw et al., 1998). We found that Microplitis and Snellenius represent an early diverging lineage of microgastrines. A phylogeny estimated from portions of 16S and 28S also found Microplitis to be basal, as did a phylogeny estimated from a portion of 28S (Mardulyn and Whitfield, 1999).
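The conflict between the subfamily relationships reported here and those found in earlier analyses can be made explicit by listing the clades that each topology implies for the subfamilies they share. The sketch below is a minimal illustration in Python; the nested-tuple encoding, the function names, and the treatment of "Adelinae + Cheloninae" as a two-tip clade are assumptions made for illustration and are not part of the original analysis.

```python
def tips(tree):
    """Return the set of tip labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def prune(tree, keep):
    """Restrict a tree to the tips in `keep`, collapsing single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


def clades(tree):
    """Return the non-trivial clades of a rooted tree as a set of frozensets."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        found.add(tips(node))
        for child in node:
            walk(child)

    walk(tree)
    found.discard(tips(tree))  # the clade containing every tip is uninformative
    return found


# Topology from the combined Bayesian analysis described in the text.
this_study = ("Cheloninae",
              ("Mendesellinae",
               ("Microgastrinae", ("Cardiochilinae", "Miracinae"))))

# Topology reported by the earlier analyses cited in the text
# (assumption: "Adelinae + Cheloninae" is encoded as a two-tip clade).
earlier = (("Adelinae", "Cheloninae"),
           ("Miracinae", ("Microgastrinae", "Cardiochilinae")))

# Compare only the subfamilies present in both trees.
shared = tips(this_study) & tips(earlier)
clades_here = clades(prune(this_study, shared))
clades_earlier = clades(prune(earlier, shared))

print("clades in both:      ", [sorted(c) for c in clades_here & clades_earlier])
print("only in this study:  ", [sorted(c) for c in clades_here - clades_earlier])
print("only in earlier work:", [sorted(c) for c in clades_earlier - clades_here])
```

Restricted to the four shared subfamilies, both topologies group Microgastrinae, Cardiochilinae and Miracinae together; they differ only in whether Miracinae or Microgastrinae is placed closest to Cardiochilinae.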
Phylogenetic utility of the major opsin in bees (Hymenoptera: Apoidea): a reassessment
The utility of the incongruence length difference test
A phylogenetic reconstruction of the Ichneumonoidea (Hymenoptera) based on the D2 variable region of 28S ribosomal
A molecular phylogeny of the Aphidiinae (Hymenoptera: Braconidae)
Extending phylogenetic studies of coevolution: secondary Brooks parsimony analysis, parasites, and the Great Apes
Patterns of mitochondrial versus nuclear DNA sequence divergence among nymphalid butterflies: the utility of wingless as a source of characters for phylogenetic inference
The effect of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support
Data from the elongation factor-1alpha gene corroborate the phylogenetic pattern from other genes, revealing common ancestry of bumble bees and stingless bees (Hymenoptera: Apinae)
Multiple molecular data sets suggest independent origins of highly eusocial behavior in bees (Hymenoptera: Apinae)
The major opsin gene is useful for inferring higher level phylogenetic relationships of the corbiculate bees (Hymenoptera: Apidae): congruence of molecular and morphological data
A highly conserved nuclear gene for low-level phylogenetics: elongation factor1-alpha recovers morphology-based tree for heliothine moths
The utility of morphological characters in gastropod phylogenetics: an example from the Calyptraeidae
Choosing the best genes for the job: the case for stationary genes in genome-scale phylogenies
Can three incongruence tests predict when data should be combined?
Phylogeny of feather mite subfamily Avenzoariinae (Acari: Analgoidea: Avenzoariidae) inferred from combined analysis of molecular and morphological data
Single-copy nuclear genes recover Cretaceous-age divergences in bees
Elongation factor-1alpha occurs as two copies in bees: implications for phylogenetic analysis of EF-1alpha sequences in insects
How do insect nuclear ribosomal genes compare to protein-coding genes in phylogenetic utility and nucleotide substitution patterns?
Separate versus combined analysis of phylogenetic evidence
Noise and incongruence: interpreting results of the incongruence length difference test
The suitability of molecular and morphological evidence in reconstructing plant phylogeny
Molecular Systematics of Plants
Molecular phylogeny of the insect order Hymenoptera: Apocritan relationships
Phylogenetic relationships among the microgastroid wasps (Hymenoptera: Braconidae): combined analysis of 16S and 28S rDNA genes and morphological data
Evolutionary relationships among the Braconidae (Hymenoptera: Ichneumonoidea) inferred from partial 16S rDNA sequences
Reliability of bayesian posterior probabilities and bootstrap frequencies on phylogenetics
Testing the significance of congruence
Constructing a significance test for incongruence
Cases in which parsimony or compatibility methods will be positively misleading
Phylogeny of Saxifragales (Angiosperms, Eudicots): analysis of a rapid ancient radiation
Two nuclear genes yield concordant relationships within Attacini (Lepidoptera: Saturniidae)
DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates
Is it better to add taxa or characters to a difficult phylogenetic problem?
The co-radiations of pollinating insects and angiosperms in the Cretaceous
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Biological identification through DNA barcodes
Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator
Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species
A framework for the quantitative study of evolutionary trees
Is sparse taxon sampling a problem for phylogenetic inference?
Frequentist properties of bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
MR BAYES: Bayesian inference of phylogeny
Evolution and phylogenetic utility of alignment gaps within intron sequences of three nuclear genes in bumble bees (Bombus)
Timing the ancestor of the HIV-1 pandemic strain
Molecular phylogeny and historical biogeography of the large carpenter bees, genus Xylocopa (Hymenoptera: Apidae)
Recovering evolutionary trees under a more realistic model of sequence evolution
The major opsin in bees (Insecta: Hymenoptera): A promising nuclear gene for higher level phylogenetics
Phylogenetic signal in COI, 16S, and 28S genes for inferring relationships among genera of Microgastrinae (Hymenoptera; Braconidae): evidence of a high diversification rate in this group of parasitoids
Molecular and morphological phylogenetics of weevils (Coleoptera, Curculionoidea): do niche shifts accompany diversification?
Phylogeny of the orchid bees (Hymenoptera: Apinae: Euglossini): DNA and morphology yield equivalent patterns
Preliminary evolutionary relationships within the parasitoid wasp genus Cotesia (Hymenoptera: Braconidae: Microgastrinae): combined analysis of four genes
Phylogenetic relationships among the Braconidae (Hymenoptera: Ichneumonoidea) inferred from partial 16S rDNA, 28S rDNA D2, 18S rDNA gene sequences and morphological characters
Phylogenetic utility of elongation factor-1alpha in Noctuoidea (Insecta: Lepidoptera): the limits of synonymous substitution
More taxa or more characters revisited: combining data from nuclear protein-encoding genes for phylogenetic analyses of Noctuoidea (Insecta: Lepidoptera)
Systematics and evolution of the cutworm moths (Lepidoptera: Noctuidae): evidence from two protein-coding nuclear genes
TreeMap, version 1.0b
Increased taxon sampling is advantageous for phylogenetic inference
Testing Models of Evolution-Modeltest Version 3.06, version 3.06
Modeltest: testing the model of DNA substitution
Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees
Phylogeny and the origin of HIV-1
Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference
The general stochastic model of nucleotide substitution
Genome-scale approaches to resolving incongruence in molecular phylogenies
MrBayes 3: Bayesian phylogenetic inference under mixed models
Incomplete taxon sampling is not a problem for phylogenetic inference
The phylogenetic analysis of variable-length sequence data: elongation factor-1alpha introns in European populations of the parasitoid wasp genus Pauesia (Hymenoptera: Braconidae: Aphidiinae)
Systat 9.01 Statistics
Combining data with different distributions of among-site variation
Phylogenetic inference
PAUP* Phylogenetic analysis using parsimony (*and other methods), version 4
Some probabilistic and statistical problems on the analysis of DNA sequences
The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools
Correlated evolution in fig pollination
Bionomics of the Braconidae
Molecular and morphological data suggest a single origin of the polydnaviruses among braconid wasps
Identification Manual to the New World Genera of the Family Braconidae (Hymenoptera)
Estimating the age of the polydnavirus/braconid wasp symbiosis
Virus or not? Phylogenetics of polydnaviruses and their wasp carriers
Hierarchical analysis of variation in the 16S rRNA gene among Hymenoptera
Phylogenetic relationships among microgastrine braconid wasp genera based on data from the 16S, COI and 28S genes and morphology. Syst. Entomol
Mendesellinae, a new subfamily of braconid wasps (Hymenoptera, Braconidae) with a review of relationships within the microgastroid assemblage
Nuclear genes resolve Mesozoic-aged divergences in the insect order Lepidoptera
Time flies, a new molecular time scale for brachyceran fly evolution without a clock
Contaminated polio vaccine theory refuted
Comparison of models for nucleotide substitution used in maximum-likelihood phylogenetic estimation
Failure of the ILD to determine data combinability for slow loris phylogeny

This study was primarily supported by National Science Foundation grant DEB 0316566 to J.B.W. It also includes work supported by two other grants to J.B.W.: USDA NRI Grant 9501893 and NSF IBN 0344829. We wish to jointly thank those who contributed specimens for our analysis, especially Andy Austin, Martin Hauser, Mike Irwin, Dan Janzen, Angelica Penteado-Dias, and Mike Sharkey. We also wish to thank Andy Austin, Sydney Cameron, Bryan Danforth, Mark Dowton, and Heather Hines for helpful discussion concerning the genes used here, and John Huelsenbeck, Jack Sullivan and Dave Swofford for suggestions concerning the simulations. Two anonymous reviewers contributed helpful comments.