Causal Foundations of Evolutionary Genetics

Jun Otsuka*

(forthcoming in The British Journal for the Philosophy of Science)

Abstract

The causal nature of evolution is one of the central topics in the philosophy of biology. It has been debated whether the equations used in evolutionary genetics point to causal processes or merely describe phenomenological patterns. To address this question, the present paper builds well-defined causal models that underlie standard equations in evolutionary genetics. These models are based on minimal and biologically plausible hypotheses about selection and reproduction, and generate statistics to predict evolutionary changes. The causal reconstruction of the evolutionary principles shows adaptive evolution to be a genuine causal process, in which fitness and selection are both causes of evolution.

* Philosophy Department, University of California, Davis. Email: junotk@gmail.com

1 Introduction

The causal nature of evolution is one of the central topics in the philosophy of biology. Is evolution a causal process? Are selection and fitness causes of population change? Recent discussions in the literature have given conflicting answers to these questions. One reason for the skeptical response is that most if not all principles of evolutionary theory, such as the Price equation or Fisher’s fundamental theorem of natural selection, are expressed in purely statistical terms such as variances or covariances. This does not preclude, however, the possibility that such statistics are themselves products of certain causal structures. The past few decades have seen the development of a mathematical framework for studying causal relations and the probability distributions generated by them (Spirtes et al., 2000; Pearl, 2000).
Using this framework, the present paper shows how the core part of one of the “fundamental” principles of genetics, the Price equation, is generated from causal relationships underlying selection and reproduction. The derivation provides causal foundations for the standard equations used in evolutionary genetics and establishes adaptive evolution as a bona fide causal process.

The structure of the paper is as follows. After a brief description of the problem in Section 2, basic notions of causal models are introduced in Section 3. Using this machinery, Section 4 investigates causal models underlying the Price equation, the breeder’s equation in quantitative genetics, and the one-locus population genetics system. These models are causal in the sense that they give a reliable prediction of the consequence of an ideal intervention on some of their variables, and evolutionary to the extent that they can be used to describe or predict changes in population frequencies induced by selection. The explicit definition of causal structures brings several philosophical upshots (Section 5). The first is that selection must be understood as a causal process (a trait affecting fitness) rather than just an outcome (statistical dependence between the trait and fitness). Second, the causal models give clear-cut answers to the entangled questions as to whether fitness and/or selection cause population change. Applying the formal intervention calculus (Spirtes et al., 2000; Pearl, 2000) to the causal models obtained in Section 4, I will show that there are some interventions on selection and fitness that affect evolutionary outcomes. This result gives an unequivocal proof that fitness and selection are both genuine causes of evolution.

One disclaimer before proceeding: in this paper I focus only on evolution by natural selection.
Hence any reference to the word “evolution” in this paper should be understood as shorthand for adaptive evolution. Also, all distributions should be taken as population distributions in an infinite population. This, of course, eliminates drift. The causal basis for drift may be discussed on another occasion.

2 The Philosophical Puzzle

Modern mathematical theories of evolution study changes in populations by state transition functions (Lewontin, 1974; Lloyd, 1988). Such functions describe a temporal change in certain features of a population based on its current state, thereby allowing a prediction of its evolutionary trajectory.¹ Taking the simplest example, in the two-allele system with no dominance where the fitnesses of genotypes $AA$, $Aa$ and $aa$ are respectively $1 + s$, $1 + s/2$ and $1$, the change in the population frequency $p$ of allele $A$ between two consecutive generations is given by:

$$\Delta p = \frac{sp(1 - p)}{2(sp + 1)}. \tag{1}$$

Alternatively, we may be interested in the evolutionary change of a phenotype rather than of a gene. In quantitative genetics, the between-generation change in the phenotypic mean $\bar{Z}$ is given by the breeder’s equation:

$$\Delta \bar{Z} = h^2 S \tag{2}$$

where the selection differential $S$ measures the shift in the phenotypic mean of the parental generation by selection (but before reproduction) and the heritability $h^2$ estimates the fidelity of reproduction, i.e. how much of the change induced by selection is passed on to the next generation. Both equations make a quantitative prediction of the change in population features (genetic frequencies or the phenotypic mean) based on characteristics of the current population. The mathematical analysis using such state transition functions integrated Mendelian inheritance and Darwin’s theory of natural selection, and has formed the core part of evolutionary theory after the Modern Synthesis.

¹ Throughout this paper I assume generations to be discrete and non-overlapping.
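To make the notion of a state transition function concrete, the following minimal Python sketch iterates Eqn. 1 and cross-checks it against a direct calculation from genotype fitnesses. The function names are illustrative, and the cross-check assumes Hardy-Weinberg genotype proportions before selection (random mating), which the text at this point does not state explicitly.

```python
def delta_p(p: float, s: float) -> float:
    """Change in the frequency of allele A over one generation (Eqn. 1)."""
    return s * p * (1.0 - p) / (2.0 * (s * p + 1.0))


def delta_p_from_genotypes(p: float, s: float) -> float:
    """Same change, computed from genotype frequencies and fitnesses.

    Assumption (not from the text): Hardy-Weinberg proportions
    p^2, 2pq, q^2 for AA, Aa, aa before selection.
    """
    q = 1.0 - p
    w_AA, w_Aa, w_aa = 1.0 + s, 1.0 + s / 2.0, 1.0
    w_bar = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # Frequency of A among the gametes contributed after selection.
    p_next = (p * p * w_AA + p * q * w_Aa) / w_bar
    return p_next - p


def trajectory(p0: float, s: float, generations: int) -> list:
    """Apply the transition function repeatedly, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(freqs[-1] + delta_p(freqs[-1], s))
    return freqs


if __name__ == "__main__":
    for t, p in enumerate(trajectory(p0=0.1, s=0.2, generations=5)):
        print(t, round(p, 4))
```

The sketch only reproduces the predictive use of the transition function; it is silent on whether the function reflects a causal process, which is the question taken up next.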
Although successful state transition functions may give correct or at least acceptable predictions of a future population under certain conditions, this does not automatically mean they represent causal processes that generate evolutionary dynamics. For, obviously, transition functions may represent non-causal as well as causal patterns. One may write down sufficiently predictive transition functions for planetary motion based on the Ptolemaic system, but such equations would not reflect the correct causal mechanisms governing astronomical bodies. Predictive state transition functions can be purely phenomenological, rather than causal.

Hence philosophers have long been concerned with whether these evolutionary equations reflect any causal process, and if so, how. Elliott Sober (1984), for example, argues that the causal contents of evolutionary equations are furnished by the “source laws” which estimate or measure parameters and variables in the equations by empirical means such as functional analysis. Since such estimates reflect causal facts surrounding organisms, Sober claims the whole of evolutionary theory is causal and empirical even if its core equations, which he calls the “consequence laws”, may be purely mathematical.

This view has been vigorously challenged recently by a group of philosophers called statisticalists (Matthen and Ariew, 2002, 2005, 2009; Matthen, 2010; Walsh et al., 2002; Walsh, 2007, 2010), who argue that the mathematical quantities appearing in evolutionary equations, especially fitness, cannot be estimated by Sober’s source laws or any other causal analysis of a similar kind, but only by census. Decoupling evolutionary equations from underlying mechanisms, they insist that modern genetics gives a purely phenomenological description of the “statistical trends” of a population, or in their words “explains the changes in the statistical structure of a population by appeal to statistical phenomena” (Walsh et al., 2002, p. 471).
Parallel to this issue (whether evolutionary theory describes a causal process or not) is the question as to whether its key concepts, most notably fitness and selection, identify a cause of evolutionary change. Millstein (2006), for example, argues that selection is a population-level cause of evolution, while Matthen and Ariew (2009) and Lewens (2010) deny any causal power to selection. Walsh (2007, 2010) claims that fitness is causally inert since it fails to satisfy certain criteria of causality, although his argument was criticized by Otsuka et al. (2011). Sober (2013), finally, argues that fitness itself does not cause population change, but its variance does. The reasoning that drove these authors to different conclusions varies, but there is one thing in common: they all base (some portion of) their argument on the manipulationist notion of causation (Woodward, 2003). That is, both parties seem to agree that fitness and selection are (not) causal to the extent that manipulating them does (not) affect the evolutionary response.

But how do we know the consequence of such manipulations? To examine this, most (but not all) of these authors resort to conceptual analysis: what really are fitness and selection? What do they stand for? With a certain interpretation of these concepts, they go on to argue that a supposed manipulation should (or should not) affect evolution, and thus that the concepts must be (un)causal.

This, to say the least, is a very peculiar move. In the manipulationist framework, the outcome of a possible intervention is not determined by the meaning of variables, but by their relationships. This is clear in Woodward’s own example (Woodward, 2003, p. 197). It is known that the period $T$ of a simple pendulum is related to its length $l$ by

$$T = 2\pi \sqrt{l/g} \tag{3}$$

where $g$ is the acceleration due to gravity.
It seems natural to read this equation causally to the effect that the right hand side (the length and gravity) determines or “causes” the left hand side (the period), until we find that Eqn. 3 is mathematically equivalent to the following:

$$l = \frac{T^2 g}{4\pi^2}. \tag{4}$$

Now it is obviously absurd to claim, based on this new equation, that the period causes the length of the pendulum. What determines the (il)legitimacy of the causal reading of each equation? Surely not the meaning of the variables, for they stay the same between the two equations.

The moral of this simple example is that conceptual analysis is utterly irrelevant to the investigation of the causal nature of a concept under the manipulationist framework. Sober (2013), in his recent paper, resorts to the breeder’s equation (Eqn. 2) to make his case that the fitness variation, measured by the term $S$ on the right hand side, affects the response to selection on the left hand side. But this begs the question. How do we know the breeder’s equation captures the flow of causal influence correctly? Why isn’t it like Eqn. 4, rather than Eqn. 3?² We never know, until the causal relationships among the variables are explicitly specified beforehand. Such relations are usually given by a causal model, which also determines a set of equations that allow for causal reading (Spirtes et al., 2000; Pearl, 2000). It is the causal model given by Newtonian mechanics that authorizes the causal reading of Eqn. 3, but not of Eqn. 4. In the same way, if we want to know the effect of intervening on some variable in an evolutionary formula, we need the causal model underlying that equation. Hence the second contention (whether fitness or selection causes evolution) hinges on the first: is there a causal model that underlies evolutionary transition functions?

² In the breeder’s equation the evolutionary response cannot precede fitness variance, and hence cannot be its cause. But they may be effects of a common cause, or, as statisticalists may argue, the relation may be “purely statistical”. In fact, we will later see that, pace Sober, a manipulation of the fitness variance does not affect the expected evolutionary response predicted by the breeder’s equation.

The answer is yes. This paper describes causal models that (1) include the relevant variables such as genetic, phenotypic and environmental factors, (2) generate the statistics necessary to describe and predict evolutionary trajectories, and (3) can be used to predict the consequence of a possible intervention on a subset of these variables. The derived models will reveal the causal foundations underlying the evolutionary transition functions described above (Eqns. 1 and 2), and help us determine whether fitness and/or selection can be properly regarded as a cause of evolutionary change.

3 Causal models

A causal model employs a graphical structure to represent causal relationships among variables (Spirtes et al., 2000; Pearl, 2000). A causal graph $G = (\mathbf{V}, \mathbf{E})$ is a pair comprising a set of variables $\mathbf{V}$ (or nodes) and a set of edges $\mathbf{E} \subseteq \mathbf{V} \times \mathbf{V}$. An edge $(X, Y) \in \mathbf{E}$, or more graphically $X \rightarrow Y$, represents a direct causal relation from $X$ to $Y$, where $X$ is called a parent of $Y$ and $Y$ a child of $X$. A path between $X$ and $Y$ is any chain of edges between $X$ and $Y$, where a path can follow each arrow either in its own direction or in the reverse direction. If every arrow in a path between $X$ and $Y$ points towards $Y$, it is called a directed path from $X$ to $Y$, and then $X$ is a cause of $Y$ and $Y$ is an effect of $X$. A bidirected edge $X \leftrightarrow Y$ represents unmodeled association between $X$ and $Y$, i.e. association not accounted for by any causal path in the graph. In this paper such edges are allowed only between variables having no causes/parents (called exogenous).
It is assumed that the value of each variable $V_j \in \mathbf{V}$ is determined by its direct causes or parents $\mathrm{PA}(V_j)$ such that

$$V_j = f_j(\mathrm{PA}(V_j)). \tag{5}$$

This is called the structural equation for $V_j$. When the relationship is linear, as assumed throughout this paper, Eqn. 5 can be expressed as

$$V_j = \sum_{V_i \in \mathrm{PA}(V_j)} \beta_{ji} V_i \tag{6}$$

with a set of linear coefficients $\beta$ (also called path coefficients). Hence in a linear causal model each directed edge in the graph is associated with one linear coefficient.

A causal graph $G$ over $\mathbf{V}$, a set of corresponding structural equations $\mathbf{F}$, and a probability distribution $P$ over exogenous variables in $\mathbf{V}$ uniquely determine the joint distribution over $\mathbf{V}$. The induced distribution satisfies useful properties such as the Markov Condition (Pearl, 1988). Another feature of interest is the trek rule (Wright, 1921). A trek between variables $X$ and $Y$ is a path between them that does not contain a collider, where two arrows on the path collide at one variable (i.e. $\rightarrow V \leftarrow$). A trek is equivalent to a pair of directed paths that share the same source or whose separate sources are connected by a bidirected edge.³ Thus in Fig. 1, $X_1 \rightarrow X_3 \rightarrow X_5$, $X_3 \leftarrow X_2 \rightarrow X_4$, and $X_5 \leftarrow X_3 \leftarrow X_1 \leftrightarrow X_2 \rightarrow X_4 \rightarrow X_5$ are examples of treks, whereas $X_3 \rightarrow X_5 \leftarrow X_4$ is not. For each trek, we can calculate its trek coefficient by multiplying the (co)variance of its source(s) and all the linear coefficients on the edges constituting the trek. The trek rule states that the covariance of two variables equals the sum of trek coefficients over all the treks connecting them. That is, if $\mathbf{T}$ is the set of all the treks between $X$ and $Y$ and $\beta_{t_i}$ is the linear coefficient of the $i$th edge in $t \in \mathbf{T}$,

$$\mathrm{Cov}(X, Y) = \sum_{t \in \mathbf{T}} \sigma_t \prod_{i \in t} \beta_{t_i} \tag{7}$$

where $\sigma_t$ is the (co)variance of the source(s) of trek $t$. To take some examples from Fig. 1, $\mathrm{Cov}(X_1, X_4) = \mathrm{Cov}(X_1, X_2)c$, $\mathrm{Cov}(X_3, X_4) = \mathrm{Cov}(X_1, X_2)ac + \mathrm{Var}(X_2)bc$, and $\mathrm{Cov}(X_3, X_5) = \mathrm{Var}(X_3)d + \mathrm{Cov}(X_1, X_2)ace + \mathrm{Var}(X_2)bce$.
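The three covariances just listed can be checked numerically. The sketch below is only an illustration under stated assumptions: the edge set ($X_1 \rightarrow X_3$ with coefficient $a$, $X_2 \rightarrow X_3$ with $b$, $X_2 \rightarrow X_4$ with $c$, $X_3 \rightarrow X_5$ with $d$, $X_4 \rightarrow X_5$ with $e$, and $X_1 \leftrightarrow X_2$) is inferred from the trek examples in the text rather than read off the figure, and all numerical values are arbitrary. Each variable is written as a linear combination of the exogenous $X_1$ and $X_2$, so covariances follow from ordinary covariance algebra and can be compared with the trek-rule sums.

```python
def covariance(u, v, sigma):
    """Cov(u.X, v.X) for coefficient vectors u, v over exogenous (X1, X2)."""
    return sum(u[i] * sigma[i][j] * v[j] for i in range(2) for j in range(2))


# Hypothetical path coefficients and exogenous (co)variances.
a, b, c, d, e = 0.8, -0.5, 1.2, 0.7, 0.3
var1, var2, cov12 = 2.0, 1.5, 0.6
sigma = [[var1, cov12], [cov12, var2]]

# Every variable expressed through the exogenous X1 and X2.
x1 = (1.0, 0.0)
x2 = (0.0, 1.0)
x3 = (a, b)                                            # X3 = a*X1 + b*X2
x4 = (0.0, c)                                          # X4 = c*X2
x5 = (d * x3[0] + e * x4[0], d * x3[1] + e * x4[1])    # X5 = d*X3 + e*X4

# Covariances obtained by direct covariance algebra.
direct_14 = covariance(x1, x4, sigma)
direct_34 = covariance(x3, x4, sigma)
direct_35 = covariance(x3, x5, sigma)

# Covariances obtained by summing trek coefficients, as in the text.
var3 = covariance(x3, x3, sigma)
trek_14 = cov12 * c
trek_34 = cov12 * a * c + var2 * b * c
trek_35 = var3 * d + cov12 * a * c * e + var2 * b * c * e

if __name__ == "__main__":
    print(direct_14, trek_14)
    print(direct_34, trek_34)
    print(direct_35, trek_35)
```

Working with coefficient vectors rather than simulated samples keeps the comparison exact up to floating-point rounding.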
Causal models give a formal tool to study the relationships between a causal structure and the probability distribution generated by it. In what follows I make use of this machinery to reveal causal processes underlying evolutionary transition equations.

³ Note that one of the pair may be empty. Thus one directed path from $X$ to $Y$ counts as a trek between them.

Figure 1: A causal graph with path coefficients.

4 Causal Foundations of Evolutionary Genetics

In evolutionary genetics, it is well known that a change in moments (e.g. mean) of a population from one generation to the next is completely described by the Price equation (Robertson, 1966; Price, 1970). Let $Z$ be the trait of interest, $W$ be the (Darwinian) fitness as defined by the number of offspring, and $Z'$ be the average phenotype of offspring of each individual. Thus if George, who reproduces asexually, has four children with phenotypic values of 1, 1, 1, and 2, then $w_{\text{George}} = 4$ and $z'_{\text{George}} = (1 + 1 + 1 + 2)/4 = 1.25$. The Price equation gives the difference $\Delta \bar{Z}$ of average phenotypic values between the parental generation and the offspring generation by

$$\Delta \bar{Z} = \frac{1}{\bar{W}} \mathrm{Cov}(W, Z') + \bar{Z}' - \bar{Z}, \tag{8}$$

where the upper bars denote the averages. The first term of the equation is the covariance of the fitness and the averaged offspring phenotype, and thus reflects both selection and reproduction. The second and third terms, in contrast, compare the phenotype of parents and the averaged phenotypic value of their offspring, regardless of the fitness of the parents. A difference in these terms, therefore, implies a transmission bias⁴. In this paper I will assume transmission bias to be absent, in which case the evolutionary dynamics is described just by the covariance in the first term.
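As a quick numerical illustration of Eqn. 8, the following sketch compares the Price decomposition with the change in mean phenotype computed directly from the offspring generation. Everything except George's fitness of 4 and averaged offspring phenotype of 1.25 is hypothetical: the parental phenotypes, the two other parents, and the function names are assumptions made for the example. Covariances are population covariances (divided by the number of parents).

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance (normalised by N, not N - 1)."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def price_delta(w, z, z_off):
    """Right hand side of Eqn. 8: Cov(W, Z')/W_bar + mean(Z') - mean(Z)."""
    return cov(w, z_off) / mean(w) + mean(z_off) - mean(z)


def direct_delta(w, z, z_off):
    """Offspring-generation mean minus parental-generation mean.

    Each parent contributes w offspring whose average phenotype is z_off,
    so the offspring mean is a fitness-weighted mean of z_off.
    """
    offspring_mean = sum(wi * zi for wi, zi in zip(w, z_off)) / sum(w)
    return offspring_mean - mean(z)


# Three asexual parents; the first is George from the text.
w = [4, 2, 1]              # number of offspring per parent
z = [1.0, 1.5, 2.0]        # parental phenotypes (hypothetical)
z_off = [1.25, 1.5, 2.0]   # averaged offspring phenotype per parent

if __name__ == "__main__":
    print(price_delta(w, z, z_off), direct_delta(w, z, z_off))
```

The two quantities agree for any input with positive total fitness, which reflects the fact that the equation is an identity rather than an empirical hypothesis.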
Before moving on, let us emphasize that the variables used in the Price equation, including fitness $W$, are all properties of an individual (or of a pair of individuals for diploid organisms, as we will see later). Alternatively the concept of fitness is sometimes used to refer to a property of a type, i.e. a phenotype, genotype, haplotype or allele. Such type-level fitnesses are called marginal fitness and represented by the conditional distribution $P(W \mid T)$ or the average thereof for a given type $T$. But what we denote by “fitness” in this paper is primarily a property of an individual.⁵ The Price equation thus gives population change $\Delta \bar{Z}$ as a statistical function of these individual variables.

A remarkable feature of the Price equation is that it is a mathematical theorem and thus holds true of any evolving population satisfying its assumptions. This has motivated the view that the core evolutionary principles (“consequence laws”) are a priori truths (e.g. Sober, 1993, p. 72) and at the same time generated the philosophical puzzle as to how such non-empirical theorems can represent causal processes in the real world. Indeed, Price’s theorem does not tell us how the variables in the equation affect each other or what will happen if one of them is altered by some external means. In other words, it does not explain why evolution takes place in that way.

⁴ This “transmission bias”, however, may include selection at lower levels (such as genic selection of “selfish genes”) and effects of non-genetic inheritance (such as maternal effects).

⁵ Some statisticalists (e.g. Pigliucci and Kaplan, 2006) seem to interpret fitness to be a population level feature, i.e. as a random variable or the expectation thereof defined over a set of populations, but no such use of the concept is warranted by the evolutionary literature. See De Jong (1994) for a discussion of various concepts of fitness.
As we saw in Section 2, answering such why-questions requires a suitable causal model beyond a mere mathematical equation.

The goal of this section is to find such causal foundations for evolutionary change represented by the Price equation. Our basic strategy is as follows: build a causal model (i.e. specify a causal graph and structural equations) representing evolutionary processes and then show that such a model indeed generates the Price covariance, $\mathrm{Cov}(W, Z')$. This will give us an evolutionary state transition function with a definite causal basis which describes evolutionary changes in terms of parameters of the hypothesized causal model. I will show this for phenotypic evolution first, and then consider the population genetics model.

4.1 Univariate Quantitative Genetics Model

The Causal Graph

Evolution by natural selection involves two phases: selection and reproduction. Let us take reproduction first. Reproduction is a process that connects a parent’s phenotype to its offspring’s phenotype through genes or epigenetic materials. Hence a causal model for reproduction must specify how a phenotype is formed out of such factors and also how they are transmitted to offspring. Obviously there are many possible reproductive structures, but here we confine ourselves to a very simple case of purely Mendelian inheritance, which is enough to show the causal basis of the standard evolutionary equations.

Suppose there are $n$ different types of alleles segregating in a population. Then the genotype of an organism is characterized by a set (vector) of $n$ variables $\mathbf{X} := (X_1, X_2, \ldots, X_n)$, where $X_i \in \mathbf{X}$ is the gene content, i.e. the count of copies of the $i$th allele type in an individual (Lynch and Walsh, 1998, p. 65). For a haploid organism the value $x_i$ of $X_i$ for any $i$ can be either 0 or 1, while for diploids $x_i \in \{0, 1, 2\}$. Phenotype $Z$ is made out of these genes as well as of an environmental factor denoted by $E_Z$. We thus have edges drawn from $E_Z$ and each of $\mathbf{X}$ to $Z$.
A parent’s genotype also affects its offspring’s genotype by contributing to gene content. The transmission of genes is represented by the causal edge from parental to offspring gene contents, $X_i \rightarrow X'_i$, for each $i$. Finally we assume the same developmental process for offspring phenotype, $Z'$ being caused by $\mathbf{X}'$ and $E'_Z$.

The above construction gives the causal graph for reproduction as shown in Fig. 2 (the path coefficients in the graph will be explained shortly). The graph, however, makes further assumptions not mentioned above. First, bidirected edges between parental genes represent genetic correlations already present in the population. Such correlations can arise in two ways: gene counts of the same locus are necessarily correlated for they must sum up to the ploidy of the organism, while inter-locus correlations, often called linkage disequilibrium or gametic phase disequilibrium, arise due to various factors including previous selection, drift or non-random mating.⁶ In contrast, it is assumed that environment $E_Z$ is not correlated with any genes, as implied by the absence of bidirected edges between $X \in \mathbf{X}$ and $E_Z$. The graph also presupposes that parental environment $E_Z$ has no causal influence on, or correlation with, offspring environment $E'_Z$. Finally, transmission is strictly Mendelian in the sense that each gene is inherited independently without affecting the transmission process of other genes; this excludes segregation distortion.

⁶ These empirical covariances can be seen as a dependency due to a selection bias.

Figure 2: Linear (additive) decomposition of the covariance between parental and offspring traits. Bold arrows illustrate an example of a trek connecting $Z$ and $Z'$, whose contribution to the covariance is $\alpha_i \,\mathrm{Cov}(x_i, x_j) \cdot \frac{1}{2} \cdot \alpha_j$. See the main text for the explanation of the variables.
Selection refers to the process in which parental phenotypes lead to differential reproductive success. We say trait $Z$ is selected if and only if it, along with an environmental factor denoted by $E_W$, causally affects fitness $W$ (e.g. Glymour, 2011).⁷ This means that in order for $Z$ to be selected there must be some intervention on $Z$, at least as a possibility, that changes fitness $W$. Selection can thus be represented in the above causal graph (Fig. 2) by adding edges $Z \rightarrow W$ and $E_W \rightarrow W$.

A slight complication arises, however, for diploid organisms, which do not produce offspring by themselves but only as a pair. It follows that the proper unit for analyzing diploid evolution is a pair of a female and a male. For a given pair let us denote the phenotypes of the female and the male by $Z_F$ and $Z_M$, and their gene contents by $\mathbf{X}_F$ and $\mathbf{X}_M$, respectively. Fitness $W$ of the pair is the number of offspring produced by that pair, and has $Z_F$ and $Z_M$ (and $E_W$) as its direct causes. Likewise, $Z'$ and $\mathbf{X}'$ are redefined to be the average phenotypic value and the gene contents of offspring of that pair.

With these modifications, the overall causal graph that incorporates selection and reproduction of diploid organisms should look like Fig. 3, where each branch in the middle (the mother and the father) is an abbreviated representation of the reproductive causal model represented in Fig. 2. As before, this graph introduces additional assumptions. First, the environmental factors affecting fitness ($E_{W_M}$ and $E_{W_F}$) must be uncorrelated with phenotypes or genotypes, as implied by the absence of edges between them. Another assumption is random mating: nonrandom mating would introduce bidirected edges between corresponding elements in $\mathbf{X}_M$ and $\mathbf{X}_F$ in the graph.

⁷ I will discuss in Section 5.1 why selection must be defined as a causal process, not just a statistical dependence.
Figure 3: The causal graph showing the connections between parental fitness $W$ and offspring trait $Z'$. Boldface letters abbreviate multiple nodes/coefficients, e.g. $\mathbf{X}_M := (X_{M1}, X_{M2}, \ldots, X_{Mn})$, and bold arrows multiple arrows. Each side represents one of the parents, and has the structure shown in Fig. 2. Environmental factors for phenotypes $E_{Z_M}$, $E_{Z_F}$ and $E'_Z$ are omitted from the graph.

Although minimal and even simplistic, Figs. 2 and 3 submit a biologically plausible hypothesis of the causal structure underlying selection and reproduction. They specify causal links among relevant variables in such a way that we can identify which part of the system would be affected if some of them are manipulated by an external means. It is not yet clear, however, how this causal structure over individual variables relates to the population change as described by evolutionary transition functions. To see this relation we need to “quantify” each causal relationship appearing in the graph, the task to which we now turn.

Structural Equations

Compared to the causal graph, there is much less, if any, a priori reason for determining a functional form for a given causal relationship. How a cause determines its effects should depend on empirical facts about their nature and circumstance. As a first approximation, however, I assume in this paper that every cause affects its effects in a linear fashion. This means that selection is purely directional and there is no dominance or epistasis. Nonlinear structural equations are possible in theory but complicate the mathematical derivation provided below and, most importantly for our purpose, are outside the scope of the standard equations of evolutionary genetics mentioned above, for which providing causal structures is the primary goal of this paper.
In linear/directional selection, a unit change in the phenotype affects the fitness by the amount specified by the path coefficient $\beta$, so that

$$W = \beta Z + E_W. \tag{9}$$

We further assume the selection pressures to be the same for male and female, i.e. $Z_F$ and $Z_M$ have the same path coefficient with respect to $W$.

For a structural equation for the genotype-phenotype mapping we assume each allele $X_i$ linearly affects the phenotype by coefficient $\alpha_i$, i.e.

$$Z = \sum_{X_i \in \mathbf{X}} \alpha_i X_i + E_Z = \boldsymbol{\alpha} \mathbf{X}^T + E_Z \tag{10}$$

where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ and $T$ denotes matrix transpose. $\alpha_i$ is called the additive effect and measures the change in the phenotype induced by adding one copy of the $i$th allele, say from $X_i = 0$ to $X_i = 1$ (Fisher, 1930, p. 31). It is assumed that additive effects are the same for all individuals in the population. Hence Eqn. 10 characterizes the genotype-phenotype mapping of females, males, and offspring.

Under diploid Mendelian inheritance (e.g. no segregation distortion), every gene in a parent has a one-half chance of being inherited. Hence the structural equation representing the genetic transmission is simply

$$\mathbf{X}' = \frac{1}{2} \mathbf{X}. \tag{11}$$

Eqns. 9, 10, and 11 constitute the structural equations corresponding to the causal graphs in Figs. 2 and 3, as indicated by path coefficients on edges. Together with the graphs they tell us how a unit alteration in any variable in the model brings about changes in other parts; that is, they give reliable predictions of effects resulting from a possible intervention. This completes the description of a causal model underlying the univariate Price covariance.

Deriving Evolutionary Transition Functions

The final step employs the trek rule to obtain evolutionary transition functions based on the causal model as defined above. Recall that according to the trek rule the Price covariance $\mathrm{Cov}(W, Z')$ is given by the sum of trek coefficients between $W$ and $Z'$. From Figs.
2 and 3, each trek connecting $W$ and $Z'$ has the form either of $W \leftarrow Z \leftarrow X_i \rightarrow X'_i \rightarrow Z'$ or of $W \leftarrow Z \leftarrow X_i \leftrightarrow X_j \rightarrow X'_j \rightarrow Z'$, the trek coefficient of each being $\beta \alpha_i \,\mathrm{Cov}(X_i, X_j) \cdot \frac{1}{2} \cdot \alpha_j$ for $X_i, X_j \in \mathbf{X}$. Summing all these treks for each side of the two parents yields

$$\begin{aligned}
\Delta \bar{Z} &= \frac{1}{\bar{W}} \mathrm{Cov}(W, Z') \\
&= \frac{2}{\bar{W}} \beta \sum_{X_i \in \mathbf{X}} \sum_{X_j \in \mathbf{X}} \alpha_i \,\mathrm{Cov}(X_i, X_j) \cdot \frac{1}{2} \cdot \alpha_j \\
&= \frac{1}{\bar{W}} \beta \boldsymbol{\alpha} \,\mathrm{Var}(\mathbf{X})\, \boldsymbol{\alpha}^T
\end{aligned} \tag{12}$$

where $\mathrm{Var}(\mathbf{X})$ is the covariance matrix of gene contents and is a function of population genetic frequencies. Hence Eqn. 12 relates phenotypic change to causal parameters ($\beta$ and $\boldsymbol{\alpha}$) as well as a distributional feature of the exogenous variables ($\mathrm{Var}(\mathbf{X})$); in other words, it gives a causal underpinning of evolutionary change.

The same model also reveals the causal basis of the standard formula of quantitative genetics, the breeder’s equation (Eqn. 2). To see this let us first derive the additive genetic variance, $\sigma^2_A$, which is defined as the part of the phenotypic variance due to additive effects of gene contents $\mathbf{X}$. Since the variance is nothing but the covariance of a variable with itself, we can apply the trek rule to calculate this value. Noting that in Fig. 2 all treks connecting $Z$ to itself have the form $Z \leftarrow X_i \leftrightarrow X_j \rightarrow Z$ with the trek coefficient $\alpha_i \,\mathrm{Cov}(X_i, X_j)\, \alpha_j$, the additive genetic variance for $Z$ is

$$\sigma^2_A(Z) = \sum_{X_i \in \mathbf{X}} \sum_{X_j \in \mathbf{X}} \alpha_i \,\mathrm{Cov}(X_i, X_j)\, \alpha_j = \boldsymbol{\alpha} \,\mathrm{Var}(\mathbf{X})\, \boldsymbol{\alpha}^T. \tag{13}$$

Plugging this into Eqn. 12 yields

$$\Delta \bar{Z} = \frac{1}{\bar{W}} \beta \sigma^2_A. \tag{14}$$

From standard regression theory the least squares estimate of the linear coefficient $\beta$ is $\mathrm{Cov}(W, Z)/\mathrm{Var}(Z)$. Letting $\tilde{W} := W/\bar{W}$ denote the relative fitness, we get

$$\Delta \bar{Z} = \frac{1}{\bar{W}} \frac{\mathrm{Cov}(W, Z)}{\mathrm{Var}(Z)} \sigma^2_A = \mathrm{Cov}(\tilde{W}, Z) \frac{\sigma^2_A}{\mathrm{Var}(Z)} = S h^2 \tag{15}$$

where $S := \mathrm{Cov}(\tilde{W}, Z)$ is the selection differential and $h^2 := \sigma^2_A/\mathrm{Var}(Z)$ is the (narrow-sense) heritability. The breeder’s equation (Eqn. 15), therefore, is an estimate of the linear evolutionary response generated by the causal structure in Figs. 2 and 3.
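The derivation of Eqns. 12 to 15 can be traced numerically. The sketch below is an illustration under stated assumptions, not the paper's own computation: it uses two allele types, arbitrary values for $\boldsymbol{\alpha}$, $\beta$, $\mathrm{Var}(\mathbf{X})$ and the environmental variances, an assumed mean fitness $\bar{W}$ (the structural equations fix covariances but not means), and the same $\mathrm{Var}(\mathbf{X})$ for both sexes. Each variable of the diploid model is written as a coefficient vector over the mutually uncorrelated exogenous blocks, and the Price covariance is then compared with Eqn. 12 and with $S h^2$.

```python
NAMES = ["XF1", "XF2", "XM1", "XM2", "EZF", "EZM", "EW", "EZoff"]
INDEX = {name: k for k, name in enumerate(NAMES)}
N = len(NAMES)


def unit(name):
    """Coefficient vector of a single exogenous variable."""
    vec = [0.0] * N
    vec[INDEX[name]] = 1.0
    return vec


def combine(*terms):
    """Linear combination: sum of coef * vector over (coef, vector) pairs."""
    return [sum(coef * vec[k] for coef, vec in terms) for k in range(N)]


def cov(u, v, sigma):
    return sum(u[i] * sigma[i][j] * v[j] for i in range(N) for j in range(N))


# Hypothetical parameters (none of these values come from the paper).
alpha = [0.4, -0.2]                   # additive effects
beta = 0.3                            # selection gradient on Z_F and Z_M
w_bar = 2.0                           # assumed mean fitness of a pair
var_x = [[0.5, -0.1], [-0.1, 0.3]]    # Var(X), identical for both sexes
var_ez, var_ew = 0.25, 1.0

# Exogenous covariance: random mating makes X_F and X_M uncorrelated.
sigma = [[0.0] * N for _ in range(N)]
for i in range(2):
    for j in range(2):
        sigma[i][j] = var_x[i][j]
        sigma[2 + i][2 + j] = var_x[i][j]
for name in ("EZF", "EZM", "EZoff"):
    sigma[INDEX[name]][INDEX[name]] = var_ez
sigma[INDEX["EW"]][INDEX["EW"]] = var_ew

# Structural equations (Eqns. 9-11) as coefficient vectors.
z_f = combine((alpha[0], unit("XF1")), (alpha[1], unit("XF2")), (1.0, unit("EZF")))
z_m = combine((alpha[0], unit("XM1")), (alpha[1], unit("XM2")), (1.0, unit("EZM")))
w = combine((beta, z_f), (beta, z_m), (1.0, unit("EW")))
x_off = [combine((0.5, unit("XF%d" % (i + 1))), (0.5, unit("XM%d" % (i + 1))))
         for i in range(2)]
z_off = combine((alpha[0], x_off[0]), (alpha[1], x_off[1]), (1.0, unit("EZoff")))

# Price covariance term versus Eqn. 12 / Eqn. 14.
response_price = cov(w, z_off, sigma) / w_bar
sigma2_a = sum(alpha[i] * var_x[i][j] * alpha[j] for i in range(2) for j in range(2))
response_eq12 = beta * sigma2_a / w_bar

# Breeder's equation (Eqn. 15): S * h^2, using the female phenotype as Z.
s_diff = cov(w, z_f, sigma) / w_bar
h2 = sigma2_a / cov(z_f, z_f, sigma)
response_breeder = s_diff * h2

if __name__ == "__main__":
    print(response_price, response_eq12, response_breeder)
```

In this setting $\mathrm{Cov}(W, Z_F) = \beta \,\mathrm{Var}(Z)$ holds exactly because the two parents are uncorrelated, so the regression-based $S h^2$ coincides with the causal expression; with a confounder between $Z$ and $W$ the two would come apart, which is the situation discussed in Section 5.1.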
We can thus conclude that the graph and model specified above represent the causal foundation of the standard evolutionary formula in quantitative genetics.⁸

⁸ In the same fashion one can derive the multivariate version of the breeder’s equation, the Lande equation, which plays the central role in today’s quantitative genetics (Lande, 1979).

4.2 One Locus Population Genetics Model

The same method can be used to build the causal model for the simple population genetics model as in Eqn. 1, if one thinks of the genes as a kind of “phenotype”. Let $A$ and $a$ be two alleles segregating at one locus with the allelic frequencies $p$ and $1 - p$, respectively. Gene contents $X_1$ and $X_2$ then are counts of alleles $A$ and $a$ in an organism. Let us define our “phenotype” $Z$ to be the frequency of allele $A$ in one organism. Hence $Z = X_1/2$ and its value can be either 0, 0.5 or 1. Noting that the population frequency of allele $A$ equals $\bar{Z}$, its change is given by the Price equation (Wade, 1985):

$$\Delta p = \Delta \bar{Z} = \frac{1}{\bar{W}} \mathrm{Cov}(W, Z'). \tag{16}$$

Here again we ignore the transmission bias, assuming genes are passed to offspring more or less directly.

Figure 4: The causal graph for the one-locus population genetics system. Non-directed edges represent mathematical relations (change of units). Variable $X_2$ is omitted since the gene content of allele $a$ does not affect the $Z$s.

The causal graph connecting the variables for a pair of organisms is shown in Fig. 4. The non-directed edges in the graph represent the unit conversion between the gene counts ($X$s) and the gene frequencies ($Z$s) in an individual. Since these two variables point to the same thing, the causal flows remain undisrupted and the trek rule is still applicable. In the graph there are only two treks connecting $W$ and $Z'$, i.e. $W \leftarrow Z_M \leftarrow X_{M1} \rightarrow X'_1 \rightarrow Z'$ and $W \leftarrow Z_F \leftarrow X_{F1} \rightarrow X'_1 \rightarrow Z'$. Assuming selection acts on each sex equally, i.e.
$s^F = s^M$, the trek sum is

\[
\frac{1}{\bar{W}} \operatorname{Cov}(W, Z') = \frac{1}{4\bar{W}} s \operatorname{Var}(X_1) = \frac{sp(1 - p)}{2\bar{W}} \tag{17}
\]

where the second equality follows from the fact that the variance of the binomial random variable $X_1$ is $2p(1 - p)$. Under no dominance the mean fitness $\bar{W}$ is $p^2(1 + s) + 2p(1 - p)(1 + s/2) + (1 - p)^2 = sp + 1$, giving

\[
\Delta p = \frac{1}{\bar{W}} \operatorname{Cov}(W, Z') = \frac{sp(1 - p)}{2(sp + 1)} \tag{18}
\]

which accords with Eqn. 1. In general, plugging the regression estimate $\hat{s} = \operatorname{Cov}(W, Z) / \operatorname{Var}(Z)$ into Eqn. 17 yields the standard one-locus population genetics model (Gillespie, 2004, p. 62):

\[
\Delta p = \frac{p(1 - p)\left[p(w_{AA} - w_{Aa}) + (1 - p)(w_{Aa} - w_{aa})\right]}{\bar{W}} \tag{19}
\]

where $w_{AA}$, $w_{Aa}$ and $w_{aa}$ are the fitnesses of individuals having genotypes $AA$, $Aa$ and $aa$, respectively. State transition functions of population genetics can hence be derived from the Price equation and the underlying causal model in the same fashion as in quantitative genetics.

5 Evolution as a Causal Process

The causal decompositions of the Price covariance given above reveal the causal structures underlying the evolutionary state transition functions and hence the evolutionary phenomena they describe. Our causal models satisfy all three desiderata mentioned earlier: they relate relevant genetic, phenotypic and environmental factors, give predictions of evolutionary consequences, and can be used to estimate the effect of possible interventions on a subset of the variables. In addition to providing the causal foundations, the philosophical importance of defining the formal model is twofold. First, it tells us what selection must be in order for it to promote evolution. Second, the explicit definition of the causal model makes it possible to evaluate what happens if some of its factors are altered, i.e. to determine whether fitness and/or selection cause evolution. These points are discussed in turn.
5.1 Selection as a Causal Process

The causal model derived above required a trait to cause fitness, favoring the notion of selection as a causal process (Millstein, 2002, 2006; Stephens, 2004) rather than a mere outcome (Matthen and Ariew, 2009; Matthen, 2010). The outcome interpretation claims that selection is nothing but a statistical fact holding in a population, such as the fitness variance or the fitness-trait covariance. At first sight such a view fits well with the popular accounts of selection, including Richard Lewontin's much cited summary of Darwinian evolution as a necessary consequence of three conditions: phenotypic variation, differential fitness and heritability. Here differential fitness (i.e. selection) means that “different phenotypes have different rates of survival and reproduction in different environments” (Lewontin, 1970, p. 1), or in other words, that the phenotypes are correlated with the fitness.

Our causal model, however, reveals an inadequacy of the purely statistical interpretation of adaptive evolution. To see this, imagine a situation where a trait does not cause fitness but both are affected by some common cause (Fig. 5). Rausher (1992), for example, considers a hypothetical plant population whose foliar alkaloid concentration (phenotype) and seed production (fitness) are affected by the nitrate level of the soil environment (see also Mauricio and Mojonniner, 1997; Morrissey et al., 2010, for similar discussions). The common environmental confounder in such a situation will generate a statistical association between the trait and the fitness, so that Lewontin's criteria are satisfied provided the trait is heritable. Evolutionary response, however, does not ensue, for there is no trek between $W$ and $Z'$ and thus the Price covariance is zero.
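The common-cause scenario can be illustrated with a small simulation. This is a hedged sketch rather than a reconstruction of Rausher's analysis: the linear Gaussian structure, the unit coefficients, the function names and the offspring model (half of one parent's breeding value plus an independent remainder, with an independent offspring environment) are all assumptions made for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def simulate(n=50_000, seed=1):
    """Simulate a population in which E causes both Z and W, but Z does not cause W."""
    rng = random.Random(seed)
    W, Z, Z_off = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)               # parental breeding value (genes)
        e = rng.gauss(0, 1)               # soil nitrate level, the common cause E
        z = x + e                         # alkaloid concentration: Z <- X and Z <- E
        w = 10 + e + rng.gauss(0, 0.5)    # seed production: W <- E only, no Z -> W
        # Offspring breeding value: half of the parent's plus an independent
        # remainder (mate and segregation), scaled so that its variance stays 1.
        x_off = 0.5 * x + rng.gauss(0, 0.75 ** 0.5)
        z_off = x_off + rng.gauss(0, 1)   # offspring grows in an independent environment
        W.append(w)
        Z.append(z)
        Z_off.append(z_off)
    return cov(W, Z), cov(Z, Z_off), cov(W, Z_off)

cov_wz, cov_zz_off, cov_wz_off = simulate()
print(f"Cov(W, Z)  = {cov_wz:.3f}  (trait-fitness association via E)")
print(f"Cov(Z, Z') = {cov_zz_off:.3f}  (the trait is heritable)")
print(f"Cov(W, Z') = {cov_wz_off:.3f}  (Price covariance)")
```

Under these assumptions the population values are $\operatorname{Cov}(W, Z) = 1$, $\operatorname{Cov}(Z, Z') = 0.5$ and $\operatorname{Cov}(W, Z') = 0$, so the sample estimates should show variation, differential fitness and heritability together with a Price covariance near zero.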
This simple example shows why the interpretation of selection as a pure outcome, as well as Lewontin's well-known formulation, is defective.9 A mere statistical fact by itself has no explanatory role in the study of adaptive evolution.

9 Note that the case advanced here is to be distinguished from other criticisms of Lewontin's conditions, such as the exact cancelation of selective force by other pathways (Wimsatt, 1980, 1981; Okasha, 2007) or an incidental trait-fitness correlation in a small population (Brandon, 1990). Lewontin's conditions may fail even in an infinite population undergoing no opposing evolutionary forces.

Figure 5: When the phenotype-fitness association is due only to a common cause, $\operatorname{Cov}(W, Z') = 0$ and no evolutionary response follows. But in such cases Lewontin's three conditions are satisfied and would (falsely) imply a nonzero evolutionary response. Note that the path $W \leftarrow E \rightarrow Z \leftarrow \cdots \rightarrow Z'$ collides at $Z$ and is not a trek. The dashed bidirected arrow represents reproductive pathways.

The importance of distinguishing selection-as-process from its statistical outcome cannot be emphasized too much. In a recent article Sober (2013) correctly argued that differential trait fitness does not entail a selection for $Z$, but wrongly concluded that it does entail a selection for some trait $P$ that correlates with $Z$. It doesn't, since the phenotype-fitness association may be purely spurious, as in the above case. Although a statistical dependence may entail some causal connection,10 this need not be selective and thus may fail to promote adaptive evolution. For that, you need a selection-as-process.

5.2 Causes of Evolutionary Change

Another contention in the statisticalist debate is whether fitness and/or selection can be regarded as a cause of evolutionary change (Millstein, 2006; Stephens, 2004; Otsuka et al., 2011; Sober, 2013) or not (Matthen and Ariew, 2002, 2009; Walsh et al., 2002; Walsh, 2007, 2010).
The causal model defined above provides a clear-cut solution to this entangled debate. Since evolutionary change $\Delta\bar{Z}$ is given by the Price equation, which in turn is underpinned by the causal models discussed above, whether fitness causes evolution can be examined by calculating the effect on $\Delta\bar{Z}$ of an intervention on fitness $W$, using the standard intervention calculus (Spirtes et al., 2000; Pearl, 2000). The post-intervention distribution can be represented by $P(\Delta\bar{Z} \mid do(W = w))$, where $do(\cdot)$ is Pearl's intervention operator. This amounts to forcing every individual in the population to have a certain number of offspring by some external means (e.g. by culling all cubs after the $w$th birth). But we can of course think of partial interventions that affect only some portion of the population. Assuming no individual gets more than one intervention, the result of such an intervention is given by the weighted average

\[
P(\Delta\bar{Z} \mid \Omega) = \sum_{i}^{|\Omega|} \frac{n_i}{N} P(\Delta\bar{Z} \mid do(\omega_i))
\]

where $\Omega := \{do(\omega_1), do(\omega_2), \ldots\}$ is a set of partial interventions, $N$ is the population size and $n_i$ is the number of individuals affected by $do(\omega_i)$. The global intervention is just a special case of such partial interventions where $\Omega$ is a singleton. Here we consider only the global intervention. Our question thus amounts to whether $P(\Delta\bar{Z} \mid do(W = w)) \neq P(\Delta\bar{Z} \mid do(W = w'))$ for some $w \neq w'$.

10 This is a part of the thesis called the Causal Markov condition.

So does an intervention on fitness affect evolution? It depends on the type of intervention. An intervention in a causal model is usually represented as a modification of the graph and/or the structural equations. Hard interventions eliminate all the causal inputs to the target variables and impose a new set of values or distribution by some external force. In the graph of Fig. 3, this amounts to pruning all incoming arrows to fitness $W$. This effectively interrupts all the treks from $W$ to $Z'$, so the Price covariance will be zero, i.e.
no evolutionary response. We can thus conclude that hard interventions on $W$ do not induce evolutionary change. This should not surprise us, for it is just a population-level restatement of Weismann's principle that no epigenetic surgery on parents would affect offspring phenotype. One can easily show that it holds true for all phenotypes under the standard model, i.e. $P(\Delta\bar{Z}_i \mid do(Z_j = z_j)) = P(\Delta\bar{Z}_i \mid do(Z_j = z'_j))$ for all hard interventions on $Z_j$.

From another perspective, however, this may appear puzzling: isn't artificial selection conducted by breeders a mixture of partial hard interventions? And we know that their efforts have considerably improved a number of phenotypes of economic importance, such as the milk yield of cows. In these planned breedings, however, the intervention is a function of the phenotype: the breeder decides how many offspring an animal can have based on its phenotype. This effectively creates a new causal path from $Z$ to $W$, i.e. another selective pressure, which promotes adaptive response. But as long as they are random and exogenous to the system, hard interventions do not affect evolutionary outcomes.

Not all interventions are hard. Soft interventions preserve some of the original causes of the target variables but modify their distribution, usually by adding another cause (Eberhardt, 2007). For example, one may want to know whether students' economic status affects their academic performance. In such a case it would be difficult or even impossible to force every student participating in the experiment to live on a fixed budget. But we may soft-intervene on their economic situation by providing some allowance or scholarship. With respect to fitness, a soft intervention may be carried out through some form of environmental scaffolding (e.g. additional food or provision of a nesting place) which neither correlates with the focal phenotype nor interferes with its effect on fitness. Such an independent additive intervention does not change the Price covariance, but does affect evolutionary responses through the mean fitness $\bar{W}$, the weighting factor in Eqn. 8. If we boost fitness by an additive factor $\alpha$, the post-intervention mean fitness becomes $\bar{W}' = \bar{W} + \alpha$, which results in a slower response to selection. In general, additive soft interventions on fitness conserve the direction but affect the rate of adaptive evolution.

Hence there are some interventions on fitness that cause evolution. But it is important to note that not all interventions, even soft ones, induce population change. For example, if we manipulate only the variance of fitness by adding some noise factor with mean zero or by changing $\operatorname{Var}(E_W)$, such interventions will not affect either the Price covariance or the weighting factor $1/\bar{W}$. Hence, contrary to Sober's (2013) claim, the fitness variance does not cause evolutionary change, at least in the case of directional selection.11

11 If selection is acting on higher moments, as in stabilizing or disruptive selection, the fitness variance does matter to evolutionary change. But Sober's (2013) argument is entirely based on linear selection (i.e. the breeder's equation).

Finally, let us consider whether selection causes evolution. Selection, as discussed above, is a causal influence of the trait on fitness. Under directional selection this process is represented by a linear coefficient $\beta$ (Sec. 4.1). This parameter, in turn, should depend on selective environments including biotic (e.g. prey abundance) as well as abiotic (e.g. temperature) factors (Wade and Kalisz, 1990). Intervening on selection-as-process thus amounts to a modification of these fitness-related environments controlling $\beta$. Obviously, such interventions affect the Price covariance as well as the mean fitness and thus make a difference in adaptive response. In general, we have $P(\Delta\bar{Z} \mid do(\beta)) \neq P(\Delta\bar{Z} \mid do(\beta'))$ for any $\beta \neq \beta'$.
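The four kinds of intervention considered in this section can be compared side by side with the linear response formula. The sketch below is illustrative only: it assumes the linear fitness function $W = w_0 + \beta Z + E_W$ with $E[E_W] = 0$, so that $\bar{W} = w_0 + \beta\bar{Z}$, and the names `response`, `mean_fitness` and `boost` as well as all numbers are hypothetical.

```python
def response(beta, sigma2_a, w_bar):
    """Linear response to selection, Delta Z_bar = beta * sigma2_A / W_bar (Eqn. 14)."""
    return beta * sigma2_a / w_bar

def mean_fitness(w0, beta, z_bar):
    """Mean fitness under the assumed linear model W = w0 + beta * Z + E_W."""
    return w0 + beta * z_bar

# Hypothetical baseline population.
w0, beta, z_bar, sigma2_a = 1.0, 0.2, 5.0, 0.5
w_bar = mean_fitness(w0, beta, z_bar)
baseline = response(beta, sigma2_a, w_bar)

# Hard intervention do(W = w): every arrow into W is pruned, so the
# effective coefficient on the path Z -> W is zero and Cov(W, Z') vanishes.
hard = response(0.0, sigma2_a, 3.0)

# Additive soft intervention: each individual gains `boost` offspring
# independently of Z. Cov(W, Z') is unchanged; only the mean fitness shifts.
boost = 1.0
soft = response(beta, sigma2_a, w_bar + boost)

# Mean-zero noise added to fitness: neither beta nor the mean fitness changes.
noisy = response(beta, sigma2_a, w_bar)

# Intervention on selection itself, do(beta'): this changes both the
# Price covariance and the mean fitness.
beta_new = 0.4
shifted = response(beta_new, sigma2_a, mean_fitness(w0, beta_new, z_bar))

print(baseline, hard, soft, noisy, shifted)
```

With these inputs the hard intervention yields no response, the additive boost yields a response of the same sign but smaller magnitude, the noise manipulation leaves the response untouched, and the change in $\beta$ alters it.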
It thus follows that selection does cause evolution.

To sum up: there are some interventions, either on fitness or on selection, that affect evolutionary response. Therefore, pace statisticalists, the causal model makes it clear that fitness and selection do cause evolution. But not every intervention will do: hard interventions on fitness or manipulations of fitness variance usually do not induce linear adaptive response. Recall that we have reached this conclusion only with the aid of the causal model underlying the evolutionary formulae. One cannot evaluate any intervention claim without an explicit causal model at hand: purely conceptual investigations on the nature of selection or fitness never settle the question.

6 Conclusions

In the history of evolutionary genetics, most of its celebrated principles have been formulated in probabilistic terms. The Price equation and Lewontin's conditions for evolution by natural selection both characterize evolution in terms of statistical, but not causal, features of a population. This gave rise to the philosophical puzzle as to whether evolution, described by these principles, is itself a causal process. The puzzle divided philosophers into two camps, but both sides have accepted the statistical formulae as given and even admitted that the mathematical equations in evolutionary genetics are by nature non-causal or non-empirical. This presumption, however, is incorrect. As shown in this paper, these evolutionary principles can be derived from certain causal models, and are in this sense not fundamental at all. What are really at the base of population change and are driving evolution are the causal processes generating these statistics.

As in many other cases, philosophers' standard modus operandi in this debate has been conceptual analysis. That is, the causal nature of selection or fitness was expected to be clarified by the correct interpretation of these concepts.
To the eyes of these philosophers the approach taken in this paper may appear unfamiliar or even irrelevant. On the contrary, I argue it is the only way to solve the issue: whether one variable causes another is answered not by identifying the nature of these properties, but by specifying a causal model relating them.12 Once such a causal model is laid out, the answer follows quite straightforwardly.

12 As I see it, this particular methodology (conceptual analysis), along with the very theoretical framework that generated philosophical puzzles like the statisticalist debate, forms the dominant paradigm in today's philosophy of biology. This issue will be discussed elsewhere.

In so arguing I by no means pretend that the above models give the only causal models for adaptive evolution: obviously they are just a few (arguably the simplest) examples among many other causal structures. Nor am I trying to improve the predictive ability or performance of the standard evolutionary equations. My goal here is purely foundational, namely to provide causal bases for the existing evolutionary formulae, no more, no less. The causal models, however, may be used to examine the sometimes implicit assumptions and/or limitations of these equations, for all the causal assumptions are explicit in the graph. Fig. 3 tells us, for example, that in order to apply the breeder's equation the phenotype must cause fitness (a mere correlation is not sufficient), that its prediction eventually depends on the genotypic distribution (hence the response may change across generations), and so on. They also provide a basis to analyze more complex phenomena, such as epigenetic inheritance, niche construction, or development. These “non-standard” mechanisms not covered by the traditional models introduce additional causal connections in the graph, whose impact on evolution can be directly evaluated through the method used in this paper (Otsuka, forthcoming).
In sum, causal modeling provides a promising framework to approach a number of scientific as well as philosophical issues in evolution. Although its history dates back to Sewall Wright (1921), the technique has not received much attention either from biologists (Shipley, 2000) or philosophers (Glymour, 2006) until fairly recently. Exploring its possibilities and limitations will be an important task for the future.

Acknowledgement

I owe a great debt to my graduate advisor Lisa Lloyd for her extensive support over and beyond the entire writing process of this paper. Discussions with Bruce Glymour and Jim Griesemer were particularly helpful in forming and sharpening the ideas developed in this paper. I also wish to thank the following people for their comments and discussions: Tyrus Fisher, Clark Glymour, Yoichi Ishida, Roberta Millstein, Samuel Ketcham, the PhilBio group at UC Davis, and the participants of the 2012 POBAM workshop at the University of Wisconsin-Madison.

References

Brandon, R. (1990). Adaptation and environment. Princeton University Press, New Jersey.
De Jong, G. (1994). The fitness of fitness concepts and the description of natural selection. Quarterly Review of Biology, pages 3–29.
Eberhardt, F. (2007). Causation and Intervention. PhD thesis, Carnegie Mellon University.
Fisher, R. A. (1930). The genetical theory of natural selection. Oxford University Press, New York.
Gillespie, J. H. (2004). Population genetics: a concise guide. The Johns Hopkins University Press, Baltimore, 2nd edition.
Glymour, B. (2006). Wayward Modeling: Population Genetics and Natural Selection. Philosophy of Science, 73:369–389.
Glymour, B. (2011). Modeling Environments: Interactive Causation and Adaptations to Environmental Conditions. Philosophy of Science, 78:448–471.
Lande, R. (1979). Quantitative genetic analysis of multivariate evolution, applied to brain: body size allometry. Evolution, 33(1):402–416.
Lewens, T. (2010). The Natures of Selection. The British Journal for the Philosophy of Science, 61(2):313.
Lewontin, R. C. (1970). The units of selection. Annual Review of Ecology and Systematics, 1:1–18.
Lewontin, R. C. (1974). The Genetic Basis of Evolutionary Change. Columbia University Press, New York.
Lloyd, E. A. (1988). The structure and confirmation of evolutionary theory. Princeton University Press, Princeton.
Lynch, M. and Walsh, B. (1998). Genetics and analysis of quantitative traits, volume 24. Sinauer, Sunderland, MA.
Matthen, M. (2010). What is Drift? A Response to Millstein, Skipper, and Dietrich. Philosophy & Theory in Biology, 2.
Matthen, M. and Ariew, A. (2002). Two ways of thinking about fitness and natural selection. The Journal of Philosophy, 99(2):55–83.
Matthen, M. and Ariew, A. (2005). How to understand causal relations in natural selection: Reply to Rosenberg and Bouchard. Biology & Philosophy, 20:355–364.
Matthen, M. and Ariew, A. (2009). Selection and Causation. Philosophy of Science, 76:201–224.
Mauricio, R. and Mojonniner, L. (1997). Reducing bias in the measurement of selection. Trends in Ecology & Evolution, 12(11):433–436.
Millstein, R. (2002). Are Random Drift and Natural Selection Conceptually Distinct? Biology & Philosophy, 17(1):33–53.
Millstein, R. (2006). Natural selection as a population-level causal process. The British Journal for the Philosophy of Science, 57(4):627.
Morrissey, M. B., Kruuk, L. E., and Wilson, A. (2010). The danger of applying the breeder's equation in observational studies of natural populations. Journal of Evolutionary Biology, 23(11):2277–2288.
Okasha, S. (2007). Evolution and the Levels of Selection. Oxford University Press.
Otsuka, J. (forthcoming). Using Causal Models to Integrate Proximate and Ultimate Causation. Biology & Philosophy.
Otsuka, J., Turner, T., Allen, C., and Lloyd, E. (2011). Why the Causal View of Fitness Survives. Philosophy of Science, 78(2):209–224.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
Pigliucci, M. and Kaplan, J. (2006). Making Sense of Evolution: The Conceptual Foundations of Evolutionary Biology. University of Chicago Press.
Price, G. R. (1970). Selection and covariance. Nature, 227:520–521.
Rausher, M. (1992). The measurement of selection on quantitative traits: biases due to environmental covariances between traits and fitness. Evolution, 46(3):616–626.
Robertson, A. (1966). A mathematical model of the culling process in dairy cattle. Anim. Prod., 8:95–108.
Shipley, B. (2000). Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference. Cambridge University Press.
Sober, E. (1984). The nature of selection: Evolutionary theory in philosophical focus. University of Chicago Press, Chicago.
Sober, E. (1993). Philosophy of biology. Westview Press, Boulder.
Sober, E. (2013). Trait fitness is not a propensity, but fitness variation is. Studies in History and Philosophy of Biological and Biomedical Sciences, 44(3):336–41.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search. The MIT Press, Cambridge, 2nd edition.
Stephens, C. (2004). Selection, Drift, and the “Forces” of Evolution. Philosophy of Science, 71(4):550–570.
Wade, M. J. (1985). Soft selection, hard selection, kin selection, and group selection. The American Naturalist, 125(1):61–73.
Wade, M. J. and Kalisz, S. (1990). The Causes of Natural Selection. Evolution, 44(8):1947–1955.
Walsh, D. M. (2007). The Pomp of Superfluous Causes: The Interpretation of Evolutionary Theory. Philosophy of Science, 74:281–303.
Walsh, D. M. (2010). Not a Sure Thing: Fitness, Probability, and Causation. Philosophy of Science, 77(2):147–171.
Walsh, D. M., Lewens, T., and Ariew, A. (2002). The Trials of Life: Natural Selection and Random Drift. Philosophy of Science, 69(3):452–473.
Wimsatt, W. (1980). Reductionistic research strategies and their biases in the units of selection controversy. In Nickles, T., editor, Scientific Discovery: Historical and Scientific Case Studies, volume 2, pages 213–259. Reidel, Dordrecht.
Wimsatt, W. (1981). The units of selection and the structure of the multi-level genome. In Asquith, P. and Giere, R., editors, PSA 1980, Vol. 2, pages 122–183. Philosophy of Science Association.
Woodward, J. B. (2003). Making Things Happen: A Theory of Causal Explanation. Oxford University Press.
Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20:557–85.