Forthcoming in the European Journal for the Philosophy of Science

A Critique of Empiricist Propensity Theories

Mauricio Suárez 1

Keywords: Propensity, Probability, Empiricism, Humphreys’ paradox.

Abstract: I analyse critically what I regard as the most accomplished empiricist account of propensities, namely the long run propensity theory developed by Donald Gillies (2000). Empiricist accounts are distinguished by their commitment to the ‘identity thesis’: the identification of propensities and objective probabilities. These theories are intended, in the tradition of Karl Popper’s influential proposal, to provide an interpretation of probability (under a suitable version of Kolmogorov’s axioms) that renders probability statements directly testable by experiment. I argue that the commitment to the identity thesis leaves empiricist theories, including Gillies’ version, vulnerable to a variant of what is known as Humphreys’ paradox. I suggest that the tension may be resolved only by abandoning the identity thesis, and by adopting instead an understanding of propensities as explanatory properties of chancy objects.

1 Institute of Philosophy, School of Advanced Study, London University, Senate House, Malet Street, London WC1E 7HU, UK, and Department of Logic and Philosophy of Science, Faculty of Philosophy, Complutense University, 28040 Madrid, Spain. Email: Mauricio.suarez@sas.ac.uk and msuarez@filos.ucm.es

1. Empiricist Propensity Theories

Karl Popper is largely responsible for the contemporary meaning of the term ‘propensity’ in the philosophy of probability. He introduced it in a series of papers in the late 1950’s, which simultaneously inaugurated an empiricist tradition in thinking about propensities – one characterised by its focus on the testability of propensity statements.
In those foundational papers, Popper emphasised the falsifiability of statistical hypotheses as the key to the empirical character of propensities. Thus in (Popper, 1959, p. 36), propensities are favoured because they yield quantitative empirical predictions: “The estimate of the measure of a possibility – that is, the estimate of the probability attached to it – has always a predictive function, while we should hardly predict an event upon being told no more than that this event is possible […] In other words, we do not assume that a possibility as such has any tendency to realise itself; but we do interpret probability measures, or ‘weights’ attributed to the possibility, as measuring its disposition, or tendency, or propensity to realise itself; and in physics (or in betting) we are interested in such measures, or ‘weights’ or possibilities, as might permit us to make predictions.” Later on, he adopted a single case version of the theory, but continued to claim the virtues of falsifiability for it: “The greater or smaller frequency of occurrences may be used as a test of whether a hypothetically attributed [propensity] weight is, indeed, an adequate hypothesis.” (1990, p. 11). And indeed “some typical experiments measure propensities fairly directly” (ibid., p. 15).

The empiricist tradition founded by Popper eventually gave rise to the contemporary views of Donald Gillies, whose ‘long-run’ theory appropriately links propensities to statistical hypotheses subject to empirical tests. 2 The key to empiricist approaches is their provision of a plausible interpretation of the (classical, or Kolmogorov) probability calculus, including the fourth axiom of conditional probability. Gillies claims that his theory enjoys a number of advantages over other propensity theories; I review – and, by and large, endorse – these advantages in the next section of the paper.
In the third section of the paper, however, I raise some difficulties and, in the fourth section, I argue that empiricist theories, including Gillies’, are committed to what I call the identity thesis – the identification of propensities with objective probabilities. This commitment threatens empiricist theories with incoherence – as is shown by versions of what is known in the literature as Humphreys’ paradox. In the fifth and main section of the paper I review an ingenious way around this threat inspired by Gillies’ distinction between fundamental and non-fundamental conditional probabilities, and his development of a propensity ‘system’. I go on to criticise this solution, and argue that it leads to a dilemma: the long run theory is either committed to the identity thesis, and the ensuing incoherence; or it must adopt a radical departure from empiricism – a move anathema to the Popperian tradition. In the final and concluding section I briefly outline the main features of an alternative account of chance, in the tradition of Charles Sanders Peirce, which appropriately relinquishes the commitment to the identity thesis, while retaining an important role for propensities in the explanation of empirical probabilities.

2 See, particularly, Gillies (2000, chapters 6 and 7). For Popper’s original proposal see Popper (1957), (1959).

2. Long-run and Single-case Propensities

Gillies’ theory enjoys a number of different advantages over other versions of the propensity theory. 3 First, a single case version of the propensity theory is harder to square with the commitment to falsifiability. A long run view, by contrast, is almost automatically in line with falsificationist methodology. Second, the truth makers for propensity statements according to single case versions of the theory (such as Miller’s), are not empirically accessible, but are rather metaphysical entities. Gillies’ own long run version, by contrast, ascribes propensities to large, but finite, sequences.
So in his theory the truth makers of propensity statements are, at least in principle, empirically accessible. Finally, Gillies’ long run version accommodates and explains naturally the empirical laws of probability.

3 Gillies (2000, Ch. 6 and 7) suggests these advantages, but does not express them in the terms I use above, which is why I choose not to ascribe them to him verbatim in the text.

The difference between long run and single case propensity theories may be described roughly as follows. Suppose we are trying to determine the propensity that a coin possesses to land heads on a particular toss. The single case and the long run theories will both explain the particular outcome as a ‘result’ of some underlying physical fact or property – the ‘propensity’. But they will identify a different fact or set of facts as the ‘propensity’. Thus, a single case theory ascribes a chance (a ‘propensity’) for the coin to land heads in that very experimental set up. The propensity is therefore a property of the entity or entities involved in the single experiment. A long run theory, by contrast, characterises some ‘repeatable conditions’ giving rise to a sequence of events that the particular outcome event belongs to; it then goes on to ascribe a frequency of outcomes of the same type in the sequence, and identifies this frequency value as the ‘propensity’. The point is not so much that single-case propensities are qualitative and long run properties are quantitative, but that they are predicated of different entities: the object itself in the former case, and the sequence of outcomes in the latter. Hence in order to falsify a propensity ascription, on the long run theory, we just need to test the corresponding long but finite frequency by repeating the same experiment. This may be a complicated affair – depending on how long the sequence is and how difficult it is in practice to recreate the experimental conditions repeatedly – but it is in principle possible.
By contrast, no inspection of any sequence, no matter how long, can conclusively lead to a falsification of a single case propensity ascription. The ascription of a ½ single-case propensity for a coin to land heads, for example, is consistent with any sequence of heads-tails outcomes. Now, this brief argument requires some unravelling, but it already shows that the long run theory is much more in line with Popperian falsificationist methodology. (Whether such a methodology is the correct one to use in assessing different accounts of propensities is of course a different issue altogether – and I partly address it in the last section of the paper).

For similar reasons, Gillies charges single case accounts with being metaphysical and not scientific (Gillies, 2000, p. 127). The worry here is connected with truth makers. Gillies is quite right to stress that, on the single case account, what makes a propensity statement true is a state of affairs that we can have no direct empirical access to, namely a dispositional and non-observable property, or a ‘propensity weight’. In a single run of an experiment this weight is not actualised at all, except as a particular outcome (an outcome which would have equally actualised any other ‘propensity weight’). Thus a coin in such an experiment is said to have the propensity irrespective of whether it is actually tossed, and how many times. So the ascription of the propensity remains ‘metaphysical’ – in the sense that a single-case propensity statement is not strictly speaking an empirical statement, i.e. not one that can be tested directly by experiment. This obviously marks a profound difference with the long run understanding of propensity, since on the latter propensities are features of sequences of outcomes in chancy experiments which can be tested directly – hence they are empirically accessible properties.
The truth maker of a propensity statement, according to the long run view, is part of our empirical knowledge of the world.

Finally, the two empirical laws of probability are what Gillies calls (following Keynes – see Gillies, 2000, p. 92) the law of stability of statistical frequencies; and the law of excluded gambling systems. The latter was due to von Mises (1928, p. 20), and may be stated as follows: “it is impossible to improve one’s chances of winning by using a gambling system”, where a ‘gambling system’ is any rule that selects systematically a subsequence in a sequence of outcomes defining a probability (a ‘collective’ in von Mises’ terminology) such that the value of the probability in the subsequence differs from that in the long sequence. This entails that an outcome sequence is only a collective if its structure does not contain such sub-sequences with different probability values. So for instance, in tossing a coin the sequence HTHTHTHTHT… has the limiting frequency of heads ½. However such a sequence is not a properly defined collective since the rule ‘pick up the n-place member of the sequence, for n even’ systematically selects a subsequence with a limiting frequency of heads other than ½ (namely 0, counting places from one; the complementary rule for n odd yields 1). The law is then empirical in the sense that any genuine chance set up (e.g. any genuinely random tossing of a coin) yields sequences that obey it, and this is an empirical fact that may be determined by experimental means (as long as genuine chancy set ups may be distinguished).

As regards the law of stability of statistical frequencies, it may be expressed concisely as follows. Suppose that A is a possible attribute of a particular sequence. And suppose further that amongst the first n members of the sequence, m(A) is the number of those with attribute A, so that m(A)/n is their relative frequency. Then the law of stability of statistical frequencies states that as n increases m(A)/n gets closer to a fixed value.
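The two empirical laws can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the function names are hypothetical, Python’s pseudo-random generator stands in for a ‘genuine chance set up’, and places in a sequence are counted from one, so that the even places of HTHT… are all tails (counting from zero would select the heads instead).

```python
import random
from fractions import Fraction


def relative_frequency(seq, attribute="H"):
    """Return m(A)/n: the share of members of seq that carry the attribute."""
    count = sum(1 for outcome in seq if outcome == attribute)
    return Fraction(count, len(seq))


def select_even_places(seq):
    """Place selection 'take the n-th member, for n even' (places counted from 1)."""
    return seq[1::2]


def alternating(n):
    """The regular sequence H, T, H, T, ... of length n."""
    return ["H" if i % 2 == 0 else "T" for i in range(n)]


def random_tosses(n, seed=0):
    """A pseudo-random stand-in for n tosses of a fair coin (an assumption)."""
    rng = random.Random(seed)
    return [rng.choice("HT") for _ in range(n)]


if __name__ == "__main__":
    # Law of stability: m(A)/n for growing initial segments of one sequence.
    tosses = random_tosses(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, float(relative_frequency(tosses[:n])))

    # Excluded gambling systems: the selection rule changes the frequency
    # in the alternating sequence, so that sequence is not a collective.
    regular = alternating(1_000)
    print(relative_frequency(regular))                      # 1/2
    print(relative_frequency(select_even_places(regular)))  # 0

    # In the pseudo-random sequence the same rule should leave the
    # frequency of heads close to one half.
    print(float(relative_frequency(select_even_places(tosses))))
```

The contrast of interest is between the last two cases: the same place selection that wrecks the frequency in the regular sequence is expected to make no systematic difference in the random one.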
Gillies then argues that both laws receive a natural explanation in his long run propensity theory. The law of stability of statistical frequencies is explained as convergence to the actual propensity value, while the law of excluded gambling systems is explained by the fact that probabilities are defined to comply with the axiom of independence (Gillies 2000, p. 154). Now, while this does not confer on it a decisive advantage (single case propensity theories prima facie explain at least the law of stability just as well), it certainly adds to the attraction of Gillies’ view.

3. Some Difficulties for the Long Run View

We may summarise the two views as follows: In a single case theory the propensity is a property of (the entity or entities involved in) the single case itself; whereas in a long run version of the theory, the propensity is a property of the sequence of events that the outcome event belongs to, which is in turn defined by a set of repeatable conditions. Hence in the long run version of the theory, the single case – or the particular outcome event – does not literally possess or display the propensity, but it may only be said to display it in a derivative sense as part of a sequence of events. The propensity is only fully displayed in the whole long (perhaps infinite) sequence.

This brings the long run version of the propensity theory perilously close to a frequency theory of probability. Indeed Gillies acknowledges that a long run theory is much closer to the frequency theory than the single case version of the propensity theory (Gillies, 2000, p. 137). Yet, there are differences. On a frequency theory, probability is exclusively a property of the sequence, and this generates the notorious reference class problem – the probability of a single case depends on how that single case is represented as belonging in one sequence or another.
Thus, the probability that an individual dies of a heart attack at the age of 40 years depends on whether the individual is a man or a woman, whether a smoker or not, whether living in Europe or Africa, etc. Each of these prescriptions determines a different population, and the relative frequency of heart attacks at 40 differs across populations. So, frequency theories must come up with a story to fix the appropriate reference class. A single case version of the propensity theory overcomes this difficulty in a straightforward way since it ascribes the probability to the outcome event itself independent of any sequence – hence the single case displays a probability, namely the probability of the outcome event, regardless of what class or classes this event may be said to belong to. 4

Now, a long run propensity theory is supposed to ascribe the propensity to the sequence, but only relative to the set of ‘repeatable generating conditions’ that actually generate the sequence. 5 Thus the outcome event ultimately displays the propensity in virtue of its having been generated by a particular set of conditions. The distinction is subtle but the underlying thought is sound, namely that the difference between frequency and propensity accounts in general must ultimately rest upon a key difference between their respective truth makers. On one account (frequency) probabilities are made true by sequences of events simpliciter, while on the other account (propensity) they are made true at least in part by certain facts regarding the physical entities that generate such sequences.

4 A complication is that the outcome event itself may only be identified relative to the experimental set up – indeed Popper thought that propensities were relational properties of the entire experimental set-up – but this does not alter the fundamental point.

5 Gillies is not always entirely clear on this point. Sometimes he seems to ascribe propensities to the set of repeated conditions relative to the sequence, rather than the other way around, as, for instance, when he notes: “The propensity theory claims that some sets of repeatable conditions have a propensity to produce in a long sequence of repetitions frequencies which are approximately equal to the probabilities” (Gillies, 2000, p. 161). On the other hand, he is also very clear that single case chances are not propensities, but may only be interpreted as subjective (ibid., pp. 119-120). The interpretation in the text above seems to make most sense of Gillies’ various commitments, but in any case nothing much hinges on this subtlety. Most of the arguments raised in this paper, and in particular those in sections 4 and 5, go through regardless.

Nevertheless, the long run propensity theory ascribes the propensity to a sequence, however relative, and as a result it does inherit some features of frequency accounts of probability. First, it must be determined whether a propensity is ascribed to a long but finite sequence, or an infinite hypothetical one. Gillies’ development of a falsifying rule (Gillies, 2000, pp. 145-150) seems to me to entail that the sequences be actual – hence finite. While the choice is necessitated by this kind of empiricism, it also imports a plethora of well-known difficulties associated with finite frequencies. When is the sequence long enough for the frequency to be representative? Tossing a coin just a few times can be very misleading regarding the underlying propensity of a coin to land heads. The law of large numbers, or some version thereof, is supposed to provide an answer, but why accept the law in the first place? From an empiricist point of view of the kind defended by Gillies and Popper, the law must be falsifiable, but the only way to go about falsifying a law regarding the nature of long run frequencies, and their putative convergence onto the right numbers, involves a comparison with probabilities, and, on the proposal under consideration, these already require the frequencies to be in place. I won’t rehearse all the arguments against finite frequentism here: 6 it suffices to note that many of these arguments apply to the long run propensity theory too.

6 But see Hajek (1997) for a review.

The solution to these problems supposedly comes hand in hand with the insistence on ascribing propensities relative to the set of ‘repeatable generating conditions’. In tossing a coin, for example, the propensity is supposedly a property of the sequence relative to the entire chance set up, including the mechanisms involved in the tossing, the friction of the air molecules against the coin’s movement, the physical features of the surface upon which the coin lands, etc. But if this relativity is required in order to solve the aforementioned problems, one wonders why propensities must be ascribed to the sequence at all. Why this remnant of the frequency account, when it has already been established that the key to the ascription lies with the physical situation rather than any feature of the sequence (or ‘collective’ in von Mises’ terminology)? Why not do away with the need to refer to any sequences altogether? There seems to be an inherent instability in the long run view, which aims to simultaneously enjoy the benefits of both the frequency and single case propensity accounts. But, it seems that one cannot have one’s cake and eat it too – for the account inherits not only the benefits, but also the difficulties incurred by both approaches.

4. The Identity Thesis

The approach that I advocate, and describe briefly in section 6, is resolutely single case in ascribing propensities to the chance set up and in no way to sequences. I urge that
this approach ultimately entails a clean conceptual separation between propensities and probabilities – and this has not been recognized in the literature so far. By contrast, propensity accounts (including Gillies’) have traditionally understood propensities to fundamentally provide an interpretation of probability. No wonder, then, that a long run propensity theory seems attractive. On these views, propensities and probabilities are essentially the same kind of thing – the former simply provide a model for the latter. Propensities and probabilities alike then turn out to be features of the sequences generated by those set ups. This conflation, I argue, leads to major difficulties, and should be given up. Instead propensities should be ascribed exclusively to features of chance set ups, while probabilities are distinct manifestations of these propensities – and, if so desired, may be defined as relative frequencies in finite or infinite sequences. 7

7 Alternatively, probabilities may be tested by observed experimental frequencies. Note that the view as described above has affinities with single case propensity theories such as those due to Miller and Fetzer. However, these authors do not really relinquish the identity thesis – they do not separate as cleanly as I do propensities from their probabilistic manifestations.

I am claiming that empiricist accounts of propensity are implicitly committed to an identity thesis, i.e. the identification of propensities and probabilities. But one has to be careful in describing the thesis, since it has two parts, or halves. There is first the commitment to interpret all probabilities, including all conditional probabilities, as propensities – we may refer to this as the probability-to-propensity half of the identity thesis, or identity1. There is then the converse commitment to treat all propensities as probabilities, or to be more precise to represent all propensities as conditional probabilities. We may refer to this as the propensity-to-probability half of the identity thesis, or identity2. The full identity thesis is then the conjunction of identity1 and identity2. 8

8 These terms were introduced in (Suárez, 2013) with slightly different meanings but amounting to the same idea that the conjunction of both makes up the full identity thesis. Note that strictly speaking the bi-conditional applies to conditional probabilities only but, as I go on to make clear in section 5, there is a formal way to render all probabilities explicitly conditional.

Now, I am not arguing that all empiricist accounts are committed to both halves of the identity thesis full court. I do believe that all empiricist accounts, and many other propensity accounts that are not empiricist in the same strong sense of falsifiability, are committed to the probability-to-propensity half – but only for objective probabilities. Many defenders of propensities have accepted that there is a subjective sense of probability, and I of course would not claim that they have applied identity1 to it. 9 Yet, when it comes to objective probabilities, most propensity approaches in the empiricist tradition have attempted to at least interpret them as propensities – if not altogether reduce them to propensities. So, the claim is that empiricism is committed to the identity thesis for objective probabilities only.

9 Although I am unsure about Popper himself, who sometimes (for instance, in describing how the propensity interpretation ‘takes the mystery out of quantum mechanics’), writes as if he does not countenance any meaningful concept of subjective probability. But this seems rather extreme. Donald Gillies certainly countenances subjective probabilities, as distinct from objective propensities, and defends a kind of pluralism regarding probability (Gillies, 2000, Ch. 8). Similarly Carnap (1966), Hacking (1975, 1990), Mellor (2005) all accept at least two different senses of probability, roughly along the lines of the distinction between the objective (relating to physical states) and the subjective (relating to mental or belief states).

As regards identity2, or the propensity-to-probability half, all proponents of propensities have adopted a representation of propensities as probabilities. True, some scholars have argued, in light of some of the challenges and considerations that I discuss below, that conditional probability, as defined by Kolmogorov, is inappropriate for probability in general. 10 However, this is tantamount to relinquishing the classical or Kolmogorov axiomatization, and to adopting an alternative representation of probability for propensities – without in any case relinquishing identity2. I shall argue below that the right response to the challenges is not to abandon Kolmogorov, but rather to abandon the commitment to the identity thesis.

10 Hajek (2003).

My main argument for distinguishing propensity and probability ascriptions, and generally for distinguishing these concepts and their role in practice, derives from what is known as “Humphreys’ paradox”. As first described by Salmon (1979), this is the claim that propensities and Kolmogorov probabilities are significantly different – and in the cases usually discussed by Salmon, propensities can be seen to exhibit some asymmetry which is lacking in the corresponding probabilities. Thus in Salmon’s original example (Salmon 1979, pp. 213-14), the propensity of a certain person to die given that he is shot in the head is ¾. This is not a symmetric propensity, since there is no propensity to have had one’s skull perforated by a bullet given that one is dead. Yet the inverse conditional probability (i.e. the conditional probability of someone’s having been shot in the head, given that this someone is dead) is perfectly well defined, and can be calculated easily by means of Bayes’ theorem – provided that some estimates for the priors are available.
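The inverse probability in Salmon’s example can be computed explicitly once priors are supplied. The following is a minimal sketch: the value ¾ is taken from the example, whereas the prior probability of being shot in the head (1/1000) and the probability of dying without being shot (1/100) are hypothetical figures chosen purely for illustration, as is the function name inverse_probability.

```python
from fractions import Fraction


def inverse_probability(p_a_given_b, p_b, p_a_given_not_b):
    """Bayes' theorem: Prob(B / A) from Prob(A / B), Prob(B) and Prob(A / not-B)."""
    # Total probability of A over the partition {B, not-B}.
    p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
    return p_a_given_b * p_b / p_a


# Prob(death / shot) = 3/4 is the figure in Salmon's example.
p_death_given_shot = Fraction(3, 4)
# Hypothetical priors, assumed here only to make the calculation concrete.
p_shot = Fraction(1, 1000)
p_death_given_not_shot = Fraction(1, 100)

p_shot_given_death = inverse_probability(
    p_death_given_shot, p_shot, p_death_given_not_shot
)
print(p_shot_given_death)         # 25/358
print(float(p_shot_given_death))  # roughly 0.07
```

On these assumed figures the inverse conditional probability is a perfectly definite number, yet nothing in the calculation licenses reading it as a propensity of the dead to have been shot.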
This argument shows rather decisively that Identity1 cannot generally hold, even when restricted to objective probability. There are well-defined objective probabilities that receive no propensity interpretation. In fact, it is arguable that for any objective conditional probability that represents a propensity, there is a well-defined objective probability (its inverse under Bayes’ theorem) that does not represent any propensity. The reason is that, in a propensity represented as a conditional probability Prob (A / B), the conditioned upon event B is typically the dispositional property that fires in order to either generate or cause the conditional event A. We then say that B has a propensity to A. Since “having a propensity to” is typically, if not always, an asymmetric relation, it follows that Prob (B / A) does not represent a propensity, because A has no propensity to B.

The sort of example invoked by Paul Humphreys himself is more involved and, as a matter of fact, shows not the failure of the probability-to-propensity half (Identity1), but rather the failure of the propensity-to-probability half (Identity2). Now, this may come as a surprise, since the paradox is often presented as demonstrating that the propensity interpretation is inconsistent (i.e. that interpreting some objective probabilities as propensities yields some results that are inconsistent with the axioms of probability). This diagnosis is implausible once the relevant distinctions are in place. In particular, Humphreys’ example does not show the concept of propensity itself to be flawed, but rather it shows that propensities may not be identified with probabilities. The simple reason, made evident by the example, is that some propensities lack any plausible probability representation. The example provided is a thought experiment involving photon emission and transmission, in which a number of propensities explain, ex hypothesi, certain observable statistical features of the example.
There does not seem to be any problem with the assumption that these propensities are explanatory – yet I argue that on account of Humphreys’ example, they cannot be meaningfully rendered into probabilities.

The thought experiment postulates the emission of photons at time t1, their incidence upon a half-silvered mirror at time t2, and their transmission past the mirror at time t3, where t1 < t2 < t3. Thus for any given photon that is transmitted, there is an emission event E (t1), an incidence event I (t2), and a transmission event T (t3). However, it is assumed that not all emitted photons actually reach the mirror, and not all those that reach the mirror are actually transmitted. This is because any given photon has, in the given experimental set up, a propensity to be emitted at t1, let us refer to it as Prop (E (t1)). Any photon emitted at t1 also has a certain propensity of reaching the mirror Prop (I (t2)). And any photon that reaches the mirror has a certain propensity to be transmitted Prop (T (t3)). We may assume that all these propensities are set by the physical facts at the time of emission, and thus we may attempt to represent such propensities as conditional probabilities subscripted at t1 as follows: Prop1 (E (t1)), Prop1 (I (t2) / E (t1)), Prop1 (T (t3) / I (t2) & E (t1)). 11

Now, once we express the propensities this way, we are obliged to represent any additional facts regarding them by means of these conditional probability expressions. Humphreys, in particular, stipulates that the following three facts obtain in this thought experiment:

i) 1 > Prop1 (I (t2) / E (t1)) = q > 0.
ii) Prop1 (T (t3) / I (t2) & E (t1)) = p > 0.
iii) Prop1 (T (t3) / ¬ I (t2) & E (t1)) = 0.

The first expression states that the propensity of a photon emitted at the source at t1 to reach the mirror at t2 is non-zero but less than one – this explains the fact that some photons always reach the mirror but not all photons do.
The second expression states that the propensity of a photon that has reached the mirror to be transmitted is greater than zero – which in turn explains why some photons are transmitted. In the last expression, ¬ I (t2) represents the event of a photon not reaching the mirror. The equality expresses the fact that a photon emitted at t1, but not received at the mirror at t2, has no propensity at all to be transmitted at t3 (presumably because it is not possible for it to be transmitted). 12 This explains why no photons that fail to reach the mirror are transmitted.

11 Humphreys (1985, p. 561) assumes that emission time is t0, and that it is the physical facts at some strictly later time t1 that fix the propensities. Nothing essential in what follows depends on this assumption, so I shall assume that t1 = t0 without loss of generality.

Next, Humphreys considers a principle of Conditional Independence (CI) which, for reasons that will soon become clear, I shall refer to as Conditional Independence of Propensities (CIProp):

(CIProp) Prop1 (I (t2) / T (t3) & E (t1)) = Prop1 (I (t2) / ¬T (t3) & E (t1)) = Prop1 (I (t2) / E (t1)).

“Humphreys’ paradox” is then the fact that (CIProp) is inconsistent with expressions i), ii) and iii) above as long as probabilities obey Kolmogorov’s axioms, and in particular Bayes’ theorem. Later on in the paper I will dispute the appropriateness of expressions i)-iii) for the thought experiment described. But, first, we must consider the status of (CIProp). This principle states that the propensity at t1 of incidence upon the mirror at t2 is independent of transmission at t3. Humphreys grounds (CIProp) on the more general thought that nothing that happens at time t3 can causally affect the propensity at t1 of something else happening at t2.
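The inconsistency can be made explicit in a few lines. The derivation below is one standard way of reconstructing it from i)-iii), the theorem of total probability and Bayes’ theorem; it is offered as a sketch and not as Humphreys’ own presentation. It writes Prop1 (A / B) as \mathrm{Prop}_1(A \mid B), uses \wedge for &, and reads the final term of (CIProp) as conditional on E (t1).

```latex
\begin{align*}
\mathrm{Prop}_1(T(t_3) \mid E(t_1))
  &= \mathrm{Prop}_1(T(t_3) \mid I(t_2) \wedge E(t_1))\,
     \mathrm{Prop}_1(I(t_2) \mid E(t_1)) \\
  &\quad + \mathrm{Prop}_1(T(t_3) \mid \neg I(t_2) \wedge E(t_1))\,
     \mathrm{Prop}_1(\neg I(t_2) \mid E(t_1)) \\
  &= p\,q + 0 \cdot (1 - q) = pq > 0
  && \text{by i), ii), iii)} \\[6pt]
\mathrm{Prop}_1(I(t_2) \mid T(t_3) \wedge E(t_1))
  &= \frac{\mathrm{Prop}_1(T(t_3) \mid I(t_2) \wedge E(t_1))\,
           \mathrm{Prop}_1(I(t_2) \mid E(t_1))}
          {\mathrm{Prop}_1(T(t_3) \mid E(t_1))}
   = \frac{pq}{pq} = 1
  && \text{by Bayes' theorem} \\[6pt]
\mathrm{Prop}_1(I(t_2) \mid T(t_3) \wedge E(t_1))
  &= \mathrm{Prop}_1(I(t_2) \mid E(t_1)) = q < 1
  && \text{by (CIProp) and i)}
\end{align*}
```

The last two lines assign the same quantity both the value 1 and a value strictly less than 1, which is the contradiction.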
13 Now, regardless of one’s views on backwards causation, there is a more worrying issue regarding (CIProp), and it comes to the fore when considering the status of an equivalent condition for probabilities in general. The equivalent condition may be referred to as Conditional Independence of Probabilities (CIProb):

(CIProb) Prob1 (I (t2) / T (t3) & E (t1)) = Prob1 (I (t2) / ¬T (t3) & E (t1)) = Prob1 (I (t2) / E (t1)).

12 Strictly speaking it states that its propensity to be transmitted is zero, but I assume throughout that having propensity zero is equivalent to having no propensity.

13 He writes (Humphreys, 1985, p. 561): “[…] The propensity for a particle to impinge upon the mirror is unaffected by whether the particle is transmitted or not”.

It turns out that (CIProb) may fail regardless of any causal relations between T (t3), Prob1 (I (t2)), and I (t2). This is because, as is well known, statistical independence between variables fails to generally entail the absence of causal relations amongst them. On the contrary, the causal inference literature makes it by now abundantly clear that probabilistic dependence per se is not a sound basis on which to infer a causal connection. The underlying causal structure may be hideously complicated, so that correlated factors may in no way be directly causally related. Conversely, probabilistic independence between variables does not entail that there are no causal connections amongst them. The absence of statistical correlation may be masking an array of carefully balanced causal factors that have no overall statistical effect. 14 Nothing in the nature of propensities seems to rule out such possibilities, however farfetched they may seem.

There are thus two different ways in which (CIProb) may fail. First, backwards causation is not in general ruled out. True, this is no argument by itself since there are contingent issues and problems for backwards causation, particularly outside quantum mechanics.
But it does suggest (CIProb) depends upon some contingent assumptions. Second, causal independence is not a necessary condition for statistical independence. So (CIProb) does not guarantee that T (t3) is not a cause of either Prop1 (I (t2)) or I (t2).

14 One of the most widely discussed such arrangements in the literature is known as Hesslow’s example, where a particular variable is causally related to an effect via two different intermediate routes, one route involving an inhibitor and the other route involving a producer, so finely balanced that no correlation is apparent at all between cause and final effect (Hesslow, 1976). The example has famously provided grounds against statistical theories of causation as probability raising in general (for instance, see Cartwright, 1989, pp. 99-100).

Whatever reasons there are to uphold (CIProp), they must derive from different grounds. Is there anything significant about propensities that makes (CIProp) hold where (CIProb) fails? Certainly, the principle seems to have a greater credibility when applied to propensities, and one would seem to be able to come up with examples that do satisfy (CIProp). It is hard to see what the difference could possibly be as long as we continue to insist that propensities are probabilities, as Identity2 claims.

Humphreys’ own reaction to his example is that the Kolmogorov axioms, and in particular Bayes’ theorem as a definition of conditional probability, are false: “The account thus ought […] to be viewed as […] showing directly the falsity of […] Bayes’ theorem” (Humphreys, 1985, p. 567). Thus he proposes, in response, to reject the Kolmogorov calculus for probability and to come up with an alternative to standard probability theory: “[…Standard] probability should be viewed as a contingent theory. […It] does not have the status of a universal theory of chance phenomena with which many have endowed it” (Humphreys, 2004, p. 679).
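The role that Bayes’ theorem plays in generating the inconsistency can be made explicit in a short reconstruction. The sketch below is not Humphreys’ own derivation: it assumes that expressions i)-iii) take the form suggested by their glosses in the text (incidence has probability q strictly between 0 and 1; transmission given incidence has probability p > 0; transmission given non-incidence has probability 0), it suppresses the time index on the propensity function, and the shorthand P(X) for Prop (X / E (t1)) is introduced here only for brevity.

```latex
% Shorthand (an assumption of this sketch): P(X) abbreviates Prop(X / E(t1)),
% with I = I(t2) and T = T(t3); the time index on Prop is suppressed.
% Requires the amsmath package.
\begin{align*}
&\text{(i)}   && P(I) = q, \quad 0 < q < 1 \\
&\text{(ii)}  && P(T \mid I) = p, \quad p > 0 \\
&\text{(iii)} && P(T \mid \neg I) = 0 \\[4pt]
&\text{Total probability:} && P(T) = P(T \mid I)\,P(I) + P(T \mid \neg I)\,P(\neg I) = pq > 0 \\
&\text{Bayes' theorem:}    && P(I \mid T) = \frac{P(T \mid I)\,P(I)}{P(T)} = \frac{pq}{pq} = 1 \\
&\text{(CIProp):}          && P(I \mid T) = P(I) = q < 1
\end{align*}
```

On these assumptions the last two lines jointly yield 1 = q < 1, so one of the premises has to go: i)-iii), (CIProp), or the Kolmogorov calculus within which Bayes’ theorem is derived.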
The preceding analysis suggests that this may be an extreme reaction that throws the baby out together with the bathwater. It should suffice instead to reject the identity thesis, in both directions, by giving up both Identity1 and Identity2. Since (CIProb) is generally false, but (CIProp) often holds, the most reasonable response to Humphreys’ paradox is indeed to abandon the identity thesis. There is then no conceptual identity between propensity and probability – and propensity does not then merely interpret probability theory. Certainly, the move requires that we come up with an alternative representation of the propensities in the example that does not employ probability theory. But as long as such a thing is possible – and there seem to be myriad ways of doing this, as I discuss in section 6 – this is the most sensible, or at least the most conservative, response to Humphreys’ paradox. It sticks to the Kolmogorov axioms for probability, while insisting that propensity requires no probability interpretation at all – under any calculus.

5. Fundamental Probability Systems

Donald Gillies has developed an ingenious response to some of these challenges, involving a distinction between fundamental and non-fundamental conditional probabilities. In this section I describe the response, and proceed to criticize it.

Gillies’ main tenet is what we may refer to as the universality of conditional probability. He claims that all objective probabilities are implicitly if not explicitly conditional – so there exist no genuinely unconditional, or “absolute”, objective probabilities. First of all, in the long-run theory, propensities are necessarily conditional probabilities since they are by definition relative to a set of repeatable conditions S. Thus the propensity of an event A is to be identified with the conditional probability of A, conditional on the set of repeatable conditions that generate it, i.e. with Prob (A / S).
This commits the long run theory to Identity2 in a particular version that states that all propensities may be represented as conditional probabilities, and I argue below that it remains problematic in light of Humphreys’ paradox. However, Gillies is rather more concerned with avoiding the converse condition, which would make the long run theory prey to Salmon-type counterexamples. Recall that the probability-to-propensity half of the identity thesis, or Identity1, claims that all probabilities may be given a propensity interpretation. But as Salmon’s counterexamples demonstrate, in those cases where a conditional probability may receive a propensity interpretation, its inverse conditional probability is well defined, but almost always not amenable to a propensity interpretation. Gillies attempts to get around this problem by appealing to the universality thesis, and his argument goes roughly as follows.

The universality thesis tells us that all probabilities are implicitly or explicitly conditional upon a set of repeatable conditions S. What appear to be “absolute” or unconditional probabilities are in fact fundamental conditional probabilities (Gillies, 2000, p. 132): “Probabilities like P (A) are often called absolute probabilities, but really since P (A) is an abbreviation for P (A / S) it would be more accurate to refer to them as fundamental conditional probabilities.”

We may identify fundamental probabilities by means of a subscript. Probf (A) then stands for the fundamental probability of A, which is only conditional upon its set of repeatable conditions, and may be fully spelled out as an ordinary conditional probability: Probf (A) = Prob (A / S). Now, not all conditional probabilities are fundamental. In particular, ordinary conditional probabilities, or event-conditional probabilities, such as Prob (A / B), where B is an arbitrary event, are not fundamental.
Yet, they are also conditional upon a set of repeatable conditions, but only implicitly, and they may be fully and explicitly rendered as: Prob (A / B & S). So what distinguishes fundamental conditional probabilities is the fact that they are solely conditional upon their set of repeatable conditions. Other conditional probabilities are ‘event-conditional’ in the sense that they are conditional upon further events in addition to their set of repeatable conditions.

Now, one possible response to the objection raised by Humphreys’ paradox would reject Identity1, by insisting that only fundamental conditional probabilities can be interpreted as propensities. 15 In particular the inverse of a fundamental probability, however numerically well defined, would not be interpretable as a propensity, since it is not conditional upon a set of repeatable conditions. Thus Prob (S / A) may be numerically well defined, but it does not represent any meaningful propensity since only fundamental conditional probabilities do, and Probf (S / A) and Probf (S) are meaningless expressions.

15 This may not be Gillies’ considered response, however, since in (Gillies, 2000, p. 132), it is asserted that ‘to say that P (A / B&S) = q means that there is a propensity if …’, which shows that Gillies gives a propensity interpretation to both fundamental and non-fundamental conditional probabilities, insisting only on the non-reversibility of the former.

This seems to be a novel suggestion, in the spirit of Gillies’ proposal, to get around the Salmon-type counterexamples. The conditional probability of someone dying given that they are shot in the head may represent a propensity, but this does not require the inverse conditional probability of someone being shot in the head given that they are dead to represent a propensity too. The reason is presumably that shooting is part of the repeatable conditions that give rise to the propensity of death, but not the other way round.
The conditional probability of shooting given dying is not fundamental, and the question does not arise as to whether this conditional probability may be interpreted as propensity. The proposal is in effect to restrict the scope of Identity1, so as to make it applicable only to fundamental conditional probabilities. In the remainder I wish to argue that it is not possible to answer the objection raised by Humphreys in this manner, i.e. by selectively relinquishing only a part of one half of the identity thesis. A full rejection of the identity thesis is rather called for.

Gillies further formalises these notions, by means of what he calls a “probability system”. 16 A probability system is a set (Ss, Ω, F, P) where (Ω, F, P) is an ordinary probability space 17 and Ss is a “sequence of repetitions”: a sequence of events satisfying a set of ‘repeatable conditions’ S and separated in accordance with some spacing condition s, which fixes the essential variable parameters that separate the events generated by the ‘repeatable conditions’ in the sequence.

16 Gillies, ibid, pp. 161ff. The exposition in the text differs slightly from the original, on account of our present interest in drawing out the consequences of probability systems for the identity thesis.

17 An ordinary probability space (Ω, F, P) contains a set of possible outcomes of chance trials, or outcome space (Ω), a Borel field F of subsets of the outcome space, and a probability function or measure P defined over the elements of the outcome space.

For example, in analysing the propensity of 40 year-old men to survive to 41, we are not only interested in people who satisfy the repeatable conditions of being men and 40. We are also interested in a time-slice of those men at this or any other particular time. The propensity to survive to 41 may well differ in different epochs, cultures, ages and conditions.
The spacing condition is supposed to take account of such variability by generating a stable set Ss of events with similar features amongst those that are legitimately generated by the repeatable conditions. The net effect of introducing the sequence of repetitions Ss is precisely to restrict Identity1, as previously suggested, to only those probabilities defined over elements of Ω included in Ss.

Yet, the restriction does not dispense with all the problems associated with the probability-to-propensity half of the identity thesis. To see this, let me briefly invoke a different Salmon-type example where, unlike what is the case in Salmon’s own example, the time order of the events is not determined (Suárez, 2013, pp. 79-80). Consider my propensity to fly out to North America (F) in the Spring (S). The frequency of my travelling there during the last 10 years gives Prob (F / S) = 0.9. Now estimate further the probability for it to be spring on any given day of the year as Prob (S) = ¼, and the probability of my flying out to North America in any season, again on account of my last 10 years’ travelling there, as Prob (F) = 0.4. By means of Bayes’ theorem we can easily calculate the probability of it being spring in North America given that I fly out as Prob (S / F) = 0.56. However, while Prob (F / S) may be said to represent a propensity, it does not seem to make any sense to say that Prob (S / F) too represents a propensity. This judgement agrees with the intuition underlying all Salmon-type examples: propensities inherit the asymmetry of causation but probabilities do not. There is something about Spring in North America that causes my travelling there, but not the other way round: nothing in my travelling plans alters the North American seasons. The example also makes it clear that it is the causal asymmetry itself and not any time-asymmetry that is inherited by propensities.
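The Bayes’ theorem step in the flying example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `inverse_probability` and the variable names are hypothetical, and the three input figures are the ones quoted in the text.

```python
from fractions import Fraction


def inverse_probability(p_f_given_s, p_s, p_f):
    """Bayes' theorem: Prob(S / F) = Prob(F / S) * Prob(S) / Prob(F)."""
    if p_f == 0:
        raise ValueError("Prob(F) must be non-zero for Prob(S / F) to be defined")
    return p_f_given_s * p_s / p_f


# Figures quoted in the text for the flying example.
p_f_given_s = Fraction(9, 10)  # Prob(F / S): flying out, given that it is spring
p_s = Fraction(1, 4)           # Prob(S): it being spring on a given day
p_f = Fraction(2, 5)           # Prob(F): flying out in any season

p_s_given_f = inverse_probability(p_f_given_s, p_s, p_f)
print(p_s_given_f, float(p_s_given_f))  # prints: 9/16 0.5625
```

The exact value is 9/16, which rounds to the 0.56 given in the text. Nothing in the arithmetic distinguishes the two directions of conditioning, which is the point at issue: the inversion is always available numerically, whether or not the result can be read as a propensity.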
For the underlying intuition cannot this time be grounded upon the time order of events – since Spring is an extended event that comprises my flying out there, but cannot be said to occur either before or after it. Propensities are asymmetric in the way causation is – and this is irrespective of any time-asymmetry that the causal relation may be judged to possess.

The obvious response from the point of view of the suggestion that we are considering is to assert that Prob (F / S) is a fundamental probability (and can thus be written as Probf (F)), while Prob (S / F) is not. This would explain the asymmetry in our intuitions. However, the distinguishing grounds for the fundamentality of Prob (F / S) seem lacking – it is just not the case that S is in the set of repeatable conditions that give rise to F, since I fly out to North America not only in the spring, but throughout the year, in any season. Are we to say that “F” is a distinct event depending on whether it happens in the spring or any other season? Why? The distinction between different types of F seems arbitrary in this context unless it is related to the exercise of distinct underlying propensities. But to say that “F” is a distinct event depending on the propensities that generate it is just to say that it is not always the result of the same set of generating repeatable conditions. Hence it is circular as an elucidation of what probabilities are fundamental and may be freely interpreted as propensities – since it presupposes the very notion of propensity that it is intended to capture.

Moreover, the set of repeatable conditions is meant to include the physical features of the chance set up that generate the long run sequence of outcomes of a particular experimental type. But, if so, then neither F nor S seems to be part of the other’s set of repeatable conditions. F certainly does not cause S, in the sense that S may be generated without F. But then, neither does S cause F in this sense.
Hence, on Gillies’ prescription neither Prob (F / S) nor Prob (S / F) is a fundamental probability – so neither should receive a propensity interpretation. Yet, ex hypothesi, Prob (F / S) measures my propensity to travel to North America in the spring in this example – and our intuitions seem as solid regarding this propensity ascription as any other we know. So the causal asymmetry of propensities does not seem to be captured by the distinction between fundamental and non-fundamental conditional probabilities after all. It seems that some non-fundamental conditional probabilities are propensities too, while some fundamental conditional probabilities are perhaps not propensities at all.

The argument extends to the formal version of Gillies’ distinction in terms of probability systems. The introduction of the sequence of repetitions Ss as part of the probability system is a genuine innovation. It is supposed to take account of just those features that are the distinguishing key to fundamental probabilities – and it thus, under the present suggestion, serves to distinguish propensities from other probabilities. An implicit commitment to Identity2 drives the idea that such repeatable conditions must be part of the formally defined probability system. The same commitment underlies the thought that sequences of repetitions Ss ultimately are sets of events generated by the repeatable conditions S. In other words, instead of representing such conditions directly as physical facts regarding the chance set up and its properties (which is what they presumably are), we represent them indirectly as some set of the events that result out of the operation of this chance set up. Why? The reason is intimately connected with the identity thesis: We aim to define the probability function over the repeatable conditions so as to preserve the identity thesis (Identity2).
Since probabilities are mathematically defined only over events or propositions, we had better find a representation of the generating conditions (i.e. the actual propensities) in terms of events – and Ss is the best we can come up with.

Now a crucial question opens up: Is the sequence of repetitions Ss included in the outcome event set Ω? There are several reasons to think that any long run propensity theory must be committed to a positive answer to this question. First, the introduction of probability systems only helps to distinguish fundamental probabilities if indeed Ss ⊆ Ω, as I remarked above. Second, the long run theory is empiricist in the sense that propensities are understood to be subject to empirical refutation or confirmation directly by experiment. This requires Ss ⊆ Ω, since Gillies shows how the falsifying rule follows from an Axiom of Independent Repetitions that may be formally stated as a variety of a probability system. 18 Although he does not make it explicit, an assumption in this derivation is that the sequence of repetitions Ss is included amongst the outcome events, and in fact it determines which amongst the outcome events are genuinely ‘independent’. Finally, that Ss ⊆ Ω is entailed by the fact that the probability function P is explicitly defined over the elements of Ss, as Gillies writes (ibid, p. 167): “P (A / B) is really an abbreviation for P (A / B & Ss), although the underlying repeatable conditions Ss are never written out explicitly within Kolmogorov’s formalism”.

The introduction of Ss in Ω is thus entailed by the basic commitments of the long run version of the propensity theory and, in particular, by the empiricism that runs through it. Yet, it is also the source of all its problems and difficulties.
Once the repeatable conditions are included in the outcome space, and the probability function is defined over them, there seems to be no reason at all why probabilities may not be meaningfully reversed in the way described by Humphreys. Yet, as Salmon-type counterexamples show, most inverse conditional probabilities cannot be interpreted as propensities. And, as Humphreys’ own example illustrates, most propensities cannot be represented as conditional probabilities. Both halves of the identity thesis are false – and there is no need to define propensities as probabilities. It is rather time to look for an understanding of propensities that skirts such commitments.

18 The Axiom of Independent Repetitions states that if (Ss, Ω, F, P) is a probability system, then so is (Ss^n, Ω^n, F^n, P^(n)), where Ss^n is the sequence of repetitions formed out of repeatedly choosing the same n-tuple of elements of Ss; Ω^n is the n-fold Cartesian product of Ω; F^n is the minimum Borel field of subsets of Ω^n that contains F; and the measure P^(n) on F^n is the n-fold product measure of the measure P on F (Gillies, ibid, pp. 164-167).

6. Towards a Pragmatist Conception

The main aim of this paper is critical – since it is intended to show that empiricist accounts of propensity are essentially committed to the identity thesis and thus confront major objections arising out of Humphreys’ paradox. I have already completed this task, but one may wonder what the alternatives are. In this last section, I briefly sketch the outlines of a possible non-empiricist account. 19

In section 4, it was suggested that the relation between propensities and probabilities is not one of interpretation but of manifestation. Propensities manifest themselves in probabilities, and this ‘manifestation’ relation is sui generis and does not reduce to anything else.
It thus seems appropriate to introduce a new symbol “»” to represent it, while leaving open at this stage any questions regarding the connection between this propensity relation and other modal locutions and relations, including probability.

19 The account is presented in (Suárez, 2013), which also explains the reasons why this particularly non-empiricist conception of propensities may be said to be “pragmatist”.

Note that the change of representation immediately gets us out of Humphreys’ paradox. The propensities that are actually explanatory in this thought experiment can best be represented by means of the sui generis symbol “»” as follows:

i) E (t1) » Prob (I (t2)) = q, where 1 > q > 0.
ii) E (t1) & I (t2) » Prob (T (t3)) = p, where p > 0.
iii) E (t1) & ~I (t2) » Prob (T (t3)) = 0.

In all these expressions we are assuming that there are different propensities of the chance set up and experiment associated with different configurations expressed by different factual events at different times. 20 The propensities described by E (t1) are those that in any given way relate to the emission event at t1, etc. In the first case the propensity is defined at time t1, while in the second case it is defined at time t2. The last case (iii) is indeterminate – it depends on whether we take the facts at t1 to leave open what happens at t2, and in particular whether or not I (t2) takes place then.

These propensities manifest themselves in probability distributions over observable events. The distributions are subject to empirical testing directly by experiment, but the propensities are not. This is acceptable from the point of view of a non-empiricist understanding of chance and probabilistic dispositions. From this point of view, propensities do not interpret, but rather explain probabilities: They appear as part of an explanatory story. 21 And indeed the above expressions seem to capture all the explanatory power of the propensities invoked in the example.
In particular the propensities so described provide as good an explanation of all the relevant facts, namely: i) that some photons always reach the mirror but not all photons do, ii) that some photons are transmitted, and iii) that no photons are transmitted which fail to reach the mirror. This explanatory work is performed without any recourse to conditional probability.

20 An alternative interpretation of i)-iii) above, which makes the point even sharper, associates the propensities with dynamical properties of either the set up or the photons themselves as they travel from the source, at different instants of time. Thus propensities are not even described as events – so they cannot even in principle be defined as probabilities.

21 Similar views had been voiced earlier by Levi (1980, chapter 12).

Note also that no problems arise here in connection with conditional probability and, in particular, no inconsistency with the Kolmogorov axioms may be derived from these expressions with the help of any principle such as (CIProp) or (CIProb). In addition, from the point of view of this representation, it becomes clear that (CIProp) is as suspect as (CIProb) and should be rejected. 22 In particular, there is no reason on this representation to expect the probability of incidence given transmission, or Prop1 (I (t2) / T (t3) & E (t1)), to measure or represent any propensity whatever. Nor is there any point in introducing sub-indexes to the probabilities since they, unlike propensities, are not typically time-indexed (and in any case the events that probabilities are defined over already carry implicit indexes). Instead we may simply assume that Prop1 (I (t2) / T (t3) & E (t1)) = 1 if transmission is understood to entail incidence, or if in fact I (t2) takes place, and that it equals 0 if I (t2) fails to take place. 23

The non-empiricist approach outlined here must of course be developed further.
Yet, much progress has already been achieved, and it is the aim of this paper to show why. We have seen that the empiricist attempts to identify propensities – or objective chances – with probabilities simpliciter run into difficulties. Nevertheless the efforts and achievements of the empiricist tradition provide us with great insight into the content and the limits of the identity thesis (both ways, as Identity1 and Identity2). Its rejection correspondingly opens up new vistas on the relation between propensity and probability and hence helps to redefine both concepts. We are in need of a more nuanced and subtle theoretical understanding of the explanatory connections between propensities and probabilities, and the terminology provided in this article suggests a possible starting point.

22 Or, at the very least, it should be replaced with an equivalent principle stating the appropriate causal relations amongst propensities. Since its expression would involve “»” as opposed to conditional probability, there is no reason to expect any contradiction with the principles i)-iii) above, or with the Kolmogorov axioms.

23 These are amongst the different intuitions that the expression elicits – thanks to Alan Hájek for pointing out that they are inconsistent with each other, not just with Humphreys’ intuition.

Acknowledgements

For comments and reactions I thank two helpful EJPS referees, and audiences at the British Society for the Philosophy of Science meeting (2012), at Cologne, Lausanne and Bern Universities, and at the London School of Economics. Financial support is acknowledged from the Spanish Government research project FFI2011-29834-C03-01, and the European Commission under the Marie Curie programme grant PIEF-GA-2012-329430.

References

Carnap, R. (1966), Logical Foundations of Probability, 2nd Edition, Chicago: University of Chicago Press.
Cartwright, N. (1989), Nature’s Capacities and their Measurement, Oxford: Oxford University Press.
Gillies, D. (2000), Philosophical Theories of Probability, London: Routledge.
Hacking, I. (1975), The Emergence of Probability, Cambridge: Cambridge University Press.
Hacking, I. (1990), The Taming of Chance, Cambridge: Cambridge University Press.
Hájek, A. (1997), “‘Mises redux’ – redux: Fifteen arguments against finite frequentism”, Erkenntnis, 45, pp. 209-227.
Hájek, A. (2003), “What conditional probability could not be”, Synthese, 137, pp. 273-323.
Hájek, A. (2009), “Fifteen arguments against hypothetical frequentism”, Erkenntnis, 70, pp. 211-235.
Hesslow, G. (1976), “Two notes on the probabilistic approach to causality”, Philosophy of Science, 43, pp. 290-92.
Humphreys, P. (1985), “Why propensities can not be probabilities”, The Philosophical Review, 94, pp. 557-570.
Humphreys, P. (2004), “Some considerations on conditional chances”, British Journal for the Philosophy of Science, 55, pp. 667-680.
Levi, I. (1980), The Enterprise of Knowledge: An Essay on Knowledge, Credal Probability, and Chance, Cambridge, Massachusetts: The MIT Press.
Mellor, D. H. (2005), Probability: A Philosophical Introduction, London: Routledge.
Popper, K. (1957), “The propensity interpretation of the calculus of probability, and the Quantum theory” in S. Körner (ed.), Observation and Interpretation, Proceedings of the Ninth Symposium of the Colston Research Society, University of Bristol, pp. 65-70.
Popper, K. (1959), “The propensity interpretation of probability”, British Journal for the Philosophy of Science, 10, pp. 25-42.
Popper, K. (1990), A World of Propensities, Bristol: Thoemmes.
Salmon, W. (1979), “Propensities: A discussion review of D. H. Mellor’s The Matter of Chance”, Erkenntnis, 14, pp. 183-216.
Suárez, M. (2013), “Propensities and Pragmatism”, The Journal of Philosophy CX (2), pp. 61-92.
Von Mises, R. (1928), Probability, Statistics and Truth, 2nd Edition, New York: Dover, 1957.