Brit. J. Phil. Sci. 57 (2006), 755–779

Generalizing the Lottery Paradox

Igor Douven and Timothy Williamson

ABSTRACT

This paper is concerned with formal solutions to the lottery paradox on which high probability defeasibly warrants acceptance. It considers some recently proposed solutions of this type and presents an argument showing that these solutions are trivial in that they boil down to the claim that perfect probability is sufficient for rational acceptability. The argument is then generalized, showing that a broad class of similar solutions faces the same problem.

1 An argument against some formal solutions to the lottery paradox
2 The argument generalized
3 Some variations
4 Adding modalities
5 Anticipated objections

Over the past decades, there has been a steadily growing interest in utilizing probability theory to elucidate, or even analyze, concepts central to traditional epistemology. Special attention in this regard has been given to the notion of rational acceptability. Many have found the following thesis at least prima facie a promising starting point for a probabilistic elucidation of that notion:

Sufficiency Thesis (ST) A proposition $\varphi$ is rationally acceptable if $\Pr(\varphi) > t$, where $\Pr$ is a probability distribution over propositions and $t$ is a threshold value close to 1.¹

Another plausible constraint is that when some propositions are rationally acceptable, so is their conjunction:

Conjunction Principle (CP) If each of the propositions $\varphi$ and $\psi$ is rationally acceptable, so is $\varphi \wedge \psi$.

¹ We think of the $\Pr$-function as representing the probability of the various propositions on the relevant evidence. We are neutral as to whether such evidential probabilities should be conceived as the degrees of belief of a rational agent or more objectively, for example in the manner of Williamson ([2000], pp. 209–37). In any case, we assume that they satisfy the standard axioms of probability theory.

© The Author (2006).
Published by Oxford University Press on behalf of British Society for the Philosophy of Science. All rights reserved. doi:10.1093/bjps/axl022 For Permissions, please email: journals.permissions@oxfordjournals.org

From CP we can easily derive its generalization to any finite number of conjuncts, by mathematical induction.

Of course, one can think of readings of 'rationally acceptable' on which CP fails. Suppose that we have generalized the consequence relation beyond deduction to include the results of good abductive, inductive, statistical and probabilistic reasoning too. Call the generalized relation between premise sets and conclusions 'general consequence'. Thus a deductive consequence of a premise set is also a general consequence of that set, but a general consequence of a premise set need not be a deductive consequence of the set. A proposition is generally consistent with a premise set if and only if its negation is not a general consequence of that set. We assume a weak form of transitivity for general consequence: if each proposition in the set $\Delta$ is a general consequence of the premise set $\Gamma$, then any general consequence of the combined premise set $\Gamma \cup \Delta$ is already a general consequence of $\Gamma$ itself.² The rationale for this principle is to make general consequence accumulative, in the sense that we can freely use any general consequences that we have already drawn from a premise set in drawing further general consequences from that set, for otherwise what use are chains of non-deductive reasoning?

² Full transitivity would require that if each proposition in a set $\Delta$ is a general consequence of a set $\Gamma$, then any general consequence of $\Delta$ is a general consequence of $\Gamma$. Our weak transitivity principle is the special case of this in which for '$\Delta$' we substitute '$\Gamma \cup \Delta$' (note that every member of $\Gamma$ is a deductive consequence and therefore a general consequence of $\Gamma$). We do not assume full transitivity for general consequence because it makes general consequence monotonic, in the sense that any general consequence of a set $\Gamma$ would also count as a general consequence of $\Gamma \cup \Delta$, no matter what extra premises $\Delta$ contains, whereas most forms of non-deductive reasoning are non-monotonic: for example, the best explanation of some evidence may be inconsistent with an enlarged evidence set. Full transitivity implies monotonicity because each proposition in $\Gamma$ is a deductive consequence and therefore a general consequence of $\Gamma \cup \Delta$, so by full transitivity any general consequence of $\Gamma$ is a general consequence of $\Gamma \cup \Delta$ too. By contrast, the weak transitivity principle in the text does not imply monotonicity. To show this, we give an artificial interpretation of 'general consequence' on which it properly extends deductive consequence and weak transitivity holds but monotonicity does not.

Consider a language of propositional logic with only finitely many atomic letters. Thus there are only finitely many models (assignments of truth-values to atomic letters). Assign each model a real number as its 'value'; different models may be assigned the same value, but at least one model must be assigned a higher value than some other. The 'best' models in a set are those assigned the highest 'value' of any in the set; thus any non-empty set of models has at least one best member. Interpret '$\varphi$ is a general consequence of $\Gamma$' to mean that $\varphi$ is true in each of the best members of the set of models in which every member of $\Gamma$ is true (if that set of models is empty, $\varphi$ is vacuously a general consequence of $\Gamma$). In brief, $\varphi$ is a general consequence of $\Gamma$ iff every best model of $\Gamma$ is a model of $\varphi$.

On this interpretation, general consequence extends deductive consequence, for if $\varphi$ is a deductive consequence of $\Gamma$, then every model of $\Gamma$ is a model of $\varphi$; a fortiori, every best model of $\Gamma$ is a model of $\varphi$. Moreover, weak transitivity holds. For suppose that every member of $\Delta$ is a general consequence of $\Gamma$, and that $\varphi$ is a general consequence of $\Gamma \cup \Delta$. Let $M$ be a best model of $\Gamma$, so $M$ is a model of $\Delta$. Thus $M$ is a model of $\Gamma \cup \Delta$. If $M$ were not a best model of $\Gamma \cup \Delta$, another model $M^*$ of $\Gamma \cup \Delta$ would be better; but $M^*$ would be a model of $\Gamma$, so $M$ would not be a best model of $\Gamma$. Hence $M$ is a best model of $\Gamma \cup \Delta$. As $\varphi$ is a general consequence of $\Gamma \cup \Delta$, $M$ is a model of $\varphi$. Thus every best model of $\Gamma$ is a model of $\varphi$, so $\varphi$ is a general consequence of $\Gamma$.

Nevertheless, we can show that monotonicity fails, as follows. For each model $M$, let $\alpha(M)$ be the conjunction of the atomic letters true in $M$ and the negations of the atomic letters false in $M$; thus $\alpha(M)$ is true in $M$ and in no other model. Let $\beta$ be the disjunction of $\alpha(M)$ for each best model $M$ (best member of the set of all models). Hence $\beta$ is true in each best model and in no other. So $\beta$ is a general consequence of the null set. But $\beta$ is no general consequence of $\{\neg\beta\}$, for $\neg\beta$ has some models (by hypothesis, not every model takes the maximum value). Therefore general consequence is non-monotonic. Since full transitivity implies monotonicity, full transitivity also fails on this interpretation.

Now stipulate that a proposition is rationally acceptable in given circumstances if and only if it is generally consistent with the evidence available in those circumstances. Then we can expect CP to fail.
For in virtually all circumstances the available evidence will leave some proposition $\varphi$ undecided, in the sense that neither $\varphi$ nor $\neg\varphi$ is a general consequence of the evidence available. By our stipulation, each of $\varphi$ and $\neg\varphi$ is rationally acceptable in those circumstances, because it is generally consistent with the evidence. But their conjunction $\varphi \wedge \neg\varphi$ is not rationally acceptable, for its tautological negation $\neg(\varphi \wedge \neg\varphi)$ is a deductive consequence and therefore a general consequence of any evidence.

However, we could have made the alternative and perhaps more natural stipulation that a proposition is rationally acceptable in given circumstances if and only if it is a general consequence of the evidence available in those circumstances. Then CP holds. For suppose that each of the propositions $\varphi$ and $\psi$ is rationally acceptable in given circumstances. Let $E$ be the evidence available in those circumstances. Of course $\varphi \wedge \psi$ is a deductive consequence, and therefore a general consequence, of $E \cup \{\varphi, \psi\}$. But, by our new stipulation, each of $\varphi$ and $\psi$ is a general consequence of $E$. Hence, by the accumulation principle, $\varphi \wedge \psi$ is already a general consequence of $E$. Therefore, by the new stipulation again, $\varphi \wedge \psi$ is rationally acceptable in the given circumstances, as CP requires. Note that ST still sounds plausible on this understanding of being 'rationally acceptable'. For one might think that if the probability of $\varphi$ on the evidence available in given circumstances exceeds a high enough threshold, then $\varphi$ is beyond reasonable doubt in those circumstances, and so should count as a general consequence of the evidence, in which case it is rationally acceptable by the new stipulation. For example, perhaps it is a general consequence of our present evidence that the earth has existed for more than ten thousand years, because that proposition is so probable on that evidence.
In what follows, we do not assume this or any other particular account of rational acceptability, but instead rely on the reader's informal understanding of the notion. We hope that the preceding remarks indicate the attractions of ST and the structural difficulty of giving up CP.

It has long been known, however, that ST, when combined with CP, leads to the untoward conclusion that $\bot$, the inconsistent proposition, can be rationally acceptable. A simple argument for this goes as follows. Consider an $n$-ticket lottery known to be fair and to have exactly one winner, and with $1 - 1/n > t$. Given ST, all propositions in the set

$$\mathrm{LOT} = \{\langle\text{Ticket \#}i\text{ will lose}\rangle \mid 1 \le i \le n\}$$

are rationally acceptable.³ The same is true of the proposition that some ticket will win, of course, for that is assumed to be known and hence to have probability 1. But the conjunction of the latter proposition and all the members of LOT forms an outright contradiction, which should now, given CP, be rationally acceptable, too.⁴

The foregoing argument is attributable to Kyburg ([1961]), and since its first presentation has commonly been known as 'the lottery paradox'. Kyburg's own response to it, which is now almost generally regarded as being too drastic, was to abandon CP. A currently more popular type of response emanates from the (correct) idea that, by itself, the lottery paradox does not show ST to be completely off the mark; something in the vicinity of that thesis might still be tenable. Virtually all proposals that start from this idea let high probability defeasibly warrant acceptance, and can be schematically represented as follows:

    $\varphi$ is rationally acceptable if $\Pr(\varphi) > t$, unless defeater $D$ holds of $\varphi$.
(1)

Another general feature of these proposals is that they aim to define a defeater that applies as selectively as possible to 'lottery propositions', such as the elements of LOT; many or even all other propositions that have a probability above the threshold are supposed still to qualify as rationally acceptable on account of their high probability.

This paper will be concerned with proposals of this type, and only with those that are formal in the sense that they define the defeater in terms that are probabilistic or broadly logical. In particular it argues that such solutions either are trivial, in that they boil down to the claim that probability 1 is sufficient for rational acceptability, or still have as a consequence that $\bot$ or some almost equally discreditable proposition is rationally acceptable (even if perhaps they solve the lottery paradox in the narrower sense that they succeed in blocking Kyburg's argument).

To underline the significance of this result, despite its being restricted to formal solutions, let us first say that in analytic philosophy the prima facie attractiveness of a formal approach should hardly need mentioning. Moreover, because differences in philosophers' understanding of the notion of rational acceptability are sometimes quite subtle and hard to detect, and therefore harbour some danger of leading to equivocation and spurious debate, the use of relatively strong analytic tools seems particularly called for in the present debate. Also, as it is easier to implement notions that are formal in the indicated sense, those working in the field of Artificial Intelligence, or at any rate those agreeing with Pollock ([1995], p. xi) that 'The implementability of a theory of rationality is a necessary condition for its correctness', have an especially good reason for aspiring to formality here. For only a formal solution to the lottery paradox can be embedded in (or perhaps even serve as a basis for) a formal theory of rational acceptability.

³ Throughout the paper, a sentence surrounded by angle brackets refers to the sentence's propositional content.

⁴ In terms of the understanding of rational acceptability as the property of being a general consequence of the evidence, the upshot of the argument is that if general consequence satisfies ST and CP, it violates the following consistency constraint: a contradiction is a general consequence of a set only if it is a deductive consequence of that set.

1 An argument against some formal solutions to the lottery paradox

In this section we briefly review some recent formal proposals of what $D$ in (1) should be and show why they fail.

Call a set of propositions minimally inconsistent iff it is inconsistent and has no proper subset that is inconsistent. Then, glossing over some details that are inessential for present purposes, Pollock's ([1995]) proposal for the defeater is

(2) being a member of a minimally inconsistent set of propositions each of which has a probability above $t$.⁵

Another proposal, which can be distilled from Ryan ([1996]), is this:

(3) being a member of a set of propositions such that (i) each member of the set has a probability above $t$ and (ii) the probability that every member of the set is true is not above $t$.

And our final example reads as follows:

(4) being a member of a probabilistically self-undermining set,

where a set of propositions $\Phi$ with cardinality $|\Phi|$ is defined to be probabilistically self-undermining iff for all $\varphi \in \Phi$: $\Pr(\varphi) > t$ and $\Pr(\varphi \mid \Phi - \varphi) \le t$ (where $\Phi - \varphi$ is the conjunction of all members of $\Phi$ except $\varphi$). This is essentially Douven's ([2002]) proposal.
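The three defeaters can be checked mechanically on a small lottery. The following is a minimal sketch, not part of the original proposals: it assumes propositions are modelled as sets of worlds, that world $i$ is the one in which ticket #$i$ wins, and that $\Pr$ is equiprobable, with $n = 20$ and $t = 0.9$ chosen purely for illustration. All function names are hypothetical.

```python
from fractions import Fraction

def pr(prop, worlds):
    """Probability of a proposition (a set of worlds) under an equiprobable Pr."""
    return Fraction(len(prop), len(worlds))

def conj(props, worlds):
    """Conjunction of propositions = intersection of their sets of worlds."""
    result = frozenset(worlds)
    for p in props:
        result &= p
    return result

def minimally_inconsistent(props, worlds):
    """Inconsistent, yet consistent as soon as any single member is dropped."""
    props = list(props)
    if conj(props, worlds):
        return False
    return all(conj(props[:i] + props[i + 1:], worlds) for i in range(len(props)))

def low_joint_probability(props, worlds, t):
    """Clause (ii) of (3): the probability that all members are true is not above t."""
    return not pr(conj(props, worlds), worlds) > t

def self_undermining(props, worlds, t):
    """Definition used in (4). Assumption: an undefined conditional probability fails."""
    props = list(props)
    for i, p in enumerate(props):
        rest = conj(props[:i] + props[i + 1:], worlds)
        if not rest:
            return False
        if not pr(p, worlds) > t or Fraction(len(p & rest), len(rest)) > t:
            return False
    return True

n, t = 20, Fraction(9, 10)
worlds = frozenset(range(1, n + 1))      # world i: ticket #i wins
LOT = [worlds - {i} for i in worlds]     # <Ticket #i will lose>
all_high = all(pr(p, worlds) > t for p in LOT)
```

On this model every member of LOT has probability 19/20, and LOT itself is minimally inconsistent, has joint probability 0 and is probabilistically self-undermining, so each of (2), (3) and (4) defeats every lottery proposition.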
One readily verifies that substituting any of these proposals for the schematic letter $D$ in (1) yields a thesis on which none of the elements of LOT comes out as being rationally acceptable. Consequently, CP can be combined with any of those theses without engendering Kyburg's paradox.

As adumbrated, however, the challenge is not just to define a defeater that applies to the members of LOT and to similar propositions; the challenge is to define a defeater that does so selectively. And the argument now to be presented shows that in this respect the above proposals do quite badly.⁶

⁵ A detail still worth mentioning is that Pollock's ([1995], p. 66) full proposal appeals to a notion of projectibility. His general formal approach notwithstanding, however, this notion nowhere receives a formal definition; it in effect is not properly defined at all but instead is said to be related to the notion of the same name in Goodman's ([1954]) work on induction (a notion that is notoriously vague).

⁶ Similar arguments are to be found in Korb ([1992]), Pollock ([1995], pp. 64–5) and Olin ([2003], pp. 93–4).

Let $\varphi$ be any proposition such that $t < \Pr(\varphi) < 1$. Then consider the set

$$\Gamma = \{\neg\varphi \vee \langle\text{Ticket \#}i\text{ of lottery } L \text{ will lose}\rangle \mid 1 \le i \le n\},$$

again with $1 - 1/n > t$. Now for all $i$:

$$\Pr(\neg\varphi \vee \langle\text{Ticket \#}i\text{ will lose}\rangle) > t, \qquad (5)$$

for the second disjunct has a probability above the threshold and the probability of a disjunction is never less than that of its most probable disjunct. But given that it is part of the background knowledge that one of the tickets #1 – #$n$ will win:

$$\Pr(\varphi \mid \neg\varphi \vee \langle\text{Ticket \#1 will lose}\rangle, \ldots, \neg\varphi \vee \langle\text{Ticket \#}n\text{ will lose}\rangle) = 0 \qquad (6)$$

and for all $i$:

$$\Pr(\neg\varphi \vee \langle\text{Ticket \#}i\text{ will lose}\rangle \mid \neg\varphi \vee \langle\text{Ticket \#1 will lose}\rangle, \ldots, \neg\varphi \vee \langle\text{Ticket \#}(i-1)\text{ will lose}\rangle, \neg\varphi \vee \langle\text{Ticket \#}(i+1)\text{ will lose}\rangle, \ldots, \neg\varphi \vee \langle\text{Ticket \#}n\text{ will lose}\rangle, \varphi) = 0. \qquad (7)$$

After all, given the background knowledge we are supposing, the set $\Gamma \cup \{\varphi\}$ is inconsistent.

Now note, first, that there must be a $\Gamma' \subseteq \Gamma$ such that $\Gamma' \cup \{\varphi\}$ is minimally inconsistent.⁷ Secondly, because $\Gamma \cup \{\varphi\}$ is inconsistent we know that at least one of its members must be false, so that the probability that all members are true equals 0 and thus is not above $t$. And finally note that from (5), (6) and (7) it follows that $\Gamma \cup \{\varphi\}$ is a probabilistically self-undermining set. Thus, as $\varphi$ is an arbitrary proposition having a probability above the threshold without having perfect probability, it appears that the combination of (1) with any of (2), (3) and (4) constitutes a thesis that tells us no more than that propositions having probability 1 are rationally acceptable. That is to say, not only lottery propositions, but all propositions having non-perfect probability fail to qualify as rationally acceptable on the theses resulting from the above proposals.

⁷ For as $\Gamma \cup \{\varphi\}$ is a finite inconsistent set, it has a minimally inconsistent subset $\Delta$. But $\Gamma$ itself is consistent because every member of it is true if $\varphi$ is false, which can be since $\Pr(\varphi) < 1$. Thus $\varphi \in \Delta$, so $\Delta = \Gamma' \cup \{\varphi\}$ for some $\Gamma' \subseteq \Gamma$.

2 The argument generalized

The argument of the foregoing section may seem to present the sort of problem that can be overcome by tinkering further with the definition of the defeater. Evidently none of (2), (3) and (4) specifies a defeater that is strict enough; given any of those definitions, too many things count as defeaters, so that too few high probability propositions qualify as rationally acceptable.
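The trivialization argument of Section 1 can be verified on a concrete model. The sketch below rests on illustrative assumptions that are not in the original: worlds are pairs (winning ticket, $k$), $\Pr$ is equiprobable, $\varphi$ is the hypothetical proposition that $k \neq 0$ (so $\Pr(\varphi) = 19/20$), and $n = 20$, $t = 0.9$. It checks (5), (6) and (7) and the inconsistency of $\Gamma \cup \{\varphi\}$ for this one model only.

```python
from fractions import Fraction
from itertools import product

n, m, t = 20, 20, Fraction(9, 10)
# A world fixes the winning ticket and an unrelated index k (illustrative model).
worlds = frozenset(product(range(1, n + 1), range(m)))

def pr(prop):
    """Equiprobable probability of a set of worlds."""
    return Fraction(len(prop), len(worlds))

def cond(prop, given):
    """Conditional probability Pr(prop | given); 'given' must be non-empty."""
    return Fraction(len(prop & given), len(given))

def conj(props):
    """Conjunction as intersection of sets of worlds."""
    result = worlds
    for p in props:
        result &= p
    return result

def loses(i):
    """<Ticket #i will lose>."""
    return frozenset(w for w in worlds if w[0] != i)

phi = frozenset(w for w in worlds if w[1] != 0)   # t < Pr(phi) = 19/20 < 1
not_phi = worlds - phi
gamma = [not_phi | loses(i) for i in range(1, n + 1)]

eq5 = all(pr(g) > t for g in gamma)
eq6 = cond(phi, conj(gamma)) == 0
eq7 = all(
    cond(g, conj(gamma[:i] + gamma[i + 1:] + [phi])) == 0
    for i, g in enumerate(gamma)
)
inconsistent = not conj(gamma + [phi])
```

Since every member of $\Gamma \cup \{\varphi\}$ has probability above $t$ while each has conditional probability 0 given the rest, $\varphi$ is defeated in this model even though it has nothing to do with the lottery.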
But it might seem not too difficult to amend these definitions so that, to begin with, the members of $\Gamma \cup \{\varphi\}$ do not have a defeater. To mention a very simple amendment that already seems to do the trick, we could have Pollock's solution read that a proposition is rationally acceptable if it has high probability and besides is not an element of a minimally inconsistent set of propositions each two members of which are probabilistically negatively relevant to one another, and similarly for the other two proposals. That would certainly block the above argument, for nothing in it excludes that, for instance,

$$\Pr(\varphi \mid \neg\varphi \vee \langle\text{Ticket \#14 will lose}\rangle) > \Pr(\varphi).$$

It thus could no longer be concluded that every proposition that is highly but not perfectly probable is a member of a set of propositions each of which is defeated.

A first worry one may have about this and similar amendments is that they are ad hoc. That worry aside, however, with such amendments there remains the nagging doubt that there might be some presently overlooked 'trivialization argument' similar to the one propounded previously. As it turns out, such a doubt would be justified, for the argument of Section 1 generalizes: it can be proved that a large class of proposals similar to the ones considered above fail for what is at root the same reason for which those were seen to fail.⁸ To show this, we need some terminology.

⁸ Incidentally, the amendment just mentioned already comes to grief over the following argument: let $\varphi$ again be any proposition such that $t < \Pr(\varphi) < 1$. Suppose the same holds true of each of $\psi_1, \ldots, \psi_n$, and let each element of the set $\{\varphi, \psi_1, \ldots, \psi_n\}$ be probabilistically independent of each consistent truth-function of the other elements of the set. Then, just provided $n$ is large enough, it will hold that $\Pr(\varphi \wedge \psi_1 \wedge \cdots \wedge \psi_n) < 1 - t$ and hence that $\Pr(\neg\varphi \vee \neg\psi_1 \vee \cdots \vee \neg\psi_n) > t$. Then, if we add the suggested clause about negative probabilistic relevance to the proposals of Pollock, Ryan and Douven, all elements of $\{\varphi, \psi_1, \ldots, \psi_n, \neg\varphi \vee \neg\psi_1 \vee \cdots \vee \neg\psi_n\}$ are rationally acceptable on those proposals. But as the set is inconsistent, it follows from CP that $\bot$ is rationally acceptable.

DEFINITION 2.1. Let $W$ be a set of worlds, and think of propositions as subsets of $W$. Further assume a probability distribution $\Pr$ on $\wp(W)$. Then $f$ is an automorphism of $\langle W, \Pr\rangle$ iff $f$ is a function from $\wp(W)$ onto itself that satisfies these conditions:

1. $f(\varphi \wedge \psi) = f(\varphi) \wedge f(\psi)$;
2. $f(\neg\varphi) = \neg f(\varphi)$;
3. $\Pr(\varphi) = \Pr(f(\varphi))$,

for all propositions $\varphi, \psi \in \wp(W)$. □

A structural property of propositions is any property $P$ such that for any proposition $\varphi$ and any automorphism of propositions $f$, $\varphi$ has $P$ iff $f(\varphi)$ has $P$.⁹ An aggregative property of propositions is any property $P$ such that whenever $\varphi$ has $P$ and $\psi$ has $P$, so has $\varphi \wedge \psi$. Call a probability distribution $\Pr$ on a set $W$ of worlds equiprobable iff, for all $w, w' \in W$, $\Pr(\{w\}) = \Pr(\{w'\})$. Because in most of what follows a finite probability space will be assumed, it is useful to note that if $W$ is finite and $\Pr$ an equiprobable distribution on $W$, then $\Pr(\varphi) = |\varphi|/|W|$, for all $\varphi \in \wp(W)$; similarly, $\Pr(\varphi \mid \psi) = |\varphi \wedge \psi|/|\psi|$, for all $\varphi, \psi \in \wp(W)$. Finally, we define $\varphi$ to be inconsistent iff $\varphi = \emptyset = \bot$. We then have the following:

PROPOSITION 2.1. Let $W$ be finite and let $\Pr$ be an equiprobable distribution on $W$. Further, let $P$ be structural, $Q$ aggregative and $P$ sufficient for $Q$. Then if some proposition $\varphi$ such that $\Pr(\varphi) < 1$ has $P$, then $\bot$ has $Q$.

PROOF. Assume the conditions hold for properties $P$ and $Q$, and that $\Pr(\varphi) < 1$ and $\varphi$ has $P$. As $\Pr(W) = 1 \ne \Pr(\varphi)$, $W \ne \varphi$, so for some $w^* \in W$, $w^* \notin \varphi$.
Then for all $w_i \in W$, let $\pi_i$ be the permutation on $W$ such that $\pi_i(w_i) = w^*$, $\pi_i(w^*) = w_i$ and $\pi_i(w) = w$ for all other $w \in W$. Define $f_i(\psi) \equiv \{\pi_i(w) \mid w \in \psi\}$ for all $\psi \in \wp(W)$. Each such $f_i$ automatically satisfies the first two conditions of Definition 2.1. It satisfies the third because, given that $W$ is finite and $\Pr$ equiprobable, any two propositions with the same number of worlds have the same probability. Thus each $f_i$ is an automorphism of propositions. Let $W = \{w_1, \ldots, w_n\}$. Since, by assumption, $\varphi$ has $P$, and $P$ is structural, $P$ also holds of $f_i(\varphi)$, for all $i$: $1 \le i \le n$. As $P$ is sufficient for $Q$, every $f_i(\varphi)$ has $Q$, too. Note that for each $i$, $w_i \notin f_i(\varphi)$. Then $f_1(\varphi) \wedge \cdots \wedge f_n(\varphi) = \bot$. Because $Q$ is aggregative, $Q$ holds of $\bot$. □

⁹ Note that strictly speaking a property is (or fails to be) structural only relative to a given probability model. So we should really say that, for instance, a property $P$ is $\langle W, \Pr\rangle$-structural, for a certain probability model $\langle W, \Pr\rangle$. However, below context will always make it obvious relative to which model a property is said or assumed to be structural, so that explicit reference to the model can be suppressed.

Before showing how this bears on solutions to the lottery paradox of the kind considered in this paper, we first point to a simple corollary of the above result:

COROLLARY 2.2. Let $\Pr$ be an equiprobable distribution on a finite set $W$ and let $P$ be both structural and aggregative. Then if some proposition $\varphi$ such that $\Pr(\varphi) < 1$ has $P$, then $\bot$ has $P$.

PROOF. From the proof of Proposition 2.1 by taking $P$ and $Q$ to be identical. □

Now note that to require rational acceptability to validate CP is to require, in the above terminology, that it be an aggregative property.
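The construction used in the proof of Proposition 2.1 can be replayed on a small example. This is a minimal sketch under assumptions chosen for illustration (ten equiprobable worlds, $\varphi$ true in nine of them); the function names are hypothetical. It builds the images $f_i(\varphi)$ under the swap permutations and confirms that they are equiprobable with $\varphi$ yet jointly inconsistent.

```python
from fractions import Fraction

def swap_image(prop, a, b):
    """Image of a proposition under the permutation of W that swaps a and b."""
    swap = {a: b, b: a}
    return frozenset(swap.get(w, w) for w in prop)

def automorphic_images(worlds, phi):
    """f_i(phi) for each world w_i; requires some world w* outside phi."""
    w_star = next(w for w in worlds if w not in phi)
    return [swap_image(phi, w, w_star) for w in worlds]

worlds = list(range(10))
phi = frozenset(range(9))                  # Pr(phi) = 9/10 < 1
images = automorphic_images(worlds, phi)

same_probability = all(
    Fraction(len(img), len(worlds)) == Fraction(len(phi), len(worlds))
    for img in images
)
each_misses_its_world = all(w not in img for w, img in zip(worlds, images))
conjunction = frozenset(worlds).intersection(*images)
```

Because each image omits a different world, the conjunction of all the images is the empty proposition, which is why any structural property of $\varphi$ that is sufficient for an aggregative property forces that property onto $\bot$.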
It then follows immediately that if propositions with imperfect probability can be rationally acceptable while the inconsistent proposition is not, then rational acceptability is not a structural property.

Of course instances of (1) are attempts not to give a necessary and sufficient condition for rational acceptability, but only a sufficient one. The existence of a structural sufficient condition for rational acceptability, unlike that of a structural necessary and sufficient condition, does not imply that rational acceptability is itself structural. So, to address such proposals, we need Proposition 2.1, and not just the above corollary. Assuming that rational acceptability is aggregative, the proposition tells us that if there is a sufficient condition for rational acceptability that is both structural and non-trivial, in the sense that at least one proposition with probability less than 1 has it, then the inconsistent proposition is rationally acceptable. Hence, any proposal properly called a solution to the lottery paradox (which cannot allow the inconsistent proposition to be rationally acceptable) is, if structural, trivial, just as the proposals depicted in Section 1 were seen to be.

To appreciate exactly how damaging this is to the project of finding a formal solution to the lottery paradox, extend the term 'structural' to relations and predicates as well, in the following obvious way: a relation $R$ between propositions is structural if it holds for all propositions $\varphi_1, \ldots, \varphi_n$ and all automorphisms of propositions $f$ that $R(\varphi_1, \ldots, \varphi_n)$ iff $R(f(\varphi_1), \ldots, f(\varphi_n))$, and a predicate is structural if it denotes either a structural property or a structural relation. We further need the notion of degree of a predicate, which for an $n$-ary, $m$th-order predicate $R$ is defined inductively as follows:

1. $d(R(X_1, \ldots, X_n)) = 0$ if $R$ is primitive;
2. $d(R(X_1, \ldots, X_n) \vee Q(X_1, \ldots, X_n)) = \max[d(R(X_1, \ldots, X_n)), d(Q(X_1, \ldots, X_n))] + 1$;
3. $d(\neg R(X_1, \ldots, X_n)) = d(R(X_1, \ldots, X_n)) + 1$;
4. $d(\forall X_{i_1} \cdots \forall X_{i_k}$ […]

[…] $t) \wedge \varphi \in S]$.

But letting '$\mathrm{CON}(\varphi)$' mean that $\varphi$ is consistent and '$\mathrm{HP}(\varphi)$' that $\varphi$ has a probability above $t$, we can redefine $M(\varphi)$ simply, though somewhat tediously, using only first-order quantifiers as

$$
\begin{aligned}
M'(\varphi) \equiv \forall\varphi_{i_1} \cdots \forall\varphi_{i_{2^n-1}} \Big[ & \big(\varphi \ne \varphi_{i_1} \wedge \cdots \wedge \varphi \ne \varphi_{i_{2^n-1}}\big) \rightarrow \\
& \Big( \big(\neg\mathrm{CON}(\varphi \wedge \varphi_{i_1}) \wedge \mathrm{HP}(\varphi) \wedge \mathrm{HP}(\varphi_{i_1})\big) \\
& \vee \big(\neg\mathrm{CON}(\varphi \wedge \varphi_{i_1} \wedge \varphi_{i_2}) \wedge \mathrm{HP}(\varphi) \wedge \mathrm{HP}(\varphi_{i_1}) \wedge \mathrm{HP}(\varphi_{i_2})\big) \\
& \vee \cdots \vee \big(\neg\mathrm{CON}(\varphi \wedge \varphi_{i_1} \wedge \cdots \wedge \varphi_{i_{2^n-1}}) \wedge \mathrm{HP}(\varphi) \wedge \cdots \wedge \mathrm{HP}(\varphi_{i_{2^n-1}})\big) \Big) \Big].
\end{aligned}
$$

Clearly $M(\varphi)$ iff $M'(\varphi)$, for any $\varphi \in \wp(W)$. It should also be clear that basically the same trick will work just as well for any other predicate defined by means of higher-order quantifiers, and that it will do so given any finite cardinality of $W$.

To see the generality of Proposition 2.1, then, one only needs to go through the list of what can reasonably be regarded as the primitive predicates from (meta-)logic, set theory and probability theory and check that they define (on our model) structural properties or relations.¹¹ For example, given an automorphism $f$, $\varphi$ is consistent iff $\varphi \ne \varphi \wedge \neg\varphi$ iff $f(\varphi) \ne f(\varphi \wedge \neg\varphi)$ iff $f(\varphi) \ne f(\varphi) \wedge \neg f(\varphi)$ iff $f(\varphi)$ is consistent; similar procedures work for the other (meta-)logical and set-theoretic predicates. And the predicate 'probability' (and so, concomitantly, 'conditional probability', 'high probability', 'probability above $t$', etc.) is, of course, by definition a structural predicate, for we defined automorphisms as mappings that are, among other things, probability-preserving.

The above result pertains to any structural sufficient condition.
Nevertheless, as all extant formal solutions to the lottery paradox we are aware of instantiate (1), it may be useful to point out what exactly the result implies for schema (1): first, from Proposition 2.3 it follows that if having a particular defeater $D$ is a structural property, then not having that defeater is structural as well. As having a probability above $t$ is structural, it follows from the same proposition that the combination of having a probability above $t$ and lacking a structural defeater is a structural property. Thus, if the defeater $D$ in (1) is to be defined in terms that denote structural properties or relations, then any instance of that schema defines a sufficient condition for rational acceptability that is structural itself. It is easy to see that the predicates used in (2), (3) and (4) are all structural. But the foregoing shows that however complicated we make the definition of a defeater, we will not end up with an adequate solution to the lottery paradox if that definition is to be cast entirely in structural terms.

¹¹ Or one may consult Tarski ([1986]), where the logical notions are characterized as precisely those that are invariant under all 1:1 transformations of the domain of discourse onto itself. As Tarski (p. 151) remarks, properties concerning the number of elements in a class are, given this characterization, logical as well. And as already hinted at in the text, given a finite number of worlds and an equiprobable probability distribution, marginal probabilities just measure cardinalities and conditional probabilities ratios between cardinalities. See also van Benthem ([2002]).

3 Some variations

We consider two variations on the above result. These will show that some possible responses to the lottery paradox other than those of the schematic form (1) fare no better than the latter.

One type of alternative response is, for any proposition $\varphi$ such that $\Pr(\varphi) > t$, to take its high probability as a defeasible reason for holding it rationally acceptable provided that the high probability resulted from learning certain specific propositions or a certain type of proposition. The idea, in other words, is to define a sufficient condition for rational acceptability (partly) in dynamic terms. Assume, however, that the learning proceeds by the following rule:

Conditionalization (COND) For any proposition $\varphi$ such that $\Pr(\varphi) > 0$, let $\Pr_\varphi$ represent the probability distribution that results from $\Pr$ when $\varphi$ (and no stronger proposition) is learnt. Then we say that $\Pr_\varphi$ comes from $\Pr$ by conditionalization on $\varphi$ iff, for all propositions $\psi$: $\Pr_\varphi(\psi) = \Pr(\psi \mid \varphi)$.

Then if the defeater or defeaters are assumed to be both structural and non-trivial, we can still derive something just as bad as that the inconsistent proposition is rationally acceptable.

To see this, we need some more terminology. Given a sequence $S = \langle \Pr_0, \ldots, \Pr_n \rangle$ of probability distributions on $W$, we call $f$ an automorphism of $\langle W, S \rangle$ iff $f$ is an automorphism of each $\langle W, \Pr_i \rangle$. In the context of sequences of probability distributions we understand a structural property as one that is preserved under automorphisms in the redefined sense. Furthermore, call a distribution $\Pr$ on $W$ quasi-equiprobable iff for all $w, w' \in W$, if $\Pr(\{w\}) > 0$ and $\Pr(\{w'\}) > 0$, then $\Pr(\{w\}) = \Pr(\{w'\})$. It is clear that any equiprobable distribution is quasi-equiprobable but that the converse is not true. Finally, given a set $W$ and a sequence $S = \langle \Pr_0, \ldots, \Pr_n \rangle$ of probability distributions on $W$, define $W[i] \equiv \{w \in W \mid \Pr_i(\{w\}) > 0\}$. Then we say that a permutation $\pi$ on $W$ accords with $S$ iff, for all $w \in W$ and all $i$: $0 \le i \le n$, $w \in W[i]$ iff $\pi(w) \in W[i]$.
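The dynamic notions just introduced (COND, quasi-equiprobability, $W[i]$ and accordance with $S$) can be illustrated with a short sketch. The six-world model, the two learnt propositions and all function names are assumptions made for the example only; distributions are represented as dictionaries from worlds to exact fractions.

```python
from fractions import Fraction

def conditionalize(dist, prop):
    """COND: the distribution obtained from dist by learning prop."""
    total = sum(dist[w] for w in prop)
    if total == 0:
        raise ValueError("COND is only defined when Pr(prop) > 0")
    return {w: (dist[w] / total if w in prop else Fraction(0)) for w in dist}

def support(dist):
    """W[i]: the worlds that receive non-zero probability."""
    return frozenset(w for w, p in dist.items() if p > 0)

def is_quasi_equiprobable(dist):
    """All worlds with non-zero probability are equally probable."""
    return len({p for p in dist.values() if p > 0}) == 1

def accords(perm, sequence):
    """A permutation accords with S iff it maps each W[i] onto itself."""
    return all(
        (w in support(d)) == (perm[w] in support(d))
        for d in sequence
        for w in perm
    )

worlds = range(6)
pr0 = {w: Fraction(1, 6) for w in worlds}      # equiprobable starting point
pr1 = conditionalize(pr0, {0, 1, 2, 3})        # learn a first proposition
pr2 = conditionalize(pr1, {0, 1})              # then a stronger one
S = [pr0, pr1, pr2]

identity = {w: w for w in worlds}
swap_0_1 = {**identity, 0: 1, 1: 0}            # stays inside every W[i]
swap_0_5 = {**identity, 0: 5, 5: 0}            # moves world 0 out of W[1]
```

In this example each distribution in the sequence remains quasi-equiprobable and the supports shrink step by step; swapping worlds 0 and 1 accords with $S$, whereas swapping worlds 0 and 5 does not.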
Observe that if Pr_{i+1} comes from Pr_i by COND, then Pr_{i+1}(φ) = 0 whenever Pr_i(φ) = 0; thus if i ≤ j, W[j] ⊆ W[i].

In order to facilitate our proof showing that the idea broached at the beginning of this section cannot succeed, we first prove two lemmas:

LEMMA 3.1. Let W be a finite set of worlds and let Pr be a quasi-equiprobable distribution on W. Then a distribution Pr′ on W comes from Pr by COND iff (i) Pr′ is a quasi-equiprobable distribution on W, and (ii) for all w ∈ W, if Pr({w}) = 0, then Pr′({w}) = 0.

PROOF. (⇒) Assume that Pr is a quasi-equiprobable distribution on W and that Pr′ comes from Pr by COND on φ, for some proposition φ ∈ ℘(W). Then (ii) is obvious, because we have already noted that COND preserves probability 0. For (i): Pr′({w}) = Pr({w} | φ) = Pr({w} ∩ φ)/Pr(φ). If w ∉ φ, then {w} ∩ φ = ∅, so Pr′({w}) = 0. If w ∈ φ, then {w} ∩ φ = {w}, so Pr′({w}) = Pr({w})/Pr(φ). Hence if Pr′({w}) > 0 and Pr′({w*}) > 0, then w, w* ∈ φ, Pr({w}) > 0 and Pr({w*}) > 0, so Pr({w}) = Pr({w*}) because Pr is quasi-equiprobable. Consequently, Pr′({w}) = Pr({w})/Pr(φ) = Pr({w*})/Pr(φ) = Pr′({w*}).
(⇐) Assume (i) and (ii), and let W′ = {w ∈ W | Pr′({w}) > 0}. We show that Pr′ comes from Pr by COND on W′. Let Pr* come from Pr by COND on W′. Then for all w ∈ W, if Pr*({w}) > 0, then w ∈ W′ and hence, by the definition of W′, Pr′({w}) > 0. Conversely, if Pr′({w}) > 0 then w ∈ W′. Thus, by (ii), Pr({w}) > 0, so that, by the nature of COND, Pr*({w}) > 0. Moreover, by (⇒) Pr* is a quasi-equiprobable distribution on W. Since by (i) the same holds for Pr′, Pr* and Pr′ are both quasi-equiprobable distributions on W that give a nonzero probability to exactly the same worlds. Hence Pr* = Pr′. ∎

LEMMA 3.2. Let W be finite and S = ⟨Pr_0, ...
, Pr_n⟩ be a sequence of probability distributions on W such that Pr_i comes from Pr_{i−1} by COND, for all i: 1 ≤ i ≤ n, with Pr_0 quasi-equiprobable, and π be a permutation on W. Let f be defined by f(φ) = {π(w) | w ∈ φ}. Then f is an automorphism of ⟨W, S⟩ iff π accords with S.

PROOF. (⇒) If w ∈ W, 0 ≤ i ≤ n and f is an automorphism of ⟨W, S⟩, then Pr_i({w}) = Pr_i(f({w})) = Pr_i({π(w)}), so w ∈ W[i] iff π(w) ∈ W[i].
(⇐) Suppose π accords with S. To see that f is an automorphism of ⟨W, Pr_i⟩ for all i: 0 ≤ i ≤ n, first observe that because π is a permutation on W, f automatically satisfies the first two clauses of Definition 2.1 for all i. In order to see that f satisfies the third as well for all i, notice that it follows from the assumptions about W and S together with Lemma 3.1 that Pr_i is quasi-equiprobable for all i: 0 ≤ i ≤ n. Therefore, because π accords with S, for all w ∈ W and all i: 0 ≤ i ≤ n, Pr_i({w}) = Pr_i({π(w)}). Then by Finite Additivity and the definition of f we have, for any φ ∈ ℘(W) and all i: 0 ≤ i ≤ n,

Pr_i(φ) = Σ_{w ∈ φ} Pr_i({w}) = Σ_{w ∈ φ} Pr_i({π(w)}) = Σ_{w′ ∈ f(φ)} Pr_i({w′}) = Pr_i(f(φ)). ∎

We are now set to prove the main proposition:

PROPOSITION 3.3. Let W be a finite set of worlds and S = ⟨Pr_0, ..., Pr_n⟩ a sequence of probability distributions on W such that Pr_0 is quasi-equiprobable and Pr_i comes from Pr_{i−1} by COND, for all i: 1 ≤ i ≤ n. Let P be structural, Q aggregative and P sufficient for Q. Then if some proposition φ such that Pr_n(φ) < 1 has P, some proposition ψ such that Pr_n(ψ) = 0 has Q.

PROOF. Consider any φ ∈ ℘(W) such that Pr_n(φ) < 1 and φ has P. Then for some world w* ∈ W[n], w* ∉ φ. It follows that for each w_i ∈ W[n], there is a permutation π_i on W such that π_i(w_i) = w* and π_i(w*) = w_i and π_i(w) = w for all other w ∈ W.
Again let f_i(ψ) = {π_i(w) | w ∈ ψ} for all propositions ψ. As π_i merely swaps w_i and w*, where both w_i ∈ W[n] and w* ∈ W[n], and W[n] ⊆ W[j] for all j ≤ n, it is clear that π_i accords with S. Thus, by Lemma 3.2, f_i is an automorphism of ⟨W, S⟩. Let W[n] = {w_1, ..., w_m}. Then, as P is structural, f_i(φ) has P for all i: 1 ≤ i ≤ m. And since P is sufficient for Q, and Q is aggregative, f_1(φ) ∧ ··· ∧ f_m(φ) has Q. But since w* ∉ φ, we have w_i ∉ f_i(φ) for each i, so (f_1(φ) ∧ ··· ∧ f_m(φ)) ∩ W[n] = ∅ and thus Pr_n(f_1(φ) ∧ ··· ∧ f_m(φ)) = 0. ∎

As an immediate and obvious consequence we state, without proof, the following corollary:

COROLLARY 3.4. Same assumptions about W as in Proposition 3.3. Let P be both structural and aggregative. Then if some proposition φ such that Pr_n(φ) < 1 has P, some proposition ψ such that Pr_n(ψ) = 0 has P.

But of course the most relevant consequence of Proposition 3.3 is that saying that, unless a defeater D applies to it, a proposition is rationally acceptable if it is highly probable as a result of n (specific) learning events by means of COND, for any n ∈ ℕ, will not help to avoid trivialization as long as the defeater is to be defined in structural terms and rational acceptability is supposed to be an aggregative property. For while discussions of the lottery paradox have focused on the (unwanted) implication that contradictions can be rationally acceptable, that a proposition to which we assign probability 0 is rationally acceptable is an implication we will want just as much to exclude.

The foregoing may be as interesting for what it suggests as for what it shows. For it suggests looking at rules for updating probabilities other than COND. Here the most obvious alternative is Jeffrey conditionalization. COND applies only in cases in which some proposition’s probability is raised to 1. But Jeffrey argued that there may be learning events in which we become certain of no proposition, even though we do learn something in them. To use one of Jeffrey’s examples (Jeffrey [1983], pp.
165–6), a glimpse of a cloth by candlelight may raise our probability that the cloth is green without raising it to 1. In order to be able to represent formally the effects of such events on our probability assignments, Jeffrey proposed the following generalization of COND:

Jeffrey Conditionalization (JCOND) Let {ψ_i} be a countable collection of propositions which partition logical space and which all have some positive probability for a given agent. Further let Pr_old and Pr_new be the agent’s pre-experience and post-experience probability function, respectively. Then the change from the former to the latter accords with Jeffrey conditionalization iff for all propositions φ

Pr_new(φ) = Σ_i Pr_new(ψ_i) Pr_old(φ | ψ_i).

The ψ_i are to be thought of as being directly affected by the agent’s experience; in the cloth example they are plausibly thought of as propositions about the cloth’s color. Note that if one of the ψ_i gets probability 1, then JCOND reduces to COND. Call a case of JCOND essential if it does not reduce to COND.

To see why this is relevant to the above result, notice that in the proof of Proposition 3.3 it was crucial that all worlds that after the n supposed learning events had some positive probability had the same positive probability. If changes of credences are by COND, then, as explained in the proof of Lemma 3.1, that is preserved, at least on our probability model. But if changes of credences are or may be by JCOND, then, on a finite probability space, quasi-equiprobability is no longer preserved.
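The contrast between COND and JCOND can be seen on a toy version of the cloth example. The sketch below is illustrative only: the four-world model, the choice of which worlds count as "green" worlds, the new weights 7/10 and 3/10, and the function names are all assumptions made for the sake of the example.

```python
from fractions import Fraction

def jeffrey(pr_old, partition, new_weights):
    """JCOND on singletons: Pr_new({w}) = Pr_new(cell) * Pr_old({w} | cell),
    where cell is the member of the partition containing w.

    pr_old: dict world -> Fraction; partition: disjoint sets covering all
    worlds, each with positive old probability; new_weights: the
    post-experience probabilities of the cells, summing to 1.
    """
    pr_new = {}
    for cell, weight in zip(partition, new_weights):
        old_mass = sum(pr_old[w] for w in cell)
        for w in cell:
            pr_new[w] = weight * pr_old[w] / old_mass
    return pr_new

def is_quasi_equiprobable(pr):
    """True iff all worlds with positive probability have the same probability."""
    return len({p for p in pr.values() if p > 0}) <= 1

# Hypothetical cloth model: worlds 0 and 1 are "green" worlds, 2 and 3 are not.
pr_old = {w: Fraction(1, 4) for w in range(4)}
green, not_green = {0, 1}, {2, 3}

# Essential case: the glimpse raises Pr(green) to 7/10, not to 1.
pr_glimpse = jeffrey(pr_old, [green, not_green],
                     [Fraction(7, 10), Fraction(3, 10)])

# Limiting case: Pr(green) is raised to 1, which coincides with COND on green.
pr_certain = jeffrey(pr_old, [green, not_green], [Fraction(1), Fraction(0)])
```

Here `pr_glimpse` assigns 7/20 to each green world and 3/20 to each of the others, so it is not quasi-equiprobable, whereas `pr_certain` assigns 1/2 to each green world and 0 elsewhere, exactly as conditionalizing on `green` would.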
In fact, it follows from Lemma 3.1 that on a finite probability space no essential application of JCOND to a quasi-equiprobable distribution results in a quasi-equiprobable distribution: condition (ii) of that lemma always holds in cases of JCOND, so if the new distribution is quasi-equiprobable it comes by COND from the old distribution and therefore not essentially by JCOND.

EXAMPLE 3.1 Let W be finite and consider any case of change of credences where for all w ∈ W, Pr_old({w}) > 0 and Pr_new({w}) > 0 and Pr_old ≠ Pr_new. Then Pr_new does not come from Pr_old by COND, as the set conditionalized on would have to be W itself, in which case Pr_old = Pr_new. But Pr_new does come from Pr_old by JCOND as we can take our partition as being that into all singletons of worlds. These facts establish a large range of examples. See further Williamson ([2000], p. 216 n). ∎

One thing this implies is that Proposition 3.3 does not extend to the supposition that at least some elements of the sequence of probability distributions Pr_0, ..., Pr_n are derived from their predecessors by means of JCOND. And unless we have quasi-equiprobability of all elements of S = ⟨Pr_0, ..., Pr_n⟩, we have no guarantee that there exists any automorphism of ⟨W, S⟩ other than the ‘trivial’ one, that is, the one which maps each proposition onto itself. Consequently, if changes of credences may be by JCOND, then stipulating that a proposition is rationally acceptable if it is highly probable as a result of certain changes of credences, provided a defeater D does not apply to it, might help to avoid trivialization, even if the defeater is defined strictly in structural terms. As, however, for reasons given in Williamson ([2000], pp.
216–8), JCOND is of doubtful epistemic significance at best, we doubt that the foregoing can be welcomed as offering an escape.

Now to the second variation. Earlier we said that abandoning CP is nowadays generally found to be too drastic a response to the lottery paradox. Some might think, however, that replacing the principle by the following just slightly stricter principle is both tolerable and sufficient to rid us of the paradox:

Restricted Conjunction Principle (RCP) If each of the propositions φ and ψ is rationally acceptable and φ ∧ ψ ≠ ⊥, then φ ∧ ψ is rationally acceptable.

From RCP we can easily derive the generalization that if each of finitely many propositions is rationally acceptable, and their conjunction is consistent, then it is rationally acceptable too.

Supplanting CP by RCP would not be an unprecedented move. It is well known that the so-called Principle of Indifference, according to which (roughly put) mutually exclusive propositions ought to be assigned equal initial probability, at least absent any reason to the contrary, is inconsistent. But because of its intuitive appeal, and its many successful applications,¹² it has seemed the best strategy to some not to abandon the principle altogether but to try and salvage as much as possible of it by searching for a restricted consistent version.¹³,¹⁴

Yet switching to RCP offers no solace. For call P a C-aggregative property of propositions if whenever φ has P and ψ has P, and in addition φ ∧ ψ is consistent, then φ ∧ ψ has P. We then still have

PROPOSITION 3.5. Same assumptions as in Proposition 2.1, except that Q is now assumed to be only C-aggregative. Then if some proposition φ such that Pr(φ) < 1 has P, some proposition ψ such that Pr(ψ) ≤ 1/|W| has Q.

PROOF. Let everything be as in the proof of Proposition 2.1, but now consider the sequence of propositions ψ_1, ..., ψ_n, where ψ_1 = f_1(φ) and ψ_{i+1} = ψ_i ∧ f_{i+1}(φ) and W = {w_1, ..., w_n}. As before, ψ_n = ⊥. Consider the least k such that ψ_k = ⊥.
If k = 1, then φ = f_1(φ) = ⊥ has P and therefore Q and Pr(φ) = 0, so we are done. Suppose that k > 1. Thus ψ_j ≠ ⊥ for j < k. For each i, f_i(φ) has P (because P is structural) and therefore Q, so for each j < k, ψ_j has Q by C-aggregativity. Observe that each f_{j+1}(φ) lacks at most one member that f_j(φ) may have, namely w_{j+1}. Consequently, |ψ_j| ≤ |ψ_{j+1}| + 1. Therefore |ψ_{k−1}| = 1. As Pr is equiprobable, Pr(ψ_{k−1}) = 1/|W|. ∎

12 See Jaynes ([1973]) and Uffink ([1995]).
13 See e.g. Keynes ([1921]) and Castell ([1998]).
14 Douven and Uffink also follow the strategy of restricting CP (though not quite in the way suggested here) in their ([2003]) solution to Makinson’s ([1965]) preface paradox.

Again we state, without proof, a simpler corollary:

COROLLARY 3.6. Same assumptions as Corollary 2.2, except that P is now assumed to be structural and C-aggregative. Then if some proposition φ such that Pr(φ) < 1 has P, some proposition ψ such that Pr(ψ) ≤ 1/|W| has P.

Assuming RCP is tantamount to assuming that rational acceptability is a C-aggregative property. Given this assumption, it is thus a further simple corollary of Proposition 3.5 that if some sufficient condition for rational acceptability is structural, a proposition can be rationally acceptable even if one is as good as certain that it is false. After all, the cardinality of W can be assumed as large as we want; accordingly, 1/|W| can be assumed to be as close to 0 as we want. Again this is a possibility one would just as much want to exclude as the possibility that the inconsistent proposition is rationally acceptable.

Finally note that combining the variations depicted above will not help either:

PROPOSITION 3.7. Let W be finite and let ⟨Pr_0, ..., Pr_n⟩ be as in Proposition 3.3.
Furthermore, let P be structural, Q C-aggregative, and P sufficient for Q. If some proposition φ such that Pr_n(φ) < 1 has P, then some proposition ψ such that Pr_n(ψ) ≤ 1/|W[n]| has Q.

COROLLARY 3.8. Same assumptions about W as in Proposition 3.3. Let P be both structural and C-aggregative. If some proposition φ such that Pr_n(φ) < 1 has P, then some proposition ψ such that Pr_n(ψ) ≤ 1/|W[n]| has P.

The proofs of these results can be obtained basically just by combining the proofs of Propositions 3.3 and 3.5 (and, for the corollary, making the requisite substitutions); for that reason they are omitted.

4 Adding modalities

We have seen that any instance of (1) that is both structural and nontrivial, or indeed any other sufficient condition for rational acceptability that meets those criteria, leads straight to the conclusion that the inconsistent proposition is rationally acceptable. And what prima facie may have seemed promising escape routes still within the confines of a formal approach to the lottery paradox did not work. While we cannot claim to have considered all possible escape routes of a formal variety, the above does seem to warrant the conclusion that the prospects for a purely formal solution to the paradox are dim. It may therefore be instructive to see, if only in rough outline, how an appeal to informal notions might be of help. Especially when such notions can be given a possible worlds semantics or natural axiomatization, a solution to the lottery paradox formulated in terms of them might still be somewhat to the liking of those who had hoped for a formal solution.

One obvious way to try to increase the expressive power of our model is by adding modal operators.
So far we cannot represent modalities in our model, for we are not assuming an accessibility relation on the elements of W. But of course we might extend the model by defining a relation R ⊆ W × W; we call w accessible from w* iff wRw*.

That an operator can be axiomatized or given a possible worlds semantics does not automatically make that operator purely formal. For instance, the accessibility relation in its semantics may itself be defined in informal terms, such as similarity or knowledge. A natural criterion for an accessibility relation to be purely formal is that it should be invariant under all permutations of W. That is equivalent to requiring it to be homogeneous in the following sense:

DEFINITION 4.1 Let W be a set of worlds and R ⊆ W × W. Then the frame ⟨W, R⟩ is homogeneous iff
1. for all w, w′ ∈ W, wRw iff w′Rw′;
2. for all w, w′, x, x′ ∈ W such that w ≠ x and w′ ≠ x′, wRx iff w′Rx′. ∎

As can be readily verified, the class of homogeneous frames is exhausted by those whose accessibility relation is defined by any of the following (where W is the set of worlds of the given frame):
(a) R = ∅;
(b) R = {⟨w, w⟩ | w ∈ W};
(c) R = {⟨w, w′⟩ | w, w′ ∈ W};
(d) R = {⟨w, w′⟩ | w, w′ ∈ W ∧ w ≠ w′}.

The accessibility relation of a homogeneous frame is structural in the above sense, because it is preserved under all permutations. Thus adding modal notions defined in terms of a homogeneous frame will not affect our previous results.

We pause to identify the logic of the class of homogeneous frames.
As is well known, to obtain the logic of the class of frames of type (a) one adds to the weakest normal modal logic K the Ver schema □φ; for the class of frames of type (b) one adds to K the Triv schema □φ ↔ φ; for the class of frames of type (c) one adds to K the schemas characteristic of the modal logic S5, in particular the T schema □φ → φ, the B schema φ → □◇φ and the 4 schema □φ → □□φ. For the class of frames of type (d), one adds the B schema and the weakening 4′ of the 4 schema: (φ ∧ □φ) → □□φ (Segerberg [1980]). It is easy to check that all instances of the B and 4′ schemas are also derivable in the other three logics. Consequently, the logic of the class of all homogeneous frames is the same as the logic of the class of all frames of type (d) and can therefore be axiomatized in the same way.

By contrast, the accessibility relation of a non-homogeneous frame is not preserved under all permutations of W, and therefore is not preserved under all automorphisms of ⟨W, Pr⟩ when W is finite and Pr equiprobable. Consequently, our previous results do not generalize to the non-homogeneous case, because automorphisms as defined above need not preserve modal properties of propositions.

EXAMPLE 4.1 Let f be the automorphism that interchanges {w_1} and {w_2} in Figure 1. In that model, φ has the property of implying a possibility because φ → ◇φ is true at all worlds, whereas f(φ) lacks that property, because f(φ) → ◇ψ is false at the ‘blind’ world w_2 for every proposition ψ. ∎

In sum, on the modal approach we might be able to define a defeater in terms that are to some extent formally constrained, although they will not be purely formal. Epistemic and doxastic modalities typically correspond to accessibility relations that generate non-homogeneous frames.
For example, knowledge violates the B and 4 schemas, and therefore the 4′ schema too, which is equivalent to 4 in the presence of the T schema, which knowledge trivially satisfies (Williamson [2000]). However, on the approach of Williamson ([2000]), known propositions have probability 1 on the evidence, and only propositions with probability 1 on the evidence are fully rationally acceptable, so no definition of a defeater is forthcoming of the kind for which many have hoped. It is far from obvious how to employ non-homogeneous epistemic or doxastic modalities in order to define a plausible and illuminating sufficient condition for rational acceptability short of probability 1. However, we will not attempt here to argue that it cannot be done. The foregoing simply indicates one direction in which some may seek a solution to the lottery paradox, with no guarantee that there is one.

Figure 1. Models with a non-homogeneous underlying frame

5 Anticipated objections

In closing we present, and try to defuse, two objections that might be raised against the model we used in obtaining our negative results.

First, it may be pointed out that the proofs of Propositions 2.1, 3.3, 3.5 and 3.7 and the associated lemmas and corollaries heavily depend on the fact that our model is a finite probability space. It must be admitted that there is no straightforward generalization to infinite probability spaces. A crucial fact for the proofs of those propositions is an obvious consequence of Finite Additivity: when all worlds in a subset W* of the finite set W have equal probability, all subsets of W* of equal cardinality have equal probability (for Proposition 2.1, let W* be W itself).
But all subsets of equal cardinality of an infinite set W* have equal probability only in the trivial case in which Pr(W*) = 0. For any infinite set W* can be partitioned into two disjoint subsets X and Y each of equal cardinality to W* itself; thus Pr(W*) = Pr(X) + Pr(Y). If equal cardinality implies equal probability for subsets of W*, then Pr(X) = Pr(W*) = Pr(Y), so Pr(W*) = Pr(W*) + Pr(W*), so Pr(W*) = 0. Thus we cannot make progress simply by considering probability distributions that assign equal probability to all worlds in an infinite set of positive probability, either by assigning probability 0 to all worlds (which requires the abandonment of Countable Additivity if W* is countable) or by using non-standard analysis, for such distributions still do not yield what we want for our proofs at the level of subsets of W*.

Of course, we could consider switching to ‘non-normalizing’ probabilities (see Renyi [1970]). But that option is controversial. A better response, in our view, is to give the model we employed a kind of contextualist twist by noting that our results do not require the finitely many equiprobable worlds to be maximally specific. It is enough to assume that they are ‘specific enough’ for whatever purposes may be at hand: that is, to be more precise, a set of mutually exclusive and jointly exhaustive states that determine answers to all the questions that happen to be relevant to a particular application. In addition to this, it should be noted that the case of finitely many equiprobable worlds is the simplest non-trivial case, and a good treatment of the lottery paradox should at least work for the simple cases, especially as the phenomena of rational acceptability in which we are interested do not seem to arise only for infinite probability spaces.
But we can even do better than this. For we can obtain something hardly less destructive than our previous results if the finiteness assumption is dropped. First, one more definition:

DEFINITION 5.1 Let Pr be a probability distribution on a set W of worlds and ε ∈ ℝ such that 0 ≤ ε ≤ 1. Pr is ε-equiprobable iff for some finite W* ⊆ W:
1. Pr(W*) ≥ 1 − ε;
2. for all w, w′ ∈ W*, Pr({w}) = Pr({w′}). ∎

EXAMPLE 5.1 Let w_1, w_2, w_3, ..., be an enumeration of the elements of some infinite set W of worlds. Then for any n ∈ ℕ, the following defines a 1/n-equiprobable distribution on W:

Pr({w_i}) = 1/n if 1 ≤ i ≤ n − 1;
Pr({w_i}) = (1/n)(1/2)^(1+i−n) if i ≥ n. ∎

Now consider

PROPOSITION 5.1. Let Pr be an ε-equiprobable distribution on a set W, P be structural, Q aggregative and P sufficient for Q. If some proposition φ such that Pr(φ) < 1 − ε has P, then some proposition ψ such that Pr(ψ) ≤ ε has Q.

PROOF. Let W* satisfy conditions 1 and 2 of Definition 5.1. Suppose that φ has P and Pr(φ) < 1 − ε. Thus Pr(φ) < Pr(W*), so W* ⊈ φ, so for some w* ∈ W*, w* ∉ φ. Now suppose that w_i ∈ W*. Let π_i be the permutation of W such that π_i(w_i) = w*, π_i(w*) = w_i and π_i(w) = w for all other w ∈ W. Define f_i from ℘(W) to ℘(W) in the usual way: f_i(ψ) = {π_i(w) | w ∈ ψ} for all ψ ∈ ℘(W). As usual, f_i is an automorphism. We need only check the third condition: f_i(φ) = f_i((φ ∧ W*) ∨ (φ ∧ ¬W*)) = [(f_i(φ) ∧ f_i(W*)) ∨ f_i(φ ∧ ¬W*)] = [(f_i(φ) ∧ W*) ∨ (φ ∧ ¬W*)], so Pr(f_i(φ)) = Pr(f_i(φ) ∧ W*) + Pr(φ ∧ ¬W*); by conditions 1 and 2, if ψ ⊆ W*, then Pr(ψ) = Pr(W*)|ψ|/|W*|, so Pr(f_i(φ) ∧ W*) = Pr(W*)|f_i(φ) ∧ W*|/|W*| = Pr(W*)|φ ∧ W*|/|W*| = Pr(φ ∧ W*); thus Pr(f_i(φ)) = Pr(φ ∧ W*) + Pr(φ ∧ ¬W*) = Pr(φ). As P is structural, each f_i(φ) has P. As P is sufficient for Q, each f_i(φ) has Q. As Q is aggregative, f_1(φ) ∧ ··· ∧ f_n(φ) has Q.
For w_i ∈ W*, w_i ∉ f_i(φ), so f_1(φ) ∧ ··· ∧ f_n(φ) ⊆ ¬W*, where W* = {w_1, ..., w_n}. So Pr(f_1(φ) ∧ ··· ∧ f_n(φ)) ≤ Pr(¬W*) = 1 − Pr(W*) ≤ ε by the first condition of Definition 5.1. ∎

COROLLARY 5.2. Let Pr be an ε-equiprobable distribution on a set W and let P be both structural and aggregative. Then if some proposition φ such that Pr(φ) < 1 − ε has P, some proposition ψ such that Pr(ψ) ≤ ε has P.

PROOF. From the proof of Proposition 5.1 by taking P and Q to be identical. ∎

Note that we do not require all worlds to have positive probability, so the results apply both to countable and uncountable W. Informally, the point of ε-equiprobability is that the smaller ε is, the more likely we are to be in the subset W* for whose subsets equal cardinality entails equal probability. As already noted, no infinite set W* of positive probability has this property. In a sense, therefore, ε-equiprobability as ε tends to 0 is the best possible approximation for our purposes to strict equiprobability over infinite domains. To illustrate, consider the 1/n-equiprobable distribution Pr[n] as defined in Example 5.1, and let W*[n] = {w_1, w_2, ..., w_{n−1}}, the set of worlds to which Pr[n] assigns probability exactly 1/n. For m < n, Pr[n] is closer than Pr[m] to equiprobability in at least two ways. First, the probability of being in the set of equiprobability is higher, because Pr[m](W*[m]) = (m−1)/m < (n−1)/n = Pr[n](W*[n]). Second, the set of equiprobability is larger, because W*[m] ⊂ W*[n]. Now Proposition 5.1 yields a sort of trivialization result ‘in the limit’. For by taking n larger and larger or, in the general case, ε smaller and smaller, we have better and better approximations to Proposition 2.1 (with probability 0 in place of inconsistency). It will be obvious how to get increasingly good approximations to Propositions 3.3, 3.5 and 3.7 for infinite models too.
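The construction used in the proof of Proposition 5.1 can be traced step by step on a small model. The following Python sketch is illustrative only: the six-world distribution, the proposition `phi` and the helper names are hypothetical, and a finite model is used purely so that the computation terminates (the proposition itself does not require W to be finite).

```python
from fractions import Fraction
from functools import reduce

def prob(pr, prop):
    """Probability of a proposition, represented as a set of worlds."""
    return sum((pr[w] for w in prop), Fraction(0))

def swap_image(prop, a, b):
    """Image of prop under the permutation that swaps worlds a and b."""
    swap = {a: b, b: a}
    return {swap.get(w, w) for w in prop}

# Hypothetical model: worlds 0-3 form the equiprobable set W*; 4 and 5 do not.
pr = {0: Fraction(1, 5), 1: Fraction(1, 5), 2: Fraction(1, 5),
      3: Fraction(1, 5), 4: Fraction(1, 10), 5: Fraction(1, 10)}
core = {0, 1, 2, 3}                 # plays the role of W*
eps = 1 - prob(pr, core)            # smallest eps for which condition 1 holds

phi = {0, 1, 2, 4}                  # Pr(phi) = 7/10 < 1 - eps
w_star = min(core - phi)            # a world of W* lying outside phi

# One automorphic image of phi for each world w_i in W*.
images = [swap_image(phi, w_i, w_star) for w_i in sorted(core)]

# The conjunction f_1(phi) ∧ ... ∧ f_n(phi), i.e. the intersection of the images.
conjunction = reduce(set.intersection, images)
```

Each image has the same probability as `phi`, so a structural P that holds of `phi` holds of every image; the conjunction contains no world of `core`, and its probability is therefore bounded by `eps`, as the proposition states.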
The second possible objection we want to consider is that it is a rather serious drawback of our model that we are working with a coarse-grained conception of propositions according to which propositions are individuated solely by their truth conditions. For, it might be said, rational acceptability is, just like belief for instance, plausibly thought of as a hyperintensional notion; that is to say, it seems to matter to our verdicts regarding the rational acceptability of a proposition how that proposition is presented to us (and so not just what worlds it is true in).

A first thing to note in this connection is that insofar as the objection points to a limitation of our model, it is one that is inherited from the very analytic tool that is central to all probabilistic approaches to rational acceptability (whether or not they are fully formal), namely, probability theory. For although a pre-theoretic conception of probability also seems to be a hyperintensional notion, it must by way of idealizing assumption be considered an intensional (but not hyperintensional) one. For example, it would from an intuitive viewpoint seem entirely reasonable, at least presently, to believe Goldbach’s conjecture to some non-extreme degree. Nevertheless, the conjecture is either necessarily true or necessarily false and so expresses the same proposition as either ‘2 + 2 = 4’ or ‘2 + 2 = 5’. As it is a theorem of probability theory that Pr(φ) = Pr(ψ) whenever φ and ψ are logically equivalent,¹⁵ it follows that anyone believing the conjecture to a degree different from the one to which she believes that 2 + 2 = 4 (if the conjecture is true) or the one to which she believes that 2 + 2 = 5 (if the conjecture is false) counts as being incoherent.
More generally, according to probability theory it is immaterial how propositions are presented to us. So if cutting propositions coarsely is a problem here, it is simply the price to be paid for using probability theory in the analysis of rational acceptability.¹⁶

Secondly, and equally importantly, it appears far from implausible to assume that in many ordinary situations people know the identities and differences of propositions under all contextually relevant modes of presentation. So at least in such situations there seems to be no impediment to cutting propositions coarsely. And, to make a point similar to one made a few paragraphs back, any adequate solution to the lottery paradox should work for those situations as well.

15 Logical equivalence is in this setting standardly taken to comprise mathematical equivalence; see e.g. Howson and Urbach ([1993], p. 20).
16 Indeed, it seems to be a price that comes with probabilistic analyses of notions from mainstream epistemology generally; see for this point in connection with probabilistic analyses of the notion of coherence Douven and Meijs ([forthcoming]).

Acknowledgments

An earlier version of this paper was presented at the Popper Seminar at the LSE. Thanks to the audience for helpful questions and remarks. We are also grateful to Leon Horsten and to two anonymous referees for this journal for their comments. Further we would like to thank Albert Visser for a useful discussion about the subject matter of this paper.

Igor Douven
Institute of Philosophy
University of Leuven
Leuven, Belgium
igor.douven@hiw.kuleuven.be

and

Timothy Williamson
Oxford University
Oxford, OX1 2JD, UK
timothy.williamson@philosophy.oxford.ac.uk

References

Castell, P.
[1998]: ‘A Consistent Restriction of the Principle of Indifference’, British Journal for the Philosophy of Science, 49, pp. 387–95.
Douven, I. [2002]: ‘A New Solution to the Paradoxes of Rational Acceptability’, British Journal for the Philosophy of Science, 53, pp. 391–410.
Douven, I. and Meijs, W. [forthcoming]: ‘Measuring Coherence’, Synthese, in press.
Douven, I. and Uffink, J. [2003]: ‘The Preface Paradox Revisited’, Erkenntnis, 59, pp. 389–420.
Goodman, N. [1954]: Fact, Fiction, Forecast, London: The Athlone Press.
Howson, C. and Urbach, P. [1993]: Scientific Reasoning, 2nd edn, La Salle, IL: Open Court.
Jaynes, E. T. [1973]: ‘The Well-Posed Problem’, Foundations of Physics, 4, pp. 477–92.
Jeffrey, R. [1983]: The Logic of Decision, 2nd edn, Chicago: University of Chicago Press.
Keynes, J. M. [1921]: A Treatise on Probability, London: Macmillan.
Korb, K. [1992]: ‘The Collapse of Collective Defeat: Lessons from the Lottery Paradox’, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. I, pp. 230–6.
Kyburg, H. [1961]: Probability and the Logic of Rational Belief, Middletown, CT: Wesleyan University Press.
Makinson, D. [1965]: ‘The Paradox of the Preface’, Analysis, 25, pp. 205–7.
Olin, D. [2003]: Paradox, Chesham: Acumen.
Pollock, J. [1995]: Cognitive Carpentry, Cambridge, MA: MIT Press.
Renyi, A. [1970]: Foundations of Probability, Boca Raton, FL: Holden-Day, Inc.
Ryan, S. [1996]: ‘The Epistemic Virtues of Consistency’, Synthese, 109, pp. 121–41.
Segerberg, K. [1980]: ‘A Note on the Logic of Elsewhere’, Theoria, 46, pp. 183–7.
Tarski, A. [1986]: ‘What are Logical Notions?’, History and Philosophy of Logic, 7, pp. 143–54.
Uffink, J. [1995]: ‘Can the Maximum Entropy Principle be Explained as a Consistency Requirement?’, Studies in History and Philosophy of Modern Physics, 26, pp. 223–61.
van Benthem, J. [2002]: ‘Logical Constants: The Variable Fortunes of an Elusive Notion’, in W. Sieg, R. Sommer and C.
Talcott (eds), 2002, Reflections on the Foundations of Mathematics (ASL Lecture Notes in Logic 15), Wellesley, MA: AK Peters, Ltd., pp. 426–46.
Williamson, T. [2000]: Knowledge and Its Limits, Oxford: Oxford University Press.