What Is the Point of Confirmation?*

Franz Huber†‡

* First received: February 2004, last revised version received: August 2004.
† Franz Huber, Philosophy, Probability, and Modeling research group, Center for Junior Research Fellows of the University of Konstanz, P.O. Box M 682, D-78457 Konstanz, Germany. E-mail: franz.huber@uni-konstanz.de
‡ My research was supported by the Alexander von Humboldt Foundation, the Federal Ministry of Education and Research, and the Program for the Investment in the Future (ZIP) of the German Government through a Sofja Kovalevskaja Award.

Abstract

Philosophically, one of the most important questions in the enterprise termed confirmation theory is this: Why should one stick to well confirmed theories rather than to any other theories? This paper discusses the answers to this question one gets from absolute and incremental Bayesian confirmation theory. According to absolute confirmation, one should accept “absolutely well confirmed” theories, because absolute confirmation takes one to true theories. An examination of two popular measures of incremental confirmation suggests the view that one should stick to incrementally well confirmed theories, because incremental confirmation takes one to (the most) informative (among all) true theories. However, incremental confirmation does not further this goal in general. I close by presenting a necessary and sufficient condition for revealing the confirmational structure in almost every world when presented separating data.

1. Introduction

Philosophically, one of the most important questions in the enterprise traditionally termed confirmation theory is this: Why should one stick to well confirmed theories rather than to any other theories? In other and more mundane words: What is the point of confirmation? In what follows I will examine whether and how absolute and incremental Bayesian confirmation theory answer this question.
According to absolute Bayesian confirmation theory, an agent’s degree of absolute confirmation of some hypothesis or theory H by a piece of evidence E relative to a body of background information B equals the probability of H given E and B, $\Pr(H \mid E \wedge B)$, where $\Pr : L \to \mathbb{R}$ is the agent’s actual degree of belief function on some language L (see section 2). According to incremental Bayesian confirmation theory, an agent’s degree of incremental confirmation of H by E relative to B is measured by a relevance measure $r_{\Pr}$ based on the agent’s actual degree of belief function Pr; i.e. a possibly partial function $r_{\Pr} : L \times L \times L \to \mathbb{R}$ such that for all H, E, B ∈ L with $\Pr(E \wedge B) > 0$:

\[ r_{\Pr}(H, E, B) \gtreqless 0 \iff \Pr(H \mid E \wedge B) \gtreqless \Pr(H \mid B). \]

2. The Point of Absolute Confirmation

The traditional answer to our question is something like this: Science aims at true theories, and one should accept well confirmed theories, because confirmation takes one to true theories. Indeed, if arriving at true theories is our (only) goal, then there is a point to absolute confirmation. In the long run, absolute confirmation almost surely takes one to true theories. This is the content of the following theorem (Gaifman and Snir 1982, 507):

Theorem 1 (Gaifman and Snir) Let $S = \{A_i \in L : i = 0, 1, \ldots\}$ separate $\mathrm{Mod}_L$, let $A_i^{\omega}$ be $A_i$ if $\omega \models A_i$ and $\neg A_i$ otherwise, and let $[B](\omega)$ be 1 if $\omega \models B$ and 0 otherwise. Then for every B ∈ L, Pr(B | ∧ 0≤i […] 0.

Pr is regular iff the converse of 2. holds as well,

6. $\Pr(A) = 1 \Rightarrow\ \models A$.

A set of sentences S ⊆ L separates a set of models $X \subseteq \mathrm{Mod}_L$ just in case for any two distinct $\omega_1, \omega_2 \in X$ there is an A ∈ S such that $\omega_1 \models A$ and $\omega_2 \not\models A$. The set of all atomic empirical sentences separates $\mathrm{Mod}_L$ (Gaifman and Snir 1982, 507).¹

However, absolute confirmation has long been abandoned in favour of incremental confirmation. Is there another goal for incremental confirmation that is different from arriving at true theories? If so, what is this goal?

3.
What Is the Point of Incremental Confirmation?

Two popular measures of incremental confirmation are the distance measure d (Earman 1992) and the Joyce-Christensen measure s (Joyce 1999, Christensen 1999):

\[ d_{\Pr}(H, E, B) = \Pr(H \mid E \wedge B) - \Pr(H \mid B), \]
\[ s_{\Pr}(H, E, B) = \Pr(H \mid E \wedge B) - \Pr(H \mid \neg E \wedge B). \]

What do these measures measure? Reformulating d and s shows that d increases with

• the plausibility of H given E and B, $p = \Pr(H \mid E \wedge B)$, and
• the evidence-neglecting or data-independent semantic informativeness of H relative to B, $i_0 = \Pr(\neg H \mid B)$.

Similarly, s increases with

• the plausibility of H given E and B, $p = \Pr(H \mid E \wedge B)$, and
• the evidence-based or data-dependent semantic informativeness of H relative to E and B, i.e. the amount to which H informs about E relative to B, $i_1 = \Pr(\neg H \mid \neg E \wedge B)$.

This is clearly seen by rewriting d and s as follows:

\[ d_{\Pr}(H, E, B) = \Pr(H \mid E \wedge B) + \Pr(\neg H \mid B) - 1, \]
\[ s_{\Pr}(H, E, B) = \Pr(H \mid E \wedge B) + \Pr(\neg H \mid \neg E \wedge B) - 1. \]

p and $i_0$ as well as p and $i_1$ are conflicting in the sense that p decreases, whereas $i_0$ and $i_1$ increase with the logical strength of the hypothesis to be assessed. So d and s weigh between two conflicting aspects, viz. the plausibility and the informativeness of the hypothesis to be assessed.

In section 4 I will argue in more detail that $i_0$ and $i_1$ measure two different, but equally sensible kinds of informativeness. Section 5 provides another argument for the thesis that (i) d and s do nothing but weigh between the two conflicting goals of plausibility and informativeness; (ii) that they are exactly alike in the way they weigh between these two aspects; and (iii) that they differ from each other just in the respect that d is based on data-independent informativeness whereas s is based on informativeness about the data.
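The definitions and rewritings of d and s given in this section can be checked numerically. What follows is a minimal Python sketch under stated assumptions: a hypothetical model with four equally weighted worlds, propositions represented as sets of worlds, and helper names (`pr`, `neg`, `d`, `s`) that are invented here for illustration and are not taken from the literature cited above.

```python
from fractions import Fraction

# Assumption: a hypothetical toy model with four equally weighted worlds.
# Propositions are frozensets of worlds; B defaults to the whole space.
WORLDS = frozenset({"w1", "w2", "w3", "w4"})
WEIGHT = {w: Fraction(1, 4) for w in WORLDS}


def pr(a, given=WORLDS):
    """Pr(a | given); undefined when `given` has probability zero."""
    denom = sum(WEIGHT[w] for w in given)
    if denom == 0:
        raise ValueError("conditioning on a zero-probability proposition")
    return sum(WEIGHT[w] for w in a & given) / denom


def neg(a):
    """Negation as set complement."""
    return WORLDS - a


def d(h, e, b=WORLDS):
    """Distance measure: Pr(H | E & B) - Pr(H | B)."""
    return pr(h, e & b) - pr(h, b)


def s(h, e, b=WORLDS):
    """Joyce-Christensen measure: Pr(H | E & B) - Pr(H | not-E & B)."""
    return pr(h, e & b) - pr(h, neg(e) & b)


H = frozenset({"w1"})
E = frozenset({"w1", "w2"})

p = pr(H, E)             # plausibility of H given E
i0 = pr(neg(H))          # data-independent informativeness
i1 = pr(neg(H), neg(E))  # informativeness about the data

# The rewritings d = p + i0 - 1 and s = p + i1 - 1 hold exactly here.
assert d(H, E) == p + i0 - 1
assert s(H, E) == p + i1 - 1
print(d(H, E), s(H, E))  # 1/4 1/2
```

Exact fractions are used instead of floats so that the two identities can be asserted with equality rather than up to rounding.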
All this suggests the following answer to the question what goal incremental confirmation is supposed to further: Science aims at informative true theories, and one should stick to incrementally well confirmed theories, because incremental confirmation takes one to (the most) informative (among all) true theories. However, as shown in section 6, incremental confirmation does not further this goal in general. I close by giving a necessary and sufficient condition for revealing the confirmational structure in almost every world when presented separating data.

4. Measuring Semantic Information

In a subjective Bayesian framework it is clear that $p = \Pr(H \mid E \wedge B)$ measures the plausibility of H in view of E and B. It is still rather obvious that $i_0 = \Pr(\neg H \mid B)$ measures the data-independent informativeness of H relative to B. $i_0$ was already considered by Carnap and Bar-Hillel (1952), Bar-Hillel and Carnap (1953), Hempel (1960, 1962), and Hintikka and Pietarinen (1966) (for the notion of semantic information cf. Bar-Hillel 1952, 1955). The second measure that was discussed in this connection is

\[ i_2 = -\log_2 \Pr(H \mid B) = \log_2 \frac{1}{\Pr(H \mid B)}. \]

$i_2$ is ordinally equivalent to $i_0$, and so does not differ from $i_0$ in the respects of interest for the present discussion.

It is less obvious that $i_1 = \Pr(\neg H \mid \neg E \wedge B)$ measures how much H informs about the data E relative to background B. Following the above mentioned literature, one would expect something like²

\[ i_3 = \Pr(\neg H \mid E \wedge B), \]
\[ \mathit{cont} = \Pr(E) \cdot \Pr(\neg H \mid E \wedge B), \]
\[ \mathit{inf} = \log_2 \frac{1}{\Pr(H \mid E \wedge B)} = -\log_2 \Pr(H \mid E \wedge B). \]

As is often the case, a picture says more than a thousand words³:

[Figure: diagram with regions labelled B, H, and E]

The background information B determines the set of possibilities in the inquiry, and thus is nothing but a restriction on the set of possible worlds over which inquiry has to succeed (cf. Hendricks 2004). H is the hypothesis whose informativeness about the data E is to be assessed (relative to B).
Suppose you are asked to strengthen H by deleting possibilities verifying it, that is, by shrinking the area representing H. Would you not delete possibilities outside E? After all, given E, those are exactly the possibilities known not to be the actual one, whereas the possibilities inside E are still live options. Indeed, $i_1$ increases when H shrinks to H′ as depicted in the second figure, because it measures how much of ¬E is occupied by ¬H.

[Figure: diagram with regions labelled B, H \ H′, H′, and E]

As a consequence, the information H provides about E is maximal if H logically implies E (in this case H is completely within E, and so ¬H covers all of ¬E). So according to $i_1$, two hypotheses both logically implying all of the data – say, a complete theory about the world, and a theory-like collection of the data – carry the same maximal amount of information about E. In a sense, this is odd, because one would like the complete theory to come out as more informative than the theory-like collection of the data. This is what $i_0$ yields. For $i_0$ it does not matter which possibilities one deletes in strengthening H (provided all possibilities have equal weight on the probability measure Pr). $i_0$ neglects whether they are inside or outside E. The other candidates for measuring semantic information do rather poorly on this count: they require the deletion of the possibilities inside E. (Another reason why $i_3$, cont, and inf seem to be inappropriate in the present context is presented in the next section.)

The background information B plays a role different from that of the evidence E for $i_0$ and $i_1$, but not for $i_3$, cont, or inf. Clearly, there is a difference between data on the one hand and background assumptions on the other; and this difference should show up somewhere.
Apart from the above mentioned point that B determines the set of possibilities over which inquiry has to succeed, whereas E is gathered in order to indicate which of these possibilities is the actual one, there is the following difference: Hypotheses are supposed to inform about the world, and hence also about the data, but they are usually not supposed to inform about the background assumptions. (If one holds there should be no difference between E and B as far as measuring information is concerned, then one can nevertheless adopt the above measures by substituting E′ = E ∧ B and B′ = ⊤ for E and B, respectively.)

In order to avoid having to take sides between $i_0$ and $i_1$, let us call a possibly partial function $i = f_{i_0, i_1} : L \times L \times L \to [0, 1]$ a strength indicator (based on $i_0$ and $i_1$) if and only if f is non-decreasing in both and increasing in at least one of its arguments $i_0$ and $i_1$, and $f_{i_0, i_1} = 1$ for $i_0 = i_1 = 1$.

5. Expected Informativeness as One Way of Weighing

Having tried to make plausible that $i_0$ and $i_1$ measure informativeness per se and informativeness about the data, respectively, let us now turn back to the distance measure d and the Joyce-Christensen measure s.

The two conflicting goals of informativeness and plausibility are equally important for d and s – and they are all that matters for them. Hence, other things being equal – these other things being the probabilities (plausibility values) of the hypotheses given the data E and the background information B – the overall d- or s-value of hypothesis H relative to E and B is the greater, the higher the informativeness of H (in the respective sense).

Clearly, if one knows the truth values of the theories one is assessing, then the plausibility of a theory’s being true is of no interest anymore. In this case all that matters is how informative the theories are. Yet in general we do not know these truth values.
Hence we consider how plausible it is that they are true in the world we are in, and how informative they are (about this world). Then we form their overall value by combining these two parameters in some suitable way.

One such way immediately suggests itself: assign H as its overall value its expected informativeness:

\[ E(i_0) = \Pr(\neg H \mid B) \cdot \Pr(H \mid E \wedge B) - \Pr(\neg\neg H \mid B) \cdot \Pr(\neg H \mid E \wedge B), \]
\[ E(i_1) = \Pr(\neg H \mid \neg E \wedge B) \cdot \Pr(H \mid E \wedge B) - \Pr(\neg\neg H \mid \neg E \wedge B) \cdot \Pr(\neg H \mid E \wedge B). \]

A little bit of reformulation shows that

\[ E(i_0) = d_{\Pr}(H, E, B) \quad\text{and}\quad E(i_1) = s_{\Pr}(H, E, B). \]

(Writing $p = \Pr(H \mid E \wedge B)$ and $a = \Pr(H \mid B)$, the first expression is $(1 - a)p - a(1 - p) = p - a$; the second works the same way with $\Pr(H \mid \neg E \wedge B)$ in place of a.) So once again, d and s are exactly alike in the way they combine or weigh between informativeness and plausibility – which is to form the expected informativeness (cf. Hintikka and Pietarinen 1966 and Levi 1961, 1963, but also Hempel 1960). Their sole difference lies in the way they measure informativeness. In this sense, part of the discussion about the right measure of incremental confirmation is a discussion about the right measure of semantic information.

The measures $i_3$, cont, and inf again do poorly: $E(i_3) = E(\mathit{cont}) = 0$ and

\[ E(\mathit{inf}) \gtreqless 0 \iff \Pr(H \mid E \wedge B) \gtreqless \Pr(\neg H \mid E \wedge B). \]

Hence only inf gives a non-trivial answer, viz. to maximize probability. But then we can simply stick to probabilities and need not employ inf.

6. Revealing the Confirmational Structure

The preceding suggests the following answer to the question what goal incremental confirmation is supposed to further: Science aims at informative truth, and one should stick to incrementally well confirmed theories, because incremental confirmation takes one to (the most) informative (among all) true theories. The question is, of course, whether and in what sense this holds true.

When is one theory at least as informative as another? Well, if the first theory logically implies the second one, then the first theory is at least as informative as the second one. When else?
In general, there is no further condition that applies equally to all probability measures Pr. Just as the only Pr-independent condition for theory $H_1$ to be at least as probable as theory $H_2$ is that $H_2$ logically implies $H_1$, so the above is the only Pr-independent condition for $H_1$ to be at least as informative as $H_2$.

Hence, given a possible world (possibility, model) $\omega \in \mathrm{Mod}(B)$, $H_1$ is to be preferred over $H_2$ in ω if $H_1$ is true in ω, but $H_2$ is false in ω; or if $H_1$ and $H_2$ have the same truth value in ω, and $H_1$ logically implies $H_2$ but $H_2$ does not logically imply $H_1$. If H is logically true, then H is preferred in ω over any $H_2$ which is false in ω. On the other hand, any contingent $H_1$ that is true in ω is preferred over H, because these $H_1$s are not only true in ω; they are also more informative than H. Similarly, if H is logically false, then H is worse in ω than any theory that is true in ω, but better than any theory that is false in ω (because they are all less informative than H).

In this way each possible world ω induces a partial order among all theories⁴: On the positive side one has all theories that are contingently true in ω, and on the negative side there are all theories that are contingently false in ω. In between there are the logically determined theories. Among the true theories on the positive side, the most informative, i.e. the complete theory about ω, is on top, followed by all true hypotheses it logically implies, partially ordered according to the logical consequence relation. This order goes all the way down to the least informative among all true theories, the tautology, which is placed at the bottom of the positive side. On that same level is the most informative among all false theories, the contradiction, followed by all contingently false hypotheses, again partially ordered according to the logical consequence relation. Let us call this partial order the confirmational structure of ω.
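The two preference clauses just stated can be made concrete on a small finite model. The following Python sketch is only an illustration under stated assumptions: a hypothetical three-world model in which a theory is identified with the set of worlds where it is true, so that logical implication becomes set inclusion. The function names (`all_theories`, `strictly_implies`, `preferred`) are invented for this example, and the code transcribes the two clauses literally without modelling probabilities or evidence.

```python
from itertools import combinations

# Assumption: a hypothetical three-world model. A theory is identified with
# the set of worlds in which it is true, so logical implication is inclusion.
WORLDS = frozenset({"w1", "w2", "w3"})


def all_theories(worlds=WORLDS):
    """All propositions over `worlds`, contradiction and tautology included."""
    items = sorted(worlds)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def strictly_implies(h1, h2):
    """H1 implies H2 but H2 does not imply H1 (proper subset)."""
    return h1 < h2


def preferred(h1, h2, omega):
    """H1 is preferred over H2 in world omega, by the two clauses in the text."""
    true1, true2 = omega in h1, omega in h2
    if true1 and not true2:
        return True  # a truth beats a falsehood
    return true1 == true2 and strictly_implies(h1, h2)


omega = "w1"
complete = frozenset({omega})  # the complete theory about omega
others = [h for h in all_theories() if h != complete]

# The complete theory about omega sits on top of the induced order.
assert all(preferred(complete, h, omega) for h in others)
# No theory false in omega is ever preferred over one true in omega.
assert not any(preferred(f, t, omega)
               for f in all_theories() if omega not in f
               for t in all_theories() if omega in t)
```

Representing theories as sets of worlds sidesteps the issue of logically equivalent sentences, since equivalent sentences pick out the same set.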
For a given possibility ω, we would like a function f to stabilise to the correct answer in the sense that f gets the confirmational structure of ω right after finitely many steps (pieces of evidence from ω), and continues to do so forever without necessarily halting (or giving any other sign that it has arrived at the true answer) – cf. Kelly (1996). In general, stabilisation to the correct answer is a stronger requirement than convergence to the correct answer. However, the Gaifman and Snir convergence theorem actually gives rise to a measure 1 stabilisation result (assign 1 to H if its probability exceeds .5, and 0 otherwise).

Let $e_0, \ldots, e_n, \ldots$ be a sequence of sentences all of which are true in $\omega \in \mathrm{Mod}(B)$. A possibly partial function $f : L \times L \times L \to \mathbb{R}$ reveals the confirmational structure of ω when presented $(e_i)_{i \in \mathbb{N}}$ iff for any contingent $H_1, H_2 \in L$, and any H ∈ L:

1. $\omega \models H_1,\ \omega \not\models H_2 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E_m, B) > 0 > f(H_2, E_m, B)$
2. $\omega \models H_1,\ \omega \models H_2,\ H_1 \vdash H_2 \nvdash H_1 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E_m, B) > f(H_2, E_m, B) > 0$
3. $\omega \not\models H_1,\ \omega \not\models H_2,\ H_1 \vdash H_2 \nvdash H_1 \Rightarrow \exists n\, \forall m \ge n:\ 0 > f(H_1, E_m, B) > f(H_2, E_m, B)$
4. $\models H$ or $\models \neg H \Rightarrow \forall m \ge n:\ f(H, E_m, B) = 0$,

where $E_m$ = ∧ 0≤i […] 0 > $r_{\Pr}(H_2, E^{\omega}_m, B)$, r = r, l, where $E^{\omega}_m$ = ∧ 0≤i […]

\[ \forall \varepsilon > 0\ \exists \delta_\varepsilon > 0\ \forall s_1, s_2, t_1, t_2 \in [0, 1]:\ s_1 > s_2 + \varepsilon \ \&\ t_1 > t_2 - \delta_\varepsilon \Rightarrow f(s_1, t_1) > f(s_2, t_2). \]

(The $s_i$ are possible values of i, and the $t_i$ are possible values of p.)

Continuity in this general form is not necessary. It suffices that Demarcation is conjoined with Continuity in Certainty.

3. Continuity in Certainty: Any surplus in informativeness succeeds, if plausibility becomes certainty.

\[ \forall \varepsilon > 0\ \forall t_i, t'_i \in [0, 1],\ t_i, t'_i \to_i \left\{ \begin{array}{c} 1 \\ 0 \end{array} \right.\ \exists n\, \forall m \ge n\ \forall s_m, s'_m \in [0, 1]:\ s_m > s'_m + \varepsilon \Rightarrow f(s_m, t_m) > f(s'_m, t'_m). \]
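As a worked check (an illustration added under stated assumptions, not a result quoted from the sources), consider the additive combination $f(s, t) = s + t - 1$ of an informativeness value s and a plausibility value t. This is the form that the rewritings in section 3 give to the measures d and s, with $i_0$ or $i_1$ in the role of the informativeness value. Reading Continuity in Certainty as requiring $t_i$ and $t'_i$ to converge to a common limit (1 or 0), the condition can be verified directly for $s_m > s'_m + \varepsilon$:

```latex
\begin{aligned}
f(s_m, t_m) - f(s'_m, t'_m)
  &= (s_m - s'_m) + (t_m - t'_m) \\
  &> \varepsilon - \lvert t_m - t'_m \rvert .
\end{aligned}
```

Since $t_m - t'_m \to 0$, there is an n with $\lvert t_m - t'_m \rvert < \varepsilon$ for all m ≥ n, and from then on the difference is positive. The same computation with $\delta_\varepsilon = \varepsilon$ covers Continuity in its general form: $s_1 > s_2 + \varepsilon$ and $t_1 > t_2 - \varepsilon$ give $f(s_1, t_1) - f(s_2, t_2) > \varepsilon - \varepsilon = 0$.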
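Theorem 2 Let Pr be a regular probability on L, let $\{e_i : i \in \mathbb{N}\} \subseteq L$ separate $\mathrm{Mod}_L$, let f be a function of, among others, i and p satisfying Continuity in Certainty and Demarcation, and let $\Pr^*$ be the unique probability measure on the smallest σ-field $\mathcal{A}$ containing the field $\{\mathrm{Mod}(A) : A \in L\}$ such that for all H ∈ L: $\Pr(H) = \Pr^*(\mathrm{Mod}(H))$, where $\mathrm{Mod}(A) = \{\omega \in \mathrm{Mod}_L : \omega \models A\}$. Then there exists $X \in \mathcal{A}$ with $\Pr^*(X) = 1$ such that the following holds for every ω ∈ X, any two contingent $H_1, H_2 \in L$, and every H ∈ L:

1. $\omega \models H_1,\ \omega \not\models H_2 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E^{\omega}_m) > 0 > f(H_2, E^{\omega}_m)$
2. $\omega \models H_1,\ H_1 \vdash H_2 \nvdash H_1 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E^{\omega}_m) > f(H_2, E^{\omega}_m) > 0$
3. $\omega \not\models H_2,\ H_1 \vdash H_2 \nvdash H_1 \Rightarrow \exists n\, \forall m \ge n:\ 0 > f(H_1, E^{\omega}_m) > f(H_2, E^{\omega}_m)$
4. $\models H$ or $\models \neg H \Rightarrow \forall m:\ f(H, E^{\omega}_m) = 0$.

However, even Continuity in Certainty is not necessary. The necessary and sufficient condition for revealing the confirmational structure in almost every world when presented separating data is this:

Definition 1 A possibly partial function $f : L \times L \times L \to \mathbb{R}$ is a Gaifman and Snir assessment function iff for every probability Pr on a Gaifman and Snir language L (as described in section 2) and every $\{e_i : i \in \mathbb{N}\} \subseteq L$ separating $\mathrm{Mod}_L$ there is $X \in \mathcal{A}$ with $\Pr^*(X) = 1$ such that for all ω ∈ X and all $m \in \mathbb{N}$:

I. $H_1 \models H_2 \not\models H_1,\ \Pr(H_1 \mid E^{\omega}_m) \to_m \left\{ \begin{array}{c} 1 \\ 0 \end{array} \right. \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E^{\omega}_m) > f(H_2, E^{\omega}_m)$.
II. $\models H_1,\ \models \neg H_2,\ \Pr(E^{\omega}_m) > 0 \Rightarrow f(H_1, E^{\omega}_m) = f(H_2, E^{\omega}_m) = 0$.

Definition 2 Let Pr be a probability on a Gaifman and Snir language L and let $\{e_i : i \in \mathbb{N}\} \subseteq L$ separate $\mathrm{Mod}_L$. A possibly partial function $f : L \times L \times L \to \mathbb{R}$ reveals the confirmational structure of $\Pr^*$-almost every world $\omega \in \mathrm{Mod}_L$ when presented separating $(e_i)_{i \in \mathbb{N}}$ iff there is $X \in \mathcal{A}$ with $\Pr^*(X) = 1$ such that for all ω ∈ X, all contingent $H_1, H_2 \in L$, and all H ∈ L:

1. $\omega \models H_1,\ \omega \not\models H_2 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E^{\omega}_m) > 0 > f(H_2, E^{\omega}_m)$.
2. $\omega \models H_1,\ H_1 \models H_2 \not\models H_1 \Rightarrow \exists n\, \forall m \ge n:\ f(H_1, E^{\omega}_m) > f(H_2, E^{\omega}_m) > 0$.
3. $\omega \not\models H_2,\ H_1 \models H_2 \not\models H_1 \Rightarrow \exists n\, \forall m \ge n:\ 0 > f(H_1, E^{\omega}_m) > f(H_2, E^{\omega}_m)$.
4.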
$\models H$ or $\models \neg H \Rightarrow \forall m:\ f(H, E^{\omega}_m) = 0$.

f reveals the confirmational structure of almost every world when presented separating data iff for any probability Pr on a Gaifman and Snir language L and any $\{e_i : i \in \mathbb{N}\} \subseteq L$ separating $\mathrm{Mod}_L$: f reveals the confirmational structure of $\Pr^*$-almost every world $\omega \in \mathrm{Mod}_L$ when presented separating $(e_i)_{i \in \mathbb{N}}$.

Theorem 3 A possibly partial function $f : L \times L \times L \to \mathbb{R}$ reveals the confirmational structure of almost every world when presented separating data iff f is a Gaifman and Snir assessment function.

One reason why I still opt for the more general Continuity condition is that it depends on the underlying convergence theorem which conditions are necessary and sufficient for revealing the confirmational structure in so and so many worlds when presented such and such data. More importantly, in the context of theory assessment (Huber 2005) the idea behind the use of these limit considerations is that they provide a theoretical justification for adopting the proposed conditions in the here and now. When assessing theories we cannot wait until we have arrived at the point of stabilisation for these theories. In fact, in general we will not know when we have reached that point. We need to make our evaluations here and now, where the probability values are somewhere in between their maximal and minimal values, and we have no idea in which direction they will eventually converge (if they do so at all). Hence a theory of theory assessment needs to answer the question what to do when facing such a situation. Continuity gives an answer, but Continuity in Certainty does not. However, we also need to justify this answer – and we do so by appealing to the fact that when we satisfy Continuity in the special case when the probability values converge, we almost surely reveal the confirmational structure. As we usually do not know whether our probabilities have started to converge, we should always satisfy Continuity.

7.
Conclusion

I started from the question: Why should one stick to well confirmed theories rather than to any other theories? The answer we got from absolute Bayesian confirmation theory is that one should stick to absolutely well confirmed theories, because absolute confirmation almost surely takes one to true theories. I continued by looking for an answer from incremental Bayesian confirmation theory. This answer should be different from the previous one in order for incremental confirmation to improve on absolute confirmation.

It turned out that three popular measures of incremental confirmation, viz. the distance measure d, the Joyce-Christensen measure s, and the Carnap measure c, give an interesting answer: One should stick to incrementally well confirmed theories, because incremental confirmation almost surely takes one to (the most) informative (among all) true theories.

However, although all measures of incremental confirmation separate contingently true from contingently false theories, not all of them distinguish between informative and uninformative true and false theories. The log-ratio measure r does not distinguish between informative and uninformative false theories, and the log-likelihood ratio measure l distinguishes neither between informative and uninformative true theories nor between informative and uninformative false theories. A sufficient condition for revealing the confirmational structure of almost every world when presented separating data is the conjunction of Continuity and Demarcation, the core principle of the plausibility-informativeness theory of theory assessment (Huber 2005).

Notes

¹ The Gaifman and Snir framework is not rich enough for proper theory assessment. The reason is that the “theories” whose truth values one converges to by conditioning on some separating set of data sentences are formulated within the same “empirical” vocabulary as are the data sentences.
So there is no room for theoretical terms in the sense that the probability of a theory whose formulation contains theoretical terms not occurring in any data sentence does not necessarily converge to its truth value (in ω) when one conditionalizes on these observational data sentences (from ω).

² In Levi (1967), $i_3$ is proposed as, roughly, a measure for the relief from agnosticism afforded by accepting H as strongest relative to total evidence E ∧ B. For cont and inf the reader is referred to Hintikka and Pietarinen (1966).

³ I owe this graphical illustration to Luc Bovens.

⁴ Here and elsewhere one should, of course, speak of axiomatizations of theories. I also ignore that for each sentence there are infinitely many distinct, but logically equivalent sentences.

References

[1] Bar-Hillel, Yehoshua (1952), “Semantic Information and Its Measures”, in Transactions of the Tenth Conference on Cybernetics. Josiah Macy, Jr. Foundation, New York, 33-48. Reprinted in Bar-Hillel, Yehoshua (1964), Language and Information. Selected Essays on Their Theory and Application. Reading, MA: Addison-Wesley, 298-310.

[2] —— (1955), “An Examination of Information Theory”, in Philosophy of Science: 22, 86-105. Reprinted in Bar-Hillel, Yehoshua (1964), Language and Information. Selected Essays on Their Theory and Application. Reading, MA: Addison-Wesley, 275-297.

[3] Bar-Hillel, Yehoshua, and Rudolf Carnap (1953), “Semantic Information”, in The British Journal for the Philosophy of Science: 4, 147-157.

[4] Carnap, Rudolf (1962), Logical Foundations of Probability, 2nd ed. Chicago: University of Chicago Press.

[5] Carnap, Rudolf, and Yehoshua Bar-Hillel (1952), An Outline of a Theory of Semantic Information, Technical Report 247, Research Laboratory of Electronics, Massachusetts Institute of Technology. Reprinted in Bar-Hillel, Yehoshua (1964), Language and Information. Selected Essays on Their Theory and Application. Reading, MA: Addison-Wesley, 221-274.
[6] Christensen, David (1999), “Measuring Confirmation”, in Journal of Philosophy: 96, 437-461.

[7] Earman, John (1992), Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory. Cambridge, MA: MIT Press.

[8] Fitelson, Branden (1999), “The Plurality of Bayesian Measures of Confirmation and the Problem of Measure Sensitivity”, in Philosophy of Science: 66, S362-S378.

[9] Fitelson, Branden (2001a), “A Bayesian Account of Independent Evidence with Applications”, in Philosophy of Science: 68, S123-S140.

[10] Fitelson, Branden (2001b), Studies in Bayesian Confirmation Theory. PhD Dissertation. Madison, WI: University of Wisconsin-Madison.

[11] Gaifman, Haim, and Marc Snir (1982), “Probabilities over Rich Languages, Testing, and Randomness”, in Journal of Symbolic Logic: 47, 495-548.

[12] Hempel, Carl Gustav (1960), “Inductive Inconsistencies”, in Synthese: 12, 439-469. Reprinted in Hempel, Carl Gustav (1964), Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: The Free Press, 53-79.

[13] —— (1962), “Deductive-Nomological vs. Statistical Explanation”, in Feigl, Herbert, and Grover Maxwell (eds.), Minnesota Studies in the Philosophy of Science, vol. 3. Minneapolis: University of Minnesota Press, 98-169.

[14] Hendricks, Vincent F. (2004), Forcing Epistemology. Cambridge: Cambridge University Press.

[15] Hintikka, Jaakko, and Juhani Pietarinen (1966), “Semantic Information and Inductive Logic”, in Hintikka, Jaakko, and Patrick Suppes (eds.), Aspects of Inductive Logic. Amsterdam: North-Holland, 96-112.

[16] Huber, Franz (2005), “Assessing Theories”, in Duncan Pritchard and Vincent F. Hendricks (eds.), New Waves in Epistemology. Aldershot: Ashgate.

[17] Joyce, James M. (1999), The Foundations of Causal Decision Theory. Cambridge: Cambridge University Press.

[18] Kelly, Kevin T. (1996), The Logic of Reliable Inquiry. Oxford: Oxford University Press.
[19] Levi, Isaac (1961), “Decision Theory and Confirmation”, in Journal of Philosophy: 58, 614-625.

[20] —— (1963), “Corroboration and Rules of Acceptance”, in The British Journal for the Philosophy of Science: 13, 307-313.

[21] —— (1967), Gambling With Truth. An Essay on Induction and the Aims of Science. London: Routledge.

[22] Milne, Peter (1996), “log [P (h | eb) /P (h/b)] is the One True Measure of Confirmation”, in Philosophy of Science: 63, 21-26.