An Axiomatic Theory of Inductive Inference∗

Luciano Pomatto†    Alvaro Sandroni‡

April 17, 2017

Abstract

This paper develops an axiomatic theory of induction that speaks to the recent debate on Bayesian orgulity. It shows the exact principles associated with the belief that data can corroborate universal laws. We identify two types of disbelief about induction: skepticism that the existence of universal laws of Nature can be determined empirically, and skepticism that the true law of Nature, if it exists, can be successfully identified. We formalize and characterize these two dispositions towards induction by introducing novel axioms for subjective probabilities. We also relate these dispositions to the (controversial) axiom of sigma-additivity.

∗ We are grateful to Nabil Al-Najjar, Frederick Eberhardt and Alvaro Riascos. All remaining errors are our own.
† Division of the Humanities and Social Sciences, Caltech, Pasadena, CA, 91125. (e-mail: luciano@caltech.edu)
‡ Department of Managerial Economics and Decision Sciences, Kellogg School of Management, Northwestern University, Evanston, IL 60208 (e-mail: sandroni@kellogg.northwestern.edu).

1 Introduction

We seek an axiomatic understanding of specific problems of induction. Informally, induction is taken to mean the process of using empirical evidence to validate general claims, and, for our purposes, it is critical to differentiate between two types of epistemic skepticism about induction.

One may doubt it is possible to know whether Nature abides by any law.¹ Any empirical regularity may be a temporary fluke. Hence, patterns can suggest, but not prove, the existence of universal laws. So, one may ascribe non-vanishing odds to the idea that Nature does not follow any law, no matter how numerous and consistent the data may grow to be. We refer to this disposition as Humean skepticism, with the caveat that we do not claim to provide a complete representation of Hume’s (and other authors’) actual statements.
In addition, even if it is taken for granted that Nature abides by a law, one may be skeptical that such a law can be inferred with arbitrarily high precision, even when the data grows without bound. Let’s say that in each period either 0 or 1 must occur and that 1 has been observed every period, over a long time, say t periods. The data is consistent with the law “Nature produces only 1,” and with the law “Nature produces 1 until period t and 0 afterwards,” among (infinitely) many other laws. So, one may maintain a non-vanishing doubt that empirical evidence can validate a specific law, even under the assumption that the data follow one. We refer to this form of skepticism as Goodman’s skepticism, with the same caveat as above.

We consider a probabilistic framework in which an agent, named Bob, is endowed with a coherent view of the world (i.e., a finitely additive probability measure) over paths (i.e., infinite binary sequences). As the data unfolds, Bob updates his view of the world through Bayes’ rule.² No restrictions are placed on which paths may be produced. So, no relationship between past and future is, a priori, required (apart from the idea that either 0 or 1 occurs each period). Bob is not dogmatic about induction either. He believes that the data may or may not follow eternal laws.

¹ The definition of “law” is subjective, as we make clear in the main text.
² This framework follows de Finetti’s (1970) viewpoint that inference involves personal judgements of likelihood that must be formalized in a coherent way. See de Finetti (1970) for a connection between coherent views of the world and Dutch books. We make no original attempt to justify Bayesianism and subjectivism.

Through the lens of this formal framework, we formalize Hume’s and Goodman’s skepticisms by introducing two novel axioms for subjective probabilities. These axioms refer to Bob’s belief as the data unfolds and becomes arbitrarily numerous.
Bob’s view of the world is inductive in the sense of Hume, as we define it, if, under data compatible with laws, Bob expects to become almost convinced that Nature indeed follows laws. This axiom rules out Humean skepticism about induction. Bob’s view of the world is inductive in the sense of Goodman if he expects to successfully identify Nature’s law up to a vanishing degree of error, conditional on Nature abiding by one. This axiom rules out Goodman’s skepticism about induction.

A natural starting line of inquiry is the extent of the connection between the two problems of induction. We start by asking whether Goodman’s skepticism implies Hume’s skepticism, and whether the converse implication holds. Neither is true. Some coherent views exhibit Goodman’s skepticism but not Hume’s skepticism and, conversely, some coherent views exhibit Hume’s skepticism but not Goodman’s skepticism. Thus, these two types of skepticism are not logically nested.

Of particular interest are the coherent views that express skepticism in the sense of Goodman, but not in the sense of Hume. If, say, confronted with the question of whether or not the data is generated by a Turing machine, such views of the world express conviction that with enough data it is possible to make this determination with near certainty. In spite of this remarkable confidence in the capacity of Bayes’ rule to address this apparently insurmountable inference problem, the same view of the world, if confronted with the (arguably simpler) question of which Turing machine generates the data, assuming that one does, remains skeptical that this determination can be made with arbitrarily high precision.

The celebrated theorems of Levy (1937), Doob (1949) and Blackwell and Dubins (1962) make clear that under σ-additivity a Bayesian must believe that his opinion about a given hypothesis will converge to the truth.
In particular, σ-additivity excludes both Hume’s and Goodman’s skepticism, and therefore it implies a form of “Bayesian orgulity” (Belot, 2013). Different results were obtained by Elga (2015), Juhl and Kelly (1994), and Kelly (1996), among others, who have shown that there exist non σ-additive coherent views of the world which allow for Humean skepticism. Hence, in the absence of σ-additivity, epistemic skepticism is allowed.

Our results reveal a complex relationship between skepticism and subjective probability. There are non σ-additive coherent views that rule out Humean skepticism and others yet that rule out Goodman’s skepticism. Thus, the spectrum of coherent views is rich enough to allow, at the same time, both orgulity and skepticism about induction. In particular, in the absence of σ-additivity, orgulity and skepticism are allowed. Orgulity is not an exclusive property of σ-additivity and may hold with or without it. This is a difficulty for a clear-cut theory of induction that seeks the root causes of orgulity and skepticism about induction.

We show that while lack of σ-additivity does not assure Hume’s skepticism and Goodman’s skepticism, it always assures skepticism in at least one of these two ways. This is demonstrated by the structure theorem for coherent views of the world. It shows that a coherent view is inductive in the sense of Hume and in the sense of Goodman if and only if it is σ-additive. Thus, σ-additivity is the definitive condition that assumes away both Hume’s and Goodman’s skepticism about induction. It is not necessary to rule out either type of skepticism, but it is required to rule out both types simultaneously.

The interpretation of the structure theorem requires considerable care.
The equivalence between induction and σ-additivity may suggest that the problems of how to conceptually justify either induction or σ-additivity are, in fact, one and the same problem and that σ-additivity is the root and only cause of the conviction in the ultimate success of induction. This reading of the structure theorem may prove incomplete.

Consider an alternative approach, where the focus is not on using data to ultimately (i.e., in the limit as data grows) uncover eternal laws of Nature, but on making predictions within a practical (i.e., bounded) future. Consider the case where a long sequence of 1’s has been observed. One may wonder if “Nature produces only 1’s.” One may also wonder whether or not “Nature will produce only 1’s for the next 1000 periods.” Our last result concerns the latter case, where Bob remains agnostic about the validity of universal claims, but asks whether regularities in the past can be used to make sharp predictions about a bounded future. This result shows that any coherent view of the world, no matter how it is formed, must be confident that multiple repetitions of Bayes’ rule transform pattern data into a near infallible guide to a bounded future.

Moreover, after enough data there must be high confidence in limited, but correct, inductive inferences. This holds even if, a priori, no assumption is made on the relationship between past and future, in the sense that the data may unfold according to any path, including those without patterns. It also holds even if Bob is a skeptic with regard to the use of data to ultimately validate specific or general laws. Eventually there must be high confidence that the past is a limited, but successful, guide to the future. This conclusion follows from conditional probability alone, and holds for any coherent view of the world. Thus, some confidence in inductive inference follows from coherence.
1.1 Literature on Bayesian Orgulity

The paper speaks to the recent debate on “Bayesian orgulity,” which originated with Belot (2013, 2015). Central to Belot’s thesis is the argument that the convergence results of Levy, Doob and Blackwell and Dubins are proof that Bayesianism implies epistemic arrogance. The debate has spurred different views. Huttegger (2015) argued that the issue of convergence to the truth should be put in the context of a long, but finite horizon. Weatherson (2015) revisited Belot’s argument from the perspective of Bayesian imprecise probability. The work closest to this paper is Elga (2015), who showed the existence of non σ-additive subjective probabilities expressing epistemic humility.

This paper is also connected to the work of Kelly (1996), who formalized the connection between inductive inference and finitely additive probabilities, to the work of Gilboa and Samuelson (2012), who analyzed how subjectivity can enhance inductive inference, and to Al-Najjar, Pomatto and Sandroni (2014), who study how different dispositions towards induction can affect incentive problems.

2 Basic Concepts and Results

2.1 Patterns and Coherence

An agent, named Bob, observes, in every period, one of two possible outcomes, 0 or 1. The set Ω = {0, 1}∞ is the set of all paths or infinite histories of outcomes. Given a path ω and a time t, we denote by ωt the set of paths that share with ω the same first t outcomes. We call ωt a finite history. We fix an algebra Σ of subsets of Ω (subsets of Ω mentioned in the text belong to Σ, even when not stated explicitly). The agent is endowed with a finitely additive probability P on Σ.³ The measure P captures Bob’s subjective viewpoint on how the outcomes will evolve. We refer to P as a coherent view of the world.

Some paths are governed by a law or pattern and some are not. For instance, the path 1∞ = (1, 1, 1, ...)
follows the law “Nature produces only the outcome 1.” A classic example of a pattern is given by periodic paths, defined by repeated cycles as in (1, 0, 1, 0, ...) or (1, 1, 0, 0, 1, 1, ...) or, more generally, eventually periodic paths, i.e., sequences that are periodic after some point in time. Both examples are subsumed by the class of computable paths, which consists of all sequences that can be generated by a Turing machine (i.e., all paths that are the output of some finite program running on a computer with unlimited storage).

In order to speak of induction it is critical to demarcate paths governed by a law from paths that do not follow any discernible pattern. This distinction can be made in many different ways, and the precise way in which this determination should be made is orthogonal to the central questions in this paper. So, we need not take a definitive stance on this matter. Instead, we assume that the final determination of what constitutes a law is subjective. That is, Bob determines which set of paths A ⊆ Ω are the ones that abide by a law. The complement of A is the set of paths that, according to Bob, do not follow any pattern. For simplicity, we often refer to paths in A as laws and to paths not in A as non-laws. We make the following assumptions on A and P.

³ That is, a function P : Σ → [0, 1] such that P(Ω) = 1 and for every pair of disjoint sets E1 and E2 in Σ it satisfies P(E1 ∪ E2) = P(E1) + P(E2).

Assumption 1. A is countable.

While flexible enough to capture many formal definitions of pattern, including the set of periodic, eventually periodic or computable paths, the assumption is not, however, without loss of generality. It greatly simplifies the analysis because it rules out both conceptual and technical difficulties that are outside the scope of this paper. The main implication of Assumption 1 is that it allows a view of the world to assign strictly positive probability to each lawlike path.
If, for example, A were uncountable, then Bob would have to assign zero probability to most individual laws. Formally:

Remark 1. For any coherent view of the world, there can be, at most, countably many paths with strictly positive probability.

The result applies to Bob’s view of the world both before and, by Bayes’ rule, after the data is observed.

An alternative approach, which allows one to capture more complex inference problems, is to consider non-deterministic laws. In Section 7 we discuss this alternative approach and, in particular, the difficulties it involves.

Assumption 2. P({ω}) > 0 for every ω ∈ A, and P(Ac) > 0.

Bob believes that any law in his set A is, a priori, possible. Bob also does not rule out the possibility that Nature does not follow any pattern. This assumption enables Bayesian inferences about universal laws.⁴ Assumption 2 simplifies the notation and the statement of some of the results, but can be substantially weakened. Formally, all results in the paper continue to hold if their statements are modified by replacing the condition “for every ω ∈ A” with “for every ω ∈ A such that P({ω}) > 0”.

⁴ As is well known, Bayesian inference about a hypothesis requires the latter to have initial positive probability. See, for example, Broad (1918), Wrinch and Jeffreys (1919) and Edgeworth (1922), among others. See also Zabell (2011) for these and other references.

Assumption 3. Given any finite history ωt, A ∩ ωt ≠ ∅ and Ac ∩ ωt ≠ ∅.

Given any finite history ωt, no matter how complex or simple it may be, there are infinitely many laws that are compatible with it (i.e., there are infinitely many laws ω ∈ A such that the first t outcomes are equal to ωt) as well as uncountably many non-laws that are also compatible with it. So, for any data, Bob can never rule out the hypothesis that Nature abides by laws nor the hypothesis that it does not. This captures the idea that there are many different ways in which past and future can relate to each other.
The history 1, 1, 1, 1, 1 is equally compatible with the law “always 1” and with the law “1 in the first 5 periods and 0 afterwards.” In sum, the assumption ensures that it is not possible to deduce conclusively, from any finite data, whether or not Nature abides by laws, nor, if so, which law it abides by. Hence, it makes clear that induction, in this paper, refers to probabilistic inferences that can approach certainty, but never reach it in finite time.

This assumption is also satisfied by all canonical definitions of patterns, and so is useful for the interpretation of the results. However, our results remain unchanged under the weaker condition that there exists (at least) one law ω̄ with the property that for every t there is a law ω ∈ A distinct from ω̄ such that ωt = ω̄t. So, upon observing t outcomes matching the path ω̄, Bob cannot conclude with certainty that the law, if it exists, must be ω̄.

Finally, we emphasize that while our main examples of laws and patterns refer to celebrated ideas such as Turing machines and periodicity, our results would continue to hold even if Bob had an eccentric understanding of what is a law or pattern. That is, none of our results depend on the labels given to laws and non-laws, nor do they depend on the nature of the paths that are categorized as laws and non-laws (provided that Assumption 1 on the existence of at most countably many laws holds). The key point is that whatever Bob’s understanding of what constitutes laws and patterns might be, he privileges paths in the set A by assigning strictly positive probability to each of them. This is a non-judgemental, but meaningful, differentiation of laws and non-laws because, as we discussed, only countably many paths can have strictly positive probability.

We fix for the remainder of the paper a set of paths A satisfying Assumptions 1 and 3. We also restrict attention to views of the world that satisfy Assumption 2.
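The notion of a law being compatible with a finite history can be made concrete with a small sketch. The code below is an illustration only: the three named laws, the helper `eventually_periodic` and the function `compatible_laws` are hypothetical and not part of the paper, and a finite family of laws such as this one does not satisfy Assumption 3 (for example, no law in it is compatible with a history that starts with 0).

```python
from itertools import chain, cycle, islice


def eventually_periodic(prefix, period):
    """Describe the path `prefix` followed by `period` repeated forever.

    Returns a function that maps t to the first t outcomes of that path.
    """
    def first(t):
        return tuple(islice(chain(prefix, cycle(period)), t))
    return first


# A hypothetical, finite family of laws used only for illustration.
LAWS = {
    "always 1": eventually_periodic((), (1,)),
    "1 for 5 periods, then 0": eventually_periodic((1,) * 5, (0,)),
    "alternate 1, 0": eventually_periodic((), (1, 0)),
}


def compatible_laws(history, laws=LAWS):
    """Names of the laws whose first len(history) outcomes equal `history`."""
    t = len(history)
    return sorted(name for name, first in laws.items()
                  if first(t) == tuple(history))
```

In this sketch, five consecutive 1's leave both "always 1" and "1 for 5 periods, then 0" standing, while a sixth 1 eliminates the latter. Under Assumption 3, by contrast, some rival law always remains compatible with the data.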
2.2 Induction and the Separation Theorem

We now formalize specific forms of induction.

Definition 1. A coherent view of the world P is inductive in the sense of Hume if for every path ω ∈ A,

    P(A | ωt) → 1 as t → ∞.    (1)

From sufficient data with a pattern, Bob ultimately concludes, with probability approaching certainty, that Nature must follow some law. A view of the world that violates (1) is such that the probability of the set A of lawlike paths remains bounded away from 1, regardless of the number of realizations. Any such worldview captures what we refer to as Humean skepticism: Bob maintains a non-vanishing doubt that perhaps Nature does not work through eternal laws, no matter how consistent and numerous the data he observes.

At each point in time, the observed finite history ωt is consistent with a path following a pattern as well as with a path that does not follow a pattern. So, by Bayes’ rule, even a view of the world that is inductive in the sense of Hume will always attach non-zero probability to the event that Nature does not abide by laws (under Assumption 2). What distinguishes between skepticism and inductivity in the sense of Hume is whether or not Bob’s doubt on the regularity of Nature vanishes, as the number of observations that exhibit a pattern goes to infinity.

Definition 2. A coherent view of the world P is inductive in the sense of Goodman if for every path ω ∈ A,

    P({ω} | A ∩ ωt) → 1 as t → ∞.    (2)

If it is granted that Nature abides by some law and sufficient data with a pattern is observed, Bob infers Nature’s true law with increasing precision, and ultimately concludes it is eternal. A view of the world that violates (2) captures what we refer to as Goodman’s skepticism: even assuming that an underlying law exists and that extensive evidence is available, Bob remains skeptical he will ever be able to perfectly single out the data generating law with arbitrarily high confidence.
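To see how the two conditional probabilities in (1) and (2) behave, here is a minimal sketch for one toy view of the world, evaluated along the path "always 1". Everything in it is an assumption made for illustration: the set of laws contains only "always 1" and the laws "1 until period m, then 0" (a simplification that does not satisfy Assumption 3), the weights are geometric, and the non-law part of the view is a fair-coin measure. The function names are hypothetical.

```python
from fractions import Fraction

# Assumed toy view: P = LAMBDA * (prior on laws) + (1 - LAMBDA) * (fair coin).
# Prior on laws: "always 1" has weight 1/2, and the law
# "1 until period m, then 0" has weight 2^-(m+1) for m = 1, 2, ...
LAMBDA = Fraction(1, 2)


def prob_laws_and_history(t):
    """P(A ∩ ωt) when ωt is a run of t consecutive 1's (t >= 1)."""
    if t < 1:
        raise ValueError("t must be at least 1")
    # Laws still compatible with t ones: "always 1" and those with m >= t,
    # whose weights sum to 2^-t.
    return LAMBDA * (Fraction(1, 2) + Fraction(1, 2 ** t))


def prob_history(t):
    """P(ωt): the fair coin gives t ones probability 2^-t and A probability 0."""
    return prob_laws_and_history(t) + (1 - LAMBDA) * Fraction(1, 2 ** t)


def hume_posterior(t):
    """P(A | ωt), the quantity in property (1)."""
    return prob_laws_and_history(t) / prob_history(t)


def goodman_posterior(t):
    """P({always 1} | A ∩ ωt), the quantity in property (2)."""
    return LAMBDA * Fraction(1, 2) / prob_laws_and_history(t)
```

For this particular view both quantities increase towards 1 along runs of 1's without ever reaching it, which is the behaviour that Definitions 1 and 2 require of an inductive view on that path.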
As in the case of induction in the sense of Hume, at no point in time is Bob’s inference problem solved perfectly. He will always attach nonzero odds to multiple paths. However, a view of the world that is inductive in the sense of Goodman expresses confidence that Bob must ultimately put almost all mass on the law generating the data.

The distinction we make here need not be seen as the formal counterpart of the classic and the new riddle of induction (see Goodman (1955) and Stalker (1994), for a discussion) and the above terminology is used mostly as a mnemonic device. Fundamentally, we ask two direct inference questions: Within the present probabilistic framework can one tell, from sufficient data and with arbitrary precision, (1) whether Nature must abide by a law and (2) if so, which law? We now examine the logical connection between these two questions.

The Separation Theorem. There exist coherent views of the world that are inductive in the sense of Hume but not in the sense of Goodman, and views that are inductive in the sense of Goodman but not in the sense of Hume.

The Separation Theorem shows that Hume’s skepticism and Goodman’s skepticism are not logically nested. One does not imply the other. In the Appendix we provide simple examples of views satisfying only one of the properties.

Given the Separation Theorem, it is meaningful to consider those coherent views of the world that express both types of faith in induction.

Definition 3. A coherent view of the world P is inductive if it is inductive in the sense of Hume and is inductive in the sense of Goodman.

Under an inductive view of the world, skepticism about induction vanishes. Bob interprets evidence consistent with a pattern as a sign of the existence of an underlying law of Nature, and expects further evidence to allow him to single out the correct law with virtually exact precision. So, inductive views express great confidence in the power of empirical evidence to predict the future.
This can be expressed as follows:

Definition 4. A coherent view of the world P is confident that enough pattern data transforms the past into a near infallible guide to the future if for every path ω ∈ A,

    P({ω} | ωt) → 1 as t → ∞.    (3)

So, conditional on sufficiently long pattern data ωt, the future is forecasted with an arbitrarily high degree of certainty.

Remark 2. A coherent view of the world P is inductive if and only if it is confident that enough pattern data transforms the past into a near infallible guide to the future.

Indeed, for ω ∈ A we have {ω} ⊆ A ∩ ωt, so P({ω} | ωt) = P({ω} | A ∩ ωt) · P(A | ωt); a product of two probabilities converges to 1 if and only if each factor does, so (3) is equivalent to (1) together with (2). So, partial induction (i.e., induction in the sense of Definition 3) is the necessary and sufficient condition for confidence that sufficient pattern data is a near perfect guide to the future. Remark 2 delivers an initial characterization of induction that will prove useful.

3 Orgulity and σ-additive Coherent Views

This section examines inductive properties of σ-additive coherent views. These results are known and adapted to our framework. We refer to known results as “propositions” and to novel ones as “theorems.”

Proposition 1. If a coherent view of the world is σ-additive then it is inductive.

The proof of this result can be found in Kelly (1995). Under σ-additivity, after multiple observations consistent with a pattern, Bob infers Nature’s underlying law with arbitrary accuracy and concludes with near certainty that Nature cannot follow a different law. However, σ-additivity entails even stronger forms of faith in induction.

Definition 5. A coherent view of the world P is completely inductive in the sense of Hume if

    P(A | ωt) → 1 as t → ∞, for every path ω ∈ A, and
    P(Ac | ωt) → 1 as t → ∞, for P-almost every path ω in Ac.    (4)

A coherent view of the world that is completely inductive in the sense of Hume expresses full confidence that, with sufficient data, laws and non-laws can be distinguished empirically and with near certainty.
So, complete induction in the sense of Hume is an expression of confidence that a remarkably difficult inference problem can be resolved with arbitrarily high precision.

Proposition 2. Any σ-additive coherent view of the world is completely inductive in the sense of Hume.

This result has led Belot (2013) to speak of “Bayesian orgulity.” The basic inference problem is difficult. Yet, σ-additive coherent views are confident that finite, but long enough, data suffices to determine with arbitrarily high precision whether or not Nature is governed by a law. In addition,

Definition 6. A coherent view of the world P is completely inductive if it is completely inductive in the sense of Hume and inductive in the sense of Goodman.

Combining Propositions 1 and 2 yields:

Corollary 1. If a coherent view of the world P is σ-additive then it is completely inductive.

Under σ-additivity, Bob must express the following viewpoint on induction: “I do not know whether Nature works through laws or not, but given sufficient data I will find out with an arbitrarily high degree of certainty. If Nature generates the data based on a law, I will ultimately conclude that Nature works through laws and uncover the law Nature abides by, up to a vanishing error. If the data is not governed by a law, then, in the long run, I will become near certain that Nature does not follow laws. This is true even though any finite data is simultaneously consistent with countably many laws and uncountably many non-laws.”

So, under σ-additivity, Bob believes that Bayes’ rule resolves these essential problems of induction. With sufficient data, Nature’s law is eventually uncovered. A false inference of laws, when Nature follows none, is unlikely. The intuition behind these results is as follows: First, let’s assume, for simplicity, that Nature abides either by the law “always 1” or by a law “1 until period t and 0 thereafter,” for some t > 0.
No sequence of 1’s, either long or short, suffices to infer Nature’s law conclusively, but there is a crucial difference between a short and a long sequence. Ex-ante, the odds of the law “always 1” are fixed and strictly positive. The odds of the laws “1 until some period t ≥ m and 0 thereafter” are arbitrarily small if m is sufficiently high. It is here that the assumption of σ-additivity is used. Under σ-additivity, such tail events must be unlikely. It now follows, by Bayes’ rule, that conditional on a sufficiently long sequence of 1’s, the likelihood of the law “always 1” eventually dominates the likelihood of any competing theories still standing. Thus, under σ-additivity, Bob cannot express Goodman’s skepticism.

The intuition regarding Hume’s skepticism is related, but not identical. Assume, for simplicity, that Nature either abides by the law “always 1” or does not abide by any law. Once again, no sequence of 1’s, either long or short, suffices for conclusive inference. For any sequence of 1’s, no matter how long, there are still many non-laws that are consistent with it. However, the set of non-laws that are consistent with consecutive 1’s until period t shrinks monotonically to the empty set as t goes to infinity. This follows because no non-law is consistent with an infinite sequence of 1’s. So, under σ-additivity, the ex-ante odds of the set of standing non-laws (i.e., those consistent with data of consecutive 1’s until period t) go to zero as t goes to infinity. Hence, by Bayes’ rule, conditional on a sufficiently long sequence of 1’s, the relative likelihood of the law “always 1” is much higher than that of the competing and still standing non-laws. Thus, under σ-additivity, Bob cannot express Hume’s skepticism.

Finally, the intuition regarding property (4) is also similar.
The set of laws consistent with non-pattern data of length t shrinks monotonically to the empty set as t goes to infinity (because no law is consistent with an infinite sequence of non-pattern data). Thus, under σ-additivity, it is unlikely that laws are consistent with long non-pattern data. Hence, property (4) holds and so does complete induction in the sense of Hume.

4 Orgulity and General Coherent Views

As shown, σ-additive coherent views rule out skepticism about induction. We now consider Bob’s conclusions about the ultimate fate of multiple repetitions of Bayes’ rule for general, not necessarily σ-additive, coherent views of the world. We start with an important result, a corollary of Elga (2015) (related results can also be found in Juhl and Kelly (1994) and Kelly (1996)):⁵

Proposition 3. Let ε > 0. There exists a coherent view of the world P such that P(A | ωt) ≤ ε for every t and every ω ∈ A.

The view P displays a complete failure of induction in the sense of Hume. Under P, no evidence can overturn Bob’s initial pessimistic belief on the existence of laws. Hence, σ-additivity suffices to rule out Hume’s skepticism about induction, and this condition cannot be completely dispensed with. Elga (2015) shows that not all coherent views are inductive in the sense of Hume.

On the other hand, the Separation Theorem shows that some non σ-additive coherent views are inductive in the sense of Hume. Moreover, there are also coherent views that are not σ-additive, but nevertheless are inductive in the sense of Goodman. Lack of σ-additivity does not assure skepticism in the sense of Hume and does not assure skepticism in the sense of Goodman either. Other strong forms of induction can also be obtained without σ-additivity.

The Complete Humean Induction Theorem. There exists a coherent view of the world P that is not σ-additive but is completely inductive in the sense of Hume.
Insomuch as confidence about Humean induction must be granted under σ-additivity, the same confidence must also be granted without σ-additivity, for some coherent views of the world. An example of such a view can be found in the Appendix.

Consider, for instance, the case where A is the set of computable paths. The Complete Humean Induction Theorem shows that some, but not all, coherent views of the world express the belief that even a fundamental problem such as whether or not Nature can be reduced to a Turing machine can be solved (up to a vanishing error) empirically, even in the absence of σ-additivity. In this sense, Bayesian orgulity is not restricted to σ-additivity. It extends to other coherent views of the world as well.

⁵ The construction in Elga (2015) does not immediately apply to our framework (where Assumptions 1-3 hold). For completeness, we provide an alternative construction in the Appendix.

5 The Axiomatization of Induction

The Separation and the Complete Humean Induction theorems present a difficulty for the development of a crisp theory of inductive inference. The difficulty is that confidence in solving induction problems is not only a product of well understood conditions such as σ-additivity, but also of properties coherent views might have, which are less understood and intuitively less clear. The Complete Humean Induction Theorem is particularly challenging because it shows that confidence in empirical solutions to strong forms of inference problems can be obtained under conditions other than σ-additivity.

However, let Σ̄ be the smallest algebra that contains all finite histories and all singletons {ω} for ω ∈ A. This is the smallest algebra in which property (3), which is equivalent to a view P being inductive, can be expressed. The key point of this algebra is as follows: it is possible to obtain property (1) and also property (2) without σ-additivity.
It is even possible to combine properties (1) and (4) without σ-additivity (and, hence, produce complete Humean induction). However, on Σ̄, it is not possible to obtain property (3) without σ-additivity. This makes σ-additivity not only sufficient, but necessary for partial induction (and, hence, for complete induction as well). Thus,

The Structure Theorem. A coherent view of the world P is inductive if and only if it is σ-additive on Σ̄.

The Structure Theorem is a full characterization result that delivers an axiomatic understanding of induction. The key result is the demonstration that while lack of σ-additivity does not assure skepticism in the sense of Goodman and does not assure skepticism in the sense of Hume either, it always assures skepticism in at least one of these two senses. So, on Σ̄, any result that holds without σ-additivity holds under some skepticism over induction. Conversely, results that require σ-additivity also require induction.

The collection Σ̄ is smaller than the σ-algebras commonly used in probability theory. While σ-algebras are mathematically convenient under σ-additivity, they do not play a particular role under finite additivity. What makes Σ̄ appealing in the context of induction is that Σ̄ is the simplest (i.e., the smallest) algebra that allows one to distinguish between inductive and non-inductive views of the world. Small algebras such as Σ̄ have an additional advantage. Because Σ̄ is countable, finitely additive measures can be defined on Σ̄ using only elementary mathematics, and without invoking the (uncountable) Axiom of Choice.

6 Pragmatism, Induction and de Finetti

So far, we have focused on induction in the sense of the empirical validation of eternal laws of Nature.
There are, however, other perspectives on induction, such as the one in which Bob is concerned with making accurate predictions about the practical future, rather than uncovering universal laws of nature, or even questioning their existence.6 If a law or theory makes predictions that are accurate within some finite horizon then the theory predicts as if it were correct. Thus, the argument goes, data need not uncover the actual data generating process. Nor does it need to reveal whether or not a law exists. It only needs to allow for accurate predictions for the practical future. To fix ideas, we refer to this perspective as pragmatism, with no claim that our narrow use of this terminology comprehends most associations with this word.

6See Russell (1912, Ch. VI) for a discussion of induction which clearly distinguishes between the two perspectives.

We now revisit the different problems of induction, but from a more pragmatic perspective. In doing so we take a shortcut in the conceptual development. We define pragmatic inductive views as requiring that enough pattern data leads to a near infallible guide to a bounded future, instead of first making a distinction between induction in the sense of Hume and Goodman and then obtaining accurate predictions as a result of both conditions as we did in Remark 2.

Definition 7 A coherent view of the world $P$ is pragmatically inductive if, for every path $\omega \in A$ and every natural number $k$,

$$P(\omega^{t+k} \mid \omega^{t}) \to 1 \quad \text{as } t \to \infty. \tag{5}$$

So, with enough pattern data, Bob is convinced that the next outcomes can be predicted with near certainty. This follows, in Bob's belief, even if Nature abides by no laws or if it abides by a law that cannot be inferred from the data. The only claim is that after enough pattern data Nature behaves as if it abides by a (data-inferred) law for a bounded, but arbitrarily long future.

We now turn to the concept of complete induction in the sense of Hume, from the pragmatic perspective.
Let $\mathcal{U}$ be the set of unions of finite histories. So, a set $U \in \mathcal{U}$ is a union of finite histories such as $\omega^t$, where $\omega \in \Omega$ and $t$ is a natural number. Any arbitrarily complex set $E \subseteq \Omega$ can be approximated in terms of finite histories by choosing a set $U \in \mathcal{U}$ such that $E \subseteq U$.7

7For instance, the set $A^c$ of paths not following a pattern can be written as $A^c = \bigcap_n U_n$, where $(U_n)$ is a decreasing sequence in $\mathcal{U}$.

Definition 8 A coherent view of the world $P$ is pragmatically completely inductive in the sense of Hume if for any set $U \in \mathcal{U}$ such that $A \subseteq U$,

$$P(U \mid \omega^{t}) \to 1 \quad \text{as } t \to \infty \text{ on every } \omega \text{ in } A$$

and for any set $V \in \mathcal{U}$ such that $A^c \subseteq V$,

$$P(V \mid \omega^{t}) \to 1 \quad \text{as } t \to \infty \text{ on } P\text{-almost every } \omega \text{ in } A^c.$$

Given the requirement for any set in $\mathcal{U}$ that contains $A$ or $A^c$, there is, in particular, the same requirement for sets arbitrarily close to $A$ or $A^c$. Sufficient pattern data leads to near certainty of finite histories associated with laws and sufficient non-pattern data leads to near certainty of finite histories associated with non-laws. Combining the two definitions yields,

Definition 9 A coherent view of the world $P$ is pragmatically completely inductive if it is pragmatically inductive and pragmatically completely inductive in the sense of Hume.

So, in particular, enough pattern data leads to a near infallible guide to a bounded future and enough non-pattern data leads to near certainty of future finite histories associated with non-laws.

The Pragmatic Induction Theorem Every coherent view of the world is pragmatically completely inductive.

Unlike the previous results, the Pragmatic Induction Theorem holds for all coherent views of the world. No matter how coherent beliefs are formed, they must express confidence that mechanical repetitions of Bayes' rule transform sufficiently numerous pattern data into a near infallible guide to a bounded future.
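The pattern-data half of the theorem can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses one particular σ-additive view (so it says nothing about merely finitely additive views), the set of laws is taken to be the periodic paths, and the prior weight 4^(-p) per periodic block of length p is a hypothetical choice, as are the function names `is_periodic`, `prob_history` and `predict`. It computes the conditional probability in (5) exactly, with rational arithmetic.

```python
from fractions import Fraction


def is_periodic(history, p):
    """True if the finite history is consistent with a law of period p."""
    return all(history[i] == history[i - p] for i in range(p, len(history)))


def prob_history(history):
    """Probability of a finite history under the illustrative prior.

    Assumed prior (not from the paper): every periodic law, described by a
    period p >= 1 and a block of p bits, has weight 4**(-p). The weights sum
    to 1 because there are 2**p blocks of length p.
    """
    t = len(history)
    if t == 0:
        return Fraction(1)
    # Laws with period p < t: the block is pinned down by the first p bits,
    # so at most one law of each such period fits the history.
    short = sum(Fraction(1, 4 ** p) for p in range(1, t) if is_periodic(history, p))
    # Laws with period p >= t: 2**(p - t) blocks extend the history, and
    # sum over p >= t of 2**(p - t) * 4**(-p) equals 2 / 4**t.
    tail = Fraction(2, 4 ** t)
    return short + tail


def predict(path_prefix, t, k):
    """P(omega^{t+k} | omega^t), with the path given by a long enough prefix."""
    return prob_history(path_prefix[: t + k]) / prob_history(path_prefix[:t])


# Alternating data 0, 1, 0, 1, ...: the five-step-ahead prediction approaches 1.
alternating = tuple(i % 2 for i in range(64))
for t in (2, 8, 32):
    print(t, float(predict(alternating, t, 5)))
```

The design choice of a closed-form tail keeps the computation exact even though the assumed set of laws is infinite; only the first t + k outcomes of the path are ever inspected.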
In the case of non-pattern data, provided that the data is sufficiently long, there must be confidence, approaching certainty, of an observable future associated with non-laws. This holds without any other assumption such as σ-additivity. Therefore, any coherent view of the world contains a seed of orgulity.

The concerns one may have about the orgulity of Bayesians may not go away, at least not completely, by abandoning σ-additivity. The Pragmatic Induction Theorem relies on multiple repetitions of Bayes' rule alone, hence it holds with or without σ-additivity. However, the extent to which this remaining form of orgulity is a difficulty for the Bayesian paradigm is a question beyond the scope of this paper. According to one viewpoint, the cases of successful inference that follow from the repetition of Bayes' rule can be seen as a desideratum that provides support to the Bayesian approach. According to a different viewpoint, the Pragmatic Induction Theorem can be seen as an expression of excessive confidence of the same paradigm. This paper does not resolve this fundamental tension but it helps to make precise the conditions under which orgulity holds.

While the Pragmatic Induction Theorem relies only on coherence and Bayes' rule, it is embedded in a standpoint that can be traced back to de Finetti. The key conceptual point advanced by de Finetti is that the Bayesian perspective on inference effectively solves the problem of induction. As he wrote in de Finetti (1970):8

8See chapters 11.1.5 and 11.2.1. For de Finetti's (1970) perspective on induction, see also de Finetti (1970b, 1972).

In the philosophical arena, the problem of induction, its meaning, use and justification, has given rise to endless controversy, which, in the absence of an appropriate probabilistic framework, has inevitably been fruitless, leaving the major issues unresolved. It seems to me that the question was correctly formulated by Hume [...]
In our formulation, the problem of induction is, in fact, no longer a problem: we have, in effect, solved it without mentioning it explicitly. Everything reduces to the notion of conditional probability [...]

In this sense, the Pragmatic Induction Theorem can be seen as a formalization of de Finetti's viewpoint on induction. However, to the best of our knowledge, de Finetti never made a distinction between the two basic inference problems (i.e., does Nature abide by laws, and if so which one?) and never examined these problems in a formal model. While pragmatism is the additional element necessary for the formalization of this viewpoint, there is a yet more basic contribution. de Finetti mostly wrote about induction in the context, as in de Finetti (1969), of exchangeable beliefs (i.e., beliefs such that the order in which different outcomes occur over time is irrelevant). Exchangeability not only rules out elementary laws such as “1 until period t and 0 afterwards,” it is also a critical assumption on the data and, hence, an assumption on how past and future must relate to each other. In contrast, in the Pragmatic Induction Theorem, confidence in limited, but successful, inductive inference about the future holds without assumptions on how the past and the future must relate to each other. The conclusions about the future depend on the data, but there is no restriction on the data generating process itself.

7 Extensions

This paper dealt with some inductive inference problems, but left others unexamined. Perhaps the most basic limitation in this paper is that the data-generating processes are deterministic. A natural extension could go as follows: The Blackwell-Dubins theorem extends Proposition 1 to stochastic data generating processes. Let's say that there are countably many (possibly stochastic) data generating processes $P_1, P_2, P_3, \ldots$ and Bob's belief (a prior over $\{P_1, P_2, P_3, \ldots\}$) assigns, ex-ante, strictly positive probability to each of them.
If all probabilities are σ-additive then Bob's predictions will eventually become indistinguishable from the data generating process, no matter which one it is.

In spite of the power of the Blackwell-Dubins theorem, new difficulties arise in the case of stochastic data generating processes. For example, if two processes are identical in all but the first period, then it may be impossible to empirically determine which process generates the data. This determination is not relevant for predicting the future after period 1 (see Lehrer and Smorodinsky (1996) and Acemoglu, Chernozhukov and Yildiz (2016) on this problem). Other difficulties may prove currently intractable. The Blackwell-Dubins theorem relies heavily on σ-additivity. For general coherent views, some conceptual advances and analytical methods for Bayesian learning were developed in Pomatto, Al-Najjar, and Sandroni (2014). With some effort, these techniques can be applied to prove a version of the Pragmatic Induction Theorem for stochastic data-generating processes. The Complete Humean Induction and the Separation Theorems are existence results and so still hold when the set of data generating processes is expanded. The main hurdle is the Structure Theorem. For a counterpart of that result, one must find an algebra on which induction is equivalent to σ-additivity when the data generating processes can be stochastic. This is a (very) difficult problem.

8 Appendix

8.1 Proof of the Separation Theorem

We now provide examples of views that are inductive in the sense of Hume, but not in the sense of Goodman, or are inductive in the sense of Goodman but not in the sense of Hume. Fix a σ-additive measure

$$P_\sigma = \sum_{\omega \in A} \beta_\omega \delta_\omega,$$

where each $\delta_\omega$ is the measure putting probability 1 on a path $\omega$ and $(\beta_\omega)$ are strictly positive weights such that $\sum_{\omega \in A} \beta_\omega = 1$. Being σ-additive, it is inductive by Proposition 1. We start with the following result.
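Before turning to the lemma, the inductive behaviour of a measure of the form Pσ can be checked on a small worked case. The sketch below is an illustration, not part of the proof. It assumes a hypothetical set of laws made of "Nature produces only 1" together with the laws "1 until period n and 0 afterwards", and hypothetical weights 1/2 and 2^-(n+2); the names `prob_ones`, `posterior_all_ones` and `posterior_switch` are introduced here only for the example. It computes the posterior probability of a law given its own data, Pσ({ω}|ωt), in exact arithmetic.

```python
from fractions import Fraction

# Hypothetical countable set of laws (an assumption made for illustration):
#   "all ones"     : Nature produces only 1, weight 1/2
#   "switch at n"  : n ones followed by zeros forever (n = 0, 1, 2, ...),
#                    weight 2**-(n + 2)
# The weights are strictly positive and sum to 1.
ALL_ONES_WEIGHT = Fraction(1, 2)


def switch_weight(n):
    """Weight beta of the law 'n ones, then zeros forever'."""
    return Fraction(1, 2 ** (n + 2))


def prob_ones(t):
    """P_sigma of the finite history made of t ones.

    The laws consistent with it are 'all ones' and 'switch at n' for n >= t;
    the tail sum of 2**-(n + 2) over n >= t equals 2**-(t + 1).
    """
    return ALL_ONES_WEIGHT + Fraction(1, 2 ** (t + 1))


def posterior_all_ones(t):
    """P_sigma({omega} | omega^t) for omega = 'all ones'."""
    return ALL_ONES_WEIGHT / prob_ones(t)


def posterior_switch(n, t):
    """P_sigma({omega} | omega^t) for omega = 'switch at n'."""
    if t > n:
        # A zero has been observed in period n + 1, so only this law fits.
        return Fraction(1)
    # Up to period n the data is a run of t ones, shared with other laws.
    return switch_weight(n) / prob_ones(t)


for t in (0, 1, 5, 10, 20):
    print(t, posterior_all_ones(t))
```

Under these assumptions the posterior of the law "all ones" after t ones equals 1/(1 + 2^-t), and the posterior of each switching law equals 1 as soon as its first 0 is observed, so the posterior of every law in the assumed set tends to 1 along its own path.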
Lemma 1 There exists a finitely additive probability $S$ satisfying the following two properties:

• $S(\omega^t) = P_\sigma(\omega^t)$ for every $\omega^t$;
• $S(A) = 0$.

So, any finite history has the same probability under $S$ as under $P_\sigma$. However, under $S$ almost every path will eventually cease to follow a pattern.

Proof of Lemma 1. Let $\mathcal{F}$ be the algebra generated by all finite histories. Consider the algebra $\mathcal{A}$ generated by $\mathcal{F}$ and the set $A$. As proved in Łoś and Marczewski (1949), a set $E \subseteq \Omega$ belongs to $\mathcal{A}$ if and only if it is of the form

$$E = (F_1 \cap A) \cup (F_2 \cap A^c)$$

where $F_1, F_2$ belong to $\mathcal{F}$. Let $M$ be defined as

$$M\big((F_1 \cap A) \cup (F_2 \cap A^c)\big) = P_\sigma(F_2)$$

for every set $(F_1 \cap A) \cup (F_2 \cap A^c)$ in $\mathcal{A}$. It can be easily verified that $M$ is a well defined probability measure on $\mathcal{A}$. Let $S$ be any measure extending $M$ from $\mathcal{A}$ to $\Sigma$ (see, for example, Łoś and Marczewski (1949) for a proof that such an extension exists). By construction, $S$ satisfies the desired properties.

The mixture $Q = \frac{1}{2} P_\sigma + \frac{1}{2} S$ satisfies assumptions 1 and 2. It is inductive in the sense of Goodman but not in the sense of Hume. The intuition for why $Q$ is inductive in the sense of Goodman is as follows: when conditioning on $A$ the measure $Q$ reduces to the σ-additive measure $P_\sigma$, which is inductive. Formally, because $S(A) = 0$, for every $\omega \in A$ we have

$$Q(\{\omega\} \mid A \cap \omega^t) = \frac{\frac{1}{2} P_\sigma(\{\omega\} \cap A)}{\frac{1}{2} P_\sigma(A \cap \omega^t) + \frac{1}{2} S(A \cap \omega^t)} = P_\sigma(\{\omega\} \mid \omega^t)$$

for each $t$. The measure $P_\sigma$ is σ-additive hence inductive, so $P_\sigma(\{\omega\} \mid \omega^t)$ converges to 1 for every $\omega \in A$. Hence, $Q$ is inductive in the sense of Goodman. To see that it is not inductive in the sense of Hume, notice that for every $\omega \in A$, we have

$$Q(A \mid \omega^t) = \frac{P_\sigma(A \cap \omega^t) + S(A \cap \omega^t)}{P_\sigma(\omega^t) + S(\omega^t)} = \frac{P_\sigma(A \cap \omega^t)}{2 P_\sigma(\omega^t)} = \frac{1}{2}.$$

Hence, $Q(A \mid \omega^t)$ remains equal to $\frac{1}{2}$ no matter how large $t$ is. So, $Q$ is not inductive in the sense of Hume.

We now construct an example of a measure inductive in the sense of Hume but not in the sense of Goodman.
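As implied by assumption 3, we can fix a path $\bar\omega \in A$ with the property that for every $t$ we can find another path $\bar\omega_t \in A$ distinct from $\bar\omega$ such that $\bar\omega_t^{\,t} = \bar\omega^t$ (so $\bar\omega_t$ and $\bar\omega$ coincide on the first $t$ outcomes but differ on some later outcome). As is well known, there exist finitely additive probability measures that assign probability 0 to each single path but probability 1 to the whole set $\{\bar\omega_1, \bar\omega_2, \ldots\}$ (see, for example, Rao and Rao 1983). Let $R$ be such a measure. We consider the mixture

$$P = \frac{1}{2} P_\sigma + \frac{1}{2} R.$$

It satisfies assumptions 1 and 2. In addition,

$$P(A \mid \omega^t) = \frac{P_\sigma(A \cap \omega^t) + R(A \cap \omega^t)}{P_\sigma(\omega^t) + R(\omega^t)} = 1$$

since $P_\sigma(A) = R(A) = 1$. To see that $P$ is not inductive in the sense of Goodman consider the finite history $\bar\omega^t$. Bayes' rule implies

$$P(\{\bar\omega\} \mid A \cap \bar\omega^t) = \frac{P_\sigma(\{\bar\omega\})}{P_\sigma(\bar\omega^t) + R(\bar\omega^t)}.$$

By definition the measure $R$ assigns probability 0 to every finite set of paths. Hence $R(\{\bar\omega_k : k \ge 1, \bar\omega_k \in \bar\omega^t\}) = R(\{\bar\omega_k : k \ge 1\})$ for every $t$, so that $R(\bar\omega^t) = 1$. Therefore

$$P(\{\bar\omega\} \mid A \cap \bar\omega^t) = \frac{P_\sigma(\{\bar\omega\})}{P_\sigma(\bar\omega^t) + 1}.$$

As $t \to \infty$, σ-additivity implies that $P_\sigma(\bar\omega^t)$ converges to $P_\sigma(\{\bar\omega\})$, so $P(\{\bar\omega\} \mid A \cap \bar\omega^t)$ converges to $P_\sigma(\{\bar\omega\}) / (P_\sigma(\{\bar\omega\}) + 1)$, which is at most $\frac{1}{2}$. Hence, $P$ is inductive in the sense of Hume but not in the sense of Goodman.

8.2 Proof of the Complete Humean Induction Theorem

The proof follows the same argument as the second part of the proof of the Separation Theorem. Let $P_\sigma$ and $R$ be defined as in the above proof, and let $\tilde\omega$ be a path such that $\tilde\omega \notin A$ and $\tilde\omega^1 \neq \bar\omega^1$ (since $A$ is countable, such a path exists). Consider the mixture

$$P = \frac{1}{3} P_\sigma + \frac{1}{3} R + \frac{1}{3} \delta_{\tilde\omega}.$$

As shown above, we have $P(A \mid \omega^t) \to 1$ as $t \to \infty$ for every path $\omega \in A$. Given the path $\tilde\omega$, we have that for every $t > 1$,

$$P(A^c \mid \tilde\omega^t) = \frac{P_\sigma(A^c \cap \tilde\omega^t) + R(A^c \cap \tilde\omega^t) + 1}{P_\sigma(\tilde\omega^t) + R(\tilde\omega^t) + 1}.$$

Since $\tilde\omega^t \neq \bar\omega^t$, we have $R(\tilde\omega^t) = R(\{\bar\omega_k : \bar\omega_k \in \tilde\omega^t\}) = 0$. Therefore

$$P(A^c \mid \tilde\omega^t) = \frac{P_\sigma(A^c \cap \tilde\omega^t) + 1}{P_\sigma(\tilde\omega^t) + 1}.$$

Since $\tilde\omega \notin A$, we have $P_\sigma(\tilde\omega^t) \to 0$, so $P(A^c \mid \tilde\omega^t) \to 1$. Therefore, $P$ is completely inductive in the sense of Hume.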
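To see that $P$ is not σ-additive, notice that for every $n$, we have

$$P(\{\bar\omega_k : k \ge n\}) = \frac{1}{3} \sum_{k \ge n} P_\sigma(\{\bar\omega_k\}) + \frac{1}{3} R(\{\bar\omega_k : k \ge n\}).$$

Because $R$ assigns probability 0 to every finite set of paths, we have $R(\{\bar\omega_k : k \ge n\}) = 1$ for every $n$. Hence, $P(\{\bar\omega_k : k \ge n\}) \ge \frac{1}{3}$ for every $n$, even though $\bigcap_n \{\bar\omega_k : k \ge n\} = \emptyset$. Hence $P$ is not σ-additive.

8.3 Proof of the Structure Theorem

We denote by $\mathcal{F}$ the algebra generated by all finite histories. Hence $\mathcal{F} \subseteq \bar\Sigma \subseteq \Sigma$. A result related to the next lemma appears in Al-Najjar, Pomatto and Sandroni (2014).

Lemma 2 A set $E$ belongs to $\bar\Sigma$ if and only if there exists a set $F$ belonging to $\mathcal{F}$ such that the symmetric difference $E \triangle F$ is finite and included in $A$.

Proof. Let $\mathcal{E}$ be the collection of sets $E$ for which there exists a set $F \in \mathcal{F}$ such that the symmetric difference $E \triangle F$ is finite and included in $A$. We prove that $\mathcal{E} \subseteq \bar\Sigma$. Let $E$ and $F \in \mathcal{F}$ be such that $E \triangle F$ is finite and included in $A$. Because $E \setminus F$ is finite and included in $A$ and $\bar\Sigma$ is an algebra containing each singleton $\{\omega\}$ for paths in $A$, we have $F \cup (E \setminus F) \in \bar\Sigma$. Similarly, $F \setminus E \in \bar\Sigma$ and so $E = (F \cup (E \setminus F)) \setminus (F \setminus E) \in \bar\Sigma$.

We now show that $\bar\Sigma \subseteq \mathcal{E}$. It follows from the definition that $\mathcal{E}$ satisfies $\mathcal{F} \subseteq \mathcal{E}$ and $\{\omega\} \in \mathcal{E}$ for each $\omega \in A$. We now prove that $\mathcal{E}$ is an algebra. Let $E \in \mathcal{E}$ be such that $E \triangle F$ is finite and included in $A$ for some $F \in \mathcal{F}$. Because $E^c \triangle F^c = E \triangle F$ and $F^c \in \mathcal{F}$, it follows that $E^c \in \mathcal{E}$. Now let $E_1, E_2 \in \mathcal{E}$, and fix $F_1, F_2 \in \mathcal{F}$ such that $E_1 \triangle F_1$ and $E_2 \triangle F_2$ are finite and included in $A$. Let $E = E_1 \cup E_2$ and $F = F_1 \cup F_2$. Then $E \triangle F \subseteq (E_1 \triangle F_1) \cup (E_2 \triangle F_2)$. Hence $E \triangle F$ is finite and satisfies $E \triangle F \subseteq A$. Thus, $\mathcal{E}$ is closed under union and complementation. Therefore, $\mathcal{E}$ is an algebra. Since $\bar\Sigma$ is the smallest algebra containing all finite histories and all singletons $\{\omega\}$ for $\omega \in A$, it follows that $\bar\Sigma \subseteq \mathcal{E}$. Thus $\bar\Sigma = \mathcal{E}$.

We can now proceed with the proof. Let $P$ be σ-additive. As shown in, for instance, Shiryaev (1996) (page 134), σ-additivity implies that $P$ must satisfy $P(\omega^t) \to P(\{\omega\})$ as $t \to \infty$, for every $\omega \in A$. Therefore, $P(\{\omega\} \mid \omega^t) = P(\{\omega\}) / P(\omega^t) \to 1$ whenever $P(\{\omega\}) > 0$.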
So, by Remark 2, $P$ is inductive in the sense of Hume and in the sense of Goodman. Conversely, suppose $P$ is inductive in both senses. We now show it is σ-additive on $\bar\Sigma$. Let $\mu$ be the restriction of $P$ to $\mathcal{F}$. The measure $\mu$ is σ-additive on $\mathcal{F}$ (see the discussion in Example 10.4.2 in Rao and Rao (1983)). So, by Carathéodory's theorem it admits a σ-additive extension $P_\mu$ on the σ-algebra generated by $\mathcal{F}$. In order to show that $P$ is σ-additive (on $\bar\Sigma$) we prove that $P_\mu(E) = P(E)$ for every $E \in \bar\Sigma$. Let $E \in \bar\Sigma$ and choose a set $F \in \mathcal{F}$ such that $E \triangle F$ is finite and included in $A$. By additivity, any measure $Q$ satisfies

$$Q(E) = Q(F) + \sum_{\omega \in E \setminus F} Q(\{\omega\}) - \sum_{\omega \in F \setminus E} Q(\{\omega\}). \tag{6}$$

By construction, $P_\mu$ and $P$ coincide on $\mathcal{F}$. Hence $P(F) = P_\mu(F)$. Since $P$ is inductive, for every $\omega \in A$, by Remark 2 it satisfies $P(\{\omega\} \mid \omega^t) = P(\{\omega\}) / P(\omega^t) \to 1$, i.e., $P(\{\omega\}) = \lim_t P(\omega^t)$. The σ-additivity of $P_\mu$ and the fact that $P$ and $P_\mu$ coincide on $\mathcal{F}$ imply

$$P_\mu(\{\omega\}) = \lim_t P_\mu(\omega^t) = \lim_t P(\omega^t) = P(\{\omega\})$$

for every $\omega \in A$. In particular, this holds for every $\omega \in E \triangle F$. We can therefore conclude from (6) that

$$\begin{aligned} P_\mu(E) &= P_\mu(F) + \sum_{\omega \in E \setminus F} P_\mu(\{\omega\}) - \sum_{\omega \in F \setminus E} P_\mu(\{\omega\}) \\ &= P(F) + \sum_{\omega \in E \setminus F} P(\{\omega\}) - \sum_{\omega \in F \setminus E} P(\{\omega\}) \\ &= P(E). \end{aligned}$$

Because $E$ is arbitrary, it then follows that $P$ and $P_\mu$ coincide on $\bar\Sigma$. Hence $P$ is σ-additive on $\bar\Sigma$.

8.4 Proof of the Pragmatic Induction Theorem

Endow $\Omega$ with the product topology, and let $\mathcal{B}$ be the Borel σ-algebra it generates. Let $\mathcal{F}$ be, as before, the algebra generated by all finite histories. Given any coherent view of the world $P$ (satisfying, as usual, assumptions 1 and 2) consider the restriction $\mu$ of $P$ to $\mathcal{F}$. Following the proof of the Structure Theorem, the measure $\mu$ admits a σ-additive extension $P_\sigma$ on $\mathcal{B}$. We now show that $P$ is pragmatically inductive. For each $\omega \in A$ we have $P_\sigma(\{\omega\}) > 0$. To see this, notice that σ-additivity implies $P_\sigma(\{\omega\}) = \lim_t P_\sigma(\omega^t)$. For each $t$, we have $P_\sigma(\omega^t) = P(\omega^t) \ge P(\{\omega\}) > 0$. Hence $P_\sigma(\{\omega\}) \ge P(\{\omega\}) > 0$. Therefore, by σ-additivity, $P_\sigma(\{\omega\} \mid \omega^t) \to 1$ as $t \to \infty$.
Since $P_\sigma(\omega^{t+k} \mid \omega^t) \ge P_\sigma(\{\omega\} \mid \omega^t)$, we conclude that $P_\sigma(\omega^{t+k} \mid \omega^t) \to 1$ as $t \to \infty$. Because $P_\sigma(\omega^{t+k} \mid \omega^t) = P(\omega^{t+k} \mid \omega^t)$ for every $t$, we conclude that $P$ is pragmatically inductive.

The result that $P$ is pragmatically completely inductive in the sense of Hume can be proved as a consequence of the following general principle: for every set $U \in \mathcal{U}$ and every history $\omega^t$, we have

$$P(U \mid \omega^t) \ge P_\sigma(U \mid \omega^t).$$

We now prove this claim. The collection $\mathcal{U}$ of unions of finite histories forms a base for the topology. Since the product topology is separable, each $U \in \mathcal{U}$ can be written as $U = \bigcup_{n=1}^{\infty} h_n$ where each $h_n$ is a finite history. For each $m$, we have that $\bigcup_{n=1}^{m} h_n$ belongs to $\mathcal{F}$, hence

$$P(U) \ge P\Big(\bigcup_{n=1}^{m} h_n\Big) = P_\sigma\Big(\bigcup_{n=1}^{m} h_n\Big).$$

Since $\bigcup_{n=1}^{m} h_n \uparrow U$ as $m \to \infty$, σ-additivity implies $P_\sigma(\bigcup_{n=1}^{m} h_n) \uparrow P_\sigma(U)$ as $m \to \infty$. Therefore $P(U) \ge P_\sigma(U)$. For each $t$ and path $\omega$, the set $U \cap \omega^t$ is open, and the same argument as above implies that $P(U \cap \omega^t) \ge P_\sigma(U \cap \omega^t)$. Because $P_\sigma$ and $P$ coincide on $\mathcal{F}$, we also have $P(\omega^t) = P_\sigma(\omega^t)$. Hence $P(U \mid \omega^t) \ge P_\sigma(U \mid \omega^t)$, as claimed.

Because $P_\sigma$ is σ-additive, it is completely inductive in the sense of Hume. So, if $A \subseteq U$ and $A^c \subseteq V$ then $P_\sigma(U \mid \omega^t) \to 1$ for every $\omega \in A$ and $P_\sigma(V \mid \omega^t) \to 1$ for $P$-almost every path $\omega \in A^c$. Since $P(U \mid \omega^t) \ge P_\sigma(U \mid \omega^t)$ and $P(V \mid \omega^t) \ge P_\sigma(V \mid \omega^t)$, it then follows that $P$ is pragmatically completely inductive in the sense of Hume.

8.5 Proof of other results in the text

Proof of Remark 1. The proof of this result is standard, and included only for the sake of completeness. Let $D = \{\omega : P(\{\omega\}) > 0\}$ be the set of paths to which $P$ attaches strictly positive probability. The additivity of $P$ implies that for each positive integer $k$, the set $D_k = \{\omega : P(\{\omega\}) > k^{-1}\}$ must be finite. Hence $D = \bigcup_{k=1}^{\infty} D_k$ is countable.

Proof of Remark 2. Assumptions 1, 2 and 3 imply that for each $\omega$ and $t$, the conditional probabilities $P(\cdot \mid \omega^t)$ and $P(\cdot \mid \omega^t \cap A)$ are well defined.
In addition, by the law of total probability, for each $\omega \in A$ we have

$$P(\{\omega\} \mid \omega^t) = P(\{\omega\} \mid \omega^t \cap A)\, P(A \mid \omega^t).$$

Hence, as $t \to \infty$, it follows that $P(\{\omega\} \mid \omega^t) \to 1$ if and only if $P(\{\omega\} \mid \omega^t \cap A)\, P(A \mid \omega^t) \to 1$. That is, if and only if $P(\{\omega\} \mid \omega^t \cap A) \to 1$ and $P(A \mid \omega^t) \to 1$.

Proof of Proposition 3. Let $\varepsilon \in (0, 1)$ and let $P_\sigma$ be a σ-additive measure that satisfies assumptions 1-3. Using Lemma 1, let $S$ be a probability measure that satisfies $S(\omega^t) = P_\sigma(\omega^t)$ for every history, but $S(A) = 0$. Let $P = \varepsilon P_\sigma + (1 - \varepsilon) S$. Then, for every $\omega \in A$ and every $t$, we have

$$P(A \mid \omega^t) = \frac{\varepsilon P_\sigma(A \cap \omega^t) + (1 - \varepsilon) S(A \cap \omega^t)}{P_\sigma(\omega^t)} = \frac{\varepsilon P_\sigma(A \cap \omega^t)}{P_\sigma(\omega^t)} \le \varepsilon.$$

References

[1] Acemoglu, D., Chernozhukov, V. and M. Yildiz (2016): “Fragility of Asymptotic Agreement under Bayesian Learning.” Theoretical Economics, 11, 187-227.
[2] Al-Najjar, N., Pomatto, L., and A. Sandroni (2014): “Claim Validation.” The American Economic Review, 104(11), 3725-3736.
[3] Belot, G. (2015): “Bayesian orgulity.” Philosophy of Science, 80(4), 483-503.
[4] Belot, G. (2015): “Objectivity and Bias.” Mind, forthcoming.
[5] Blackwell, D. and L. Dubins (1962): “Merging of opinions with increasing information.” The Annals of Mathematical Statistics, 33(3), 882-886.
[6] Broad, C. D. (1918): “On the relation between induction and probability - (Part I.).” Mind, 27-4, 389-404.
[7] Doob, J. L. (1949): “Application of the theory of martingales.” Le calcul des probabilites et ses applications, 23-27.
[8] de Finetti, B. (1969): “Initial probabilities: A prerequisite for any valid induction.” Synthese, 20(1), 2-16.
[9] de Finetti, B. (1970): Theory of Probability, vol. 2. Wiley, New York.
[10] de Finetti, B. (1972): Probability, Induction, and Statistics. Wiley, New York.
[11] Edgeworth, F. Y. (1922): “The philosophy of chance.” Mind, 31-123, 257-283.
[12] Elga, A. (2015): “Bayesian humility.” Philosophy of Science, forthcoming.
[13] Gilboa, I., and L. Samuelson (2012): “Subjectivity in inductive inference.” Theoretical Economics, 7(2), 183-215.
[14] Goodman, N. (1955): Fact, Fiction and Forecast. Harvard University Press, Cambridge.
[15] Howson, C. and P. Urbach (2006): Scientific reasoning: The Bayesian approach.
[16] Huttegger, S. (2015): “Bayesian convergence to the truth and the metaphysics of possible worlds.” Philosophy of Science, 82(4), 587-601.
[17] Kelly, K. T., and C. Juhl (1994): “Reliability, convergence, and additivity.” PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 181-189.
[18] Kelly, K. T. (1996): Logic of Reliable Inquiry. Oxford University Press.
[19] Lehrer, E., and R. Smorodinsky (1996): “Merging and learning.” Statistics, Probability and Game Theory, 147-168.
[20] Lévy, P. (1937): Théorie de l'Addition des Variables Aléatoires. Gauthier-Villars, Paris.
[21] Łoś, J. and E. Marczewski (1949): “Extensions of Measures.” Fundamenta Mathematicae, 1(36), 267-276.
[22] Pomatto, L., N. Al-Najjar, and A. Sandroni (2014): “Merging and testing opinions.” The Annals of Statistics, 42(3), 1003-1028.
[23] Rao, K.P.S. and M. Rao (1983): Theory of Charges. Academic Press, New York.
[24] Russell, B. (1912): The Problems of Philosophy. Oxford.
[25] Shiryaev, A. N. (1996): Probability. Springer-Verlag, New York.
[26] Stalker, D. F. (1994): Grue!: the New Riddle of induction. Open Court Publishing Company.
[27] Weatherson, B. (2015): “For Bayesians, rational modesty requires imprecision.” Ergo, an Open Access Journal of Philosophy, 2.
[28] Wrinch, D., and H. Jeffreys: “On some aspects of the theory of probability.” Philosophical Magazine, 38-228, 715-731.
[29] Zabell, S. L. (2011): “Carnap and the logic of inductive inference.” Handbook of the history of logic, 265-309.