An Objective Justification of Bayesianism I: Measuring Inaccuracy*

Hannes Leitgeb and Richard Pettigrew†‡

Philosophy of Science, 77 (April 2010) pp. 201–235. 0031-8248/2010/7702-0002$10.00 Copyright 2010 by the Philosophy of Science Association. All rights reserved.

In this article and its sequel, we derive Bayesianism from the following norm: Accuracy: an agent ought to minimize the inaccuracy of her partial beliefs. In this article, we make this norm mathematically precise. We describe epistemic dilemmas an agent might face if she attempts to follow Accuracy and show that the only measures of inaccuracy that do not create these dilemmas are the quadratic inaccuracy measures. In the sequel, we derive Bayesianism from Accuracy and show that Jeffrey Conditionalization violates Accuracy unless Rigidity is assumed. We describe the alternative updating rule that Accuracy mandates in the absence of Rigidity.

1. Introduction. One of the fundamental problems of epistemology is to say when the evidence in an agent’s possession justifies the beliefs she holds and, when it does, how it does this and to what extent. In this article and its sequel (Leitgeb and Pettigrew 2010), we defend the Bayesian solution to this problem for those cases in which the set of possible worlds about which the agent holds an opinion is finite. If W is such a set of possible worlds, let P(W) be the power set of W, and let Bel(W) be the set of functions \(b : P(W) \to \mathbb{R}^+_0\). We regard each function in Bel(W) as a (potential) belief function on the power set of W. Indeed, one of the distinctive presuppositions of Bayesianism is that, if W is the set of possible worlds about which an agent holds an opinion, then that agent’s epistemic

*Received July 2009; revised September 2009.
†To contact the authors, please write to: Department of Philosophy, University of Bristol, 9 Woodland Road, Bristol BS8 1BX, United Kingdom; e-mail: hannes.leitgeb@bris.ac.uk; richard.pettigrew@bris.ac.uk.

‡We would like to thank F. Arntzenius, F. Dietrich, K. Easwaran, B. Fitelson (and his Berkeley reading group), A. Hájek, L. Horsten, F. Huber, J. Joyce, T. Kuipers, W. Myrvold, S. Okasha, G. Schurz, T. Seidenfeld, R. Williams, J. Williamson, and B. van Fraassen for their comments on earlier versions of this article. Hannes Leitgeb would like to thank the Leverhulme Trust and the Alexander von Humboldt Foundation for their generous support of this work. Richard Pettigrew would like to thank the British Academy with whom he was a postdoctoral fellow during work on this article.

state at a given time t may be represented by a belief function \(b_t \in \mathrm{Bel}(W)\) that takes each proposition A, represented as a subset of W, to a real number \(b_t(A)\) that measures the degree of credence the agent assigns to A. Thus, to solve the fundamental problem of epistemology, the Bayesian must say, for a given body of evidence, which belief functions it would be rational for an agent to have at a time when she is in possession of that evidence.

At the core of Bayesianism lie two claims that go some way to solving the fundamental problem of epistemology as set out above: they are Probabilism and Conditionalization. Probabilism is a synchronic norm: that is, it concerns the intrinsic properties of an agent’s belief function at particular moments in her epistemic life. In particular, it demands that an agent’s belief function should be a probability function at any such moment. Conditionalization, however, is a diachronic norm: that is, it concerns the relation between an agent’s belief functions at different times.
In particular, it demands that an agent who learns the truth of a proposition E between times t and t′ (and nothing stronger) ought to update her belief function by conditionalizing on E.1

These central claims have been extended in various ways by various philosophers. Two such extensions will be of particular interest to us. First, the synchronic claim that characterizes (a version of) Objectivist Bayesianism: if an agent has opinions only about finitely many possible worlds and if E is the strongest proposition given to her by her evidence so far, then her belief function ought to be the uniform probability distribution over the worlds in E: we will call this Uniform Distribution. And, second, Richard Jeffrey proposed a diachronic norm, which we will call Jeffrey Conditionalization, that is meant to cover those instances of updating in which the evidence learned does not come in the form of a proposition learned with certainty, as in Conditionalization, but rather in the form of some weaker side constraints on the agent’s belief function.

In this article and in its sequel (Leitgeb and Pettigrew 2010), we defend the two core tenets of Bayesianism (and, to a much lesser extent, the additional Objectivist Bayesian tenet) by appealing to the following fundamental norm:

Accuracy. An epistemic agent ought to approximate the truth. In other words, she ought to minimize her inaccuracy.2

1. As Jon Williamson reminded us, Conditionalization must be qualified: it holds only when plain factual evidence about the world is involved; cf. Williamson, forthcoming.

2. Why not maximize your accuracy instead? Because we like to think of inaccuracy as being given by a distance: the lesser the distance from the truth, the lesser the inaccuracy; the greater the distance from the truth, the greater the inaccuracy.
Since distances from the truth are bounded from below, that is, by the zero distance, they can be minimized,

Also, we use this norm to criticize Jeffrey’s updating rule, and we defend an alternative to Jeffrey Conditionalization that applies to the same type of situations. We consider all of this to be an objective manner of justifying Bayesianism, which is based on theoretical considerations on how to get to the truth rather than on practical considerations on how to make prudent decisions. Indeed, for us an agent’s degree of belief in a proposition A is such that the agent ought to minimize its distance from the truth value of A; for all epistemological purposes, this feature is in fact constitutive of the notion of degree of belief. Although the chosen type of justification is objective in this sense, it should be kept in mind that what gets justified in this way is still mainly just standard subjective Bayesianism.

We agree with Jim Joyce (1998, 2009) that the relevant notion of accuracy here is what he calls gradational accuracy, wherein gradational accuracy depends only on the truth values of propositions at worlds and on the agent’s belief function.3 We will make this feature of accuracy more precise in the Local Normality and Dominance postulate in section 5, when we will give the notion a mathematical analysis, as promised. The quantitative notion of accuracy that interests us differs from Popper’s (1972) comparative concept of verisimilitude, according to which some sets of statements are closer to the truth than other sets of statements. This concept was proven inadequate by Miller’s (1974) and Tichý’s (1974) triviality results.4

Given this understanding of the notion of accuracy, it is the purpose of this article to make the Accuracy norm precise. In the sequel, we will then investigate the consequences of this norm.

2. The Basic Concepts and the Argument in Brief.
Our argument is long, and it involves a number of distinctions. Thus, for the sake of clarity, we present an overview of its underlying concepts and its structure before we give it in a fully detailed form.5

and that is what we are asking for in Accuracy. Using some means of transformation, these properties might perhaps be captured just as well by accuracy or ‘inverse distance’, but employing the notion of inaccuracy directly seems to give us a much more appealing way of stating our central epistemic goal.

3. Indeed, Joyce’s original article has been a major inspiration for the project in this article and its sequel. We discuss Joyce’s own account in detail in the sequel of this article.

4. In the meantime, refined theories of verisimilitude have been introduced that do not suffer from triviality results; some of them do resemble our theory in important respects. In particular, Niiniluoto’s (1987) theory of estimated truthlikeness defines a relational notion of truthlikeness in terms of an expected value of quantitative truthlikeness that bears some similarity to the expected inaccuracy of propositions that we will be interested in. We leave it to another paper to work out the details of this correspondence that was pointed out to us by Theo Kuipers.

Figure 1.

We begin by drawing a distinction between local and global measures of inaccuracy. A local inaccuracy measure is a mathematical function that takes a proposition \(A \subseteq W\), a world \(w \in W\), and a nonnegative real number x and gives a measure \(I(A, w, x)\) of the local inaccuracy of having degree of credence x in proposition A at world w. So \(I(A, w, x)\) measures the distance of x from the truth value \(\chi_A(w)\) of A at w, where the truth values are represented by the real numbers 0 and 1, as shown in figure 1. Intuitively, \(I(A, w, x)\) will be greater, the more x differs from the truth value of A in w.
However, a global inaccuracy measure is a mathematical function that takes a belief function b and a world w and gives a measure \(G(w, b)\) of the global inaccuracy of having belief function b at world w. So \(G(w, b)\) measures the distance of b from the world w, where both belief function and world will be represented geometrically in terms of vectors, as we will explain in detail in section 3.2 and as shown in figure 2. Again intuitively, \(G(w, b)\) will be greater, the more the degree of belief assignment that is determined by b differs from the truth value assignment that is determined by w.

Obviously, not any such function I or G will do; rather, we will have to restrict ourselves to sensible or legitimate choices of such functions, which will be achieved later by formulating postulates on what such legitimate Is or Gs are like.

Thus, our first attempt to make Accuracy precise results in the norm bifurcating:

Accuracy (Local). An agent ought to minimize the local inaccuracy of her degrees of credence in all propositions \(A \subseteq W\) relative to a legitimate measure of local inaccuracy.

Accuracy (Global). An agent ought to minimize the global inaccuracy of her belief function relative to a legitimate measure of global inaccuracy.

5. Bas van Fraassen’s comments were instrumental in making this section clearer.

Figure 2.

Note that, while we will be interested throughout in versions of both of these norms, Joyce (1998, 2009) is concerned solely with a version of Accuracy (Global).

Thus, we have two norms to which Accuracy gives rise. However, as formulated above, the norms are still incomplete: both local and global inaccuracy are only defined relative to a world, but we have not specified yet relative to which world or set of worlds the inaccuracies in question are to be calculated.
The obvious answer at this point would seem to be “relative to the actual world.” But from an internalist point of view on justification, which we are going to adopt, as we will explain at greater length below (sec. 3.3), this will not do, since we should not presuppose that the agent knows which world w in W is the actual world. Instead, the agent should take into account inaccuracies with respect to all and only the worlds that are epistemically possible for her; if this set is taken to be epistemically accessible to her, then we do not violate internalism about justification by demanding that she assesses her overall inaccuracy in terms of it. Hence, we focus on the measures of expected local and global inaccuracy to which any pair of legitimate local and global inaccuracy measures will give rise, and we evaluate these expected inaccuracy measures over the set E of epistemically possible worlds.

The expected local inaccuracy of a degree of credence is defined, as one would expect, as the sum of its inaccuracies at various worlds weighted by the degree of belief assigned to each of those worlds (or rather to their singleton sets). Thus, to determine the expected local inaccuracy of a degree of credence x in a proposition A, we must specify three parameters:

i) The belief function that gives the degree of belief assigned to each of the worlds over which the sum is taken. That is, the belief function we use to weight the inaccuracies that we sum to give the expected local inaccuracy.
ii) The set of worlds over which the sum is taken. This is the set of worlds that are epistemically possible for the agent.
iii) The local inaccuracy measure that gives the inaccuracies of the degree of credence at the various worlds over which the sum is taken.

Thus, we have the following definition:

Definition 1 (Expected local inaccuracy).
Given a local inaccuracy measure I, a belief function b, a degree of credence x, and propositions \(A, E \subseteq W\), we define the expected local inaccuracy of x in proposition A by the lights of b, with respect to I, and over the set E of epistemically possible worlds as follows:

\[ \mathrm{LExp}_b(I, A, E, x) = \sum_{w \in E} b(\{w\})\, I(A, w, x). \]

Expected global inaccuracy requires us to fix the same parameters. It is defined as follows:

Definition 2 (Expected global inaccuracy). Given a global inaccuracy measure G, belief functions b and b′, and a proposition \(E \subseteq W\), we define the expected global inaccuracy of b′ by the lights of b, with respect to G, and over the set E of epistemically possible worlds as follows:

\[ \mathrm{GExp}_b(G, E, b') = \sum_{w \in E} b(\{w\})\, G(w, b'). \]

Thus, our second attempt to make Accuracy precise gives:

Accuracy (Expected local). An agent ought to minimize the expected local inaccuracy of her degrees of credence in all propositions \(A \subseteq W\) relative to a legitimate measure of local inaccuracy.

Accuracy (Expected global). An agent ought to minimize the expected global inaccuracy of her belief function relative to a legitimate measure of global inaccuracy.

However, neither of these proposals is fully specified in its current form either, for we have not said by the lights of which belief function an agent ought to assess her expected local or global inaccuracy or over which set of epistemically possible worlds.

Specifying the belief function by the lights of which we assess expected local or global inaccuracy leads us to a further distinction: the distinction between synchronic and diachronic versions of both Accuracy (Expected local) and Accuracy (Expected global). Here are the synchronic versions of Accuracy (Expected local) and Accuracy (Expected global):

Accuracy (Synchronic expected local).
An agent ought to minimize the expected local inaccuracy of her degrees of credence in all propositions \(A \subseteq W\) by the lights of her current belief function, relative to a legitimate local inaccuracy measure and over the set of worlds that are currently epistemically possible for her.

Accuracy (Synchronic expected global). An agent ought to minimize the expected global inaccuracy of her current belief function by the lights of her current belief function, relative to a legitimate global inaccuracy measure and over the set of worlds that are currently epistemically possible for her.

And here are the diachronic versions of Accuracy (Expected local) and Accuracy (Expected global), where an agent has learned evidence between time t and time t′ that imposes constraints C on her belief function \(b_{t'}\) at time t′ or on the set E of worlds that are epistemically possible for her at t′ or both:

Accuracy (Diachronic expected local). At time t′, such an agent ought to have a belief function that satisfies constraints C and is minimal among belief functions thus constrained with respect to the expected local inaccuracy of the degrees of credence it assigns to each proposition \(A \subseteq W\) by the lights of her belief function at time t, relative to a legitimate local inaccuracy measure and over the set of worlds that are epistemically possible for her at time t′ given the constraints C.

Accuracy (Diachronic expected global). At time t′, such an agent ought to have a belief function that satisfies constraints C and is minimal among belief functions thus constrained with respect to expected global inaccuracy by the lights of her belief function at time t, relative to a legitimate global inaccuracy measure and over the set of worlds that are epistemically possible for her at time t′ given the constraints C.

These are very nearly our final versions of Accuracy. All that remains is to specify the legitimate local and global inaccuracy measures.
This is the work of section 5, where we argue for the main thesis of this article: the only legitimate local and global inaccuracy measures are quadratic inaccuracy measures, which are known as Brier scores in the literature on scoring rules (Brier 1950). We call these characterizations Local Inaccuracy Measures and Global Inaccuracy Measures, respectively. By substituting them into the appropriate norms above, we obtain the four mathematically precise versions of Accuracy that are the aim of this article. In Leitgeb and Pettigrew (2010), we investigate the consequences of these norms.

Of course, as with any set of norms, it may turn out that some or all of these norms are simply unsatisfiable. Before investigation, there is no reason to think that there are such minimally inaccurate belief functions in the senses required by these norms. However, as we will show in the sequel to this article, such a situation arises only for certain instances of Accuracy (Diachronic expected local); for each of the other norms and indeed for many instances of this norm, there are belief functions that satisfy them, and, moreover, they are the belief functions that the Bayesian demands. Furthermore, in the situations in which Accuracy (Diachronic expected local) cannot be satisfied, its global analogue Accuracy (Diachronic expected global) can be. Thus, our approach does make a demand in these situations. We will discuss this further in the second article.

Note that \(\mathrm{LExp}_b(I, A, E, x)\) and \(\mathrm{GExp}_b(G, E, b')\) will always be zero if the global belief function b is identical to the constant zero function on singletons of members of E. So if an agent has ruled out all worlds in E by means of b, any degree of belief whatsoever may be assigned to A in order to minimize expected local inaccuracy, and any belief function b′ may be chosen to minimize global inaccuracy.
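The two expected-inaccuracy definitions and the remark about the zero function can be made concrete with a small sketch. It assumes, purely for illustration, unscaled quadratic measures: a local measure equal to the squared difference between x and the truth value of A at w, and a global measure equal to the squared Euclidean distance between the world (as a unit vector) and the belief function restricted to singletons. The function names, the dictionary representation of b on singletons, and the three-world example are all hypothetical and are not taken from the article.

```python
def truth_value(A, w):
    """Truth value of proposition A (a set of worlds) at world w: 1 or 0."""
    return 1.0 if w in A else 0.0

def local_quadratic(A, w, x):
    """Assumed local measure I(A, w, x): squared distance of x from the truth value."""
    return (truth_value(A, w) - x) ** 2

def global_quadratic(w, b_prime):
    """Assumed global measure G(w, b'): squared Euclidean distance between
    world w (as a unit vector) and b' restricted to singletons."""
    return sum((truth_value({v}, w) - b_prime[v]) ** 2 for v in b_prime)

def l_exp(b, I, A, E, x):
    """Expected local inaccuracy: sum over w in E of b({w}) * I(A, w, x)."""
    return sum(b[w] * I(A, w, x) for w in E)

def g_exp(b, G, E, b_prime):
    """Expected global inaccuracy: sum over w in E of b({w}) * G(w, b')."""
    return sum(b[w] * G(w, b_prime) for w in E)

# Hypothetical example: b maps each world w to the degree of belief b({w}).
b = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
E = {"w1", "w2", "w3"}   # epistemically possible worlds
A = {"w1", "w2"}         # the proposition being assessed

print(l_exp(b, local_quadratic, A, E, 0.8))
print(g_exp(b, global_quadratic, E, b))

# The degenerate case from the text: weights that are zero on every singleton
# make both expectations zero, whatever x or b' is chosen.
zero = {w: 0.0 for w in E}
print(l_exp(zero, local_quadratic, A, E, 0.8))
print(g_exp(zero, global_quadratic, E, b))
```

Passing the measures I and G as arguments mirrors the definitions, which treat the inaccuracy measure as a parameter, so a different legitimate measure could be substituted without changing `l_exp` or `g_exp`.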
In order to do better, we would have to generalize our framework and aim at a justification of Popper functions (see Popper [1968], for one reference among many) rather than standard absolute probability measures. We leave this topic to a different paper.

In section 3, we discuss in greater detail the transition from Accuracy to the four norms just listed, making explicit all the formal and philosophical presuppositions of our theory. We are not going to justify these presuppositions in any substantial manner, but at least we will make sure we put all of our cards on the table, and we will formulate explicitly the main questions that will remain open. In section 4, we give a brief sketch of the argument in favor of quadratic inaccuracy measures, both local and global, and in section 5, we give this argument in full detail.

3. The Presuppositions of Our Argument.

3.1. The Ought-Can Principle. We wish to make the following norm precise:

Accuracy. An epistemic agent ought to approximate the truth. In other words, she ought to minimize her inaccuracy.

In doing so, we will be guided at a number of points by the following version of a well-known normative or metanormative principle, the function of which is to constrain our choice of normative systems:

Ought-Can. A norm should not demand anything of an agent that is beyond her epistemic reach.

This is just a variant of the classical Ought-Can principle applied in the present context. We leave open the exact character of the possibility modality that is implicit in ‘Can’ and ‘reach(ability)’. However, as will become clear from our applications of Ought-Can, it is certainly stronger than mere logical possibility and not too far from a notion of realistic achievability in the epistemic domain.

One important consequence of Ought-Can is that one should not demand of an agent that she draw distinctions that she is conceptually unable to draw.
Here is one way in which this is relevant to our argument: in the following sections, we presuppose that our agents hold opinions about a finite and nonempty set W of possible worlds of which they assume the actual world to be one. There is nothing particularly philosophical about our decision to stick to the case of finitely many worlds in this article; we simply assume this to be so and postpone the discussion of the infinite case to another time. However, already this seemingly harmless supposition may appear to have drastic (and even drastically wrong) consequences: since there are, presumably, infinitely many possible worlds, every choice of a finite set of ‘possible worlds’ must consist in carving the actual set of worlds into finitely many chunks, which are then presupposed to figure as the chosen ‘(pseudo)possible worlds’ of the framework. Worse, if a uniform probability measure has been defined over the resulting finite set W, as demanded by Objectivist Bayesianism (Uniform Distribution), it is unclear whether this measure is also uniform over the actual set of worlds and, if so, in what sense. So are we buying into a presupposition that makes our approach appear highly questionable from the start?

Not necessarily, in light of the Ought-Can principle. Assume that we are solely concerned with agents whose conceptual resources cut the space of logical possibilities into finitely many pieces, the members of W. Indeed, we may assume these agents to be epistemically unable to distinguish between any two of the possible worlds that belong to one and the same partition set and to be incapable also of altering their conceptual framework on any rational grounds for the time of our investigation. From the viewpoint of any of these agents, at the time of our investigation, there is thus no way of having any sort of epistemic access to the ‘actual’ infinite set of possible worlds.
Demanding of such agents that they transcend their conceptual boundaries would mean demanding that they go beyond their epistemic reach, in conflict with Ought-Can. Therefore, whenever we refer to agents in this article, let them be as just described, and Ought-Can will allow us to proceed without any further worries. For many practical purposes, maybe even we can be taken to be such finitely constrained agents. In all other contexts, that is, whenever we are dealing with full-fledged human agents and their (maybe) infinitary conceptual capacities, the arguments of this article do not apply. But hopefully, even in the latter case, our arguments will be interesting in themselves.

Ought-Can will prove even more relevant in section 3.3, where we derive from it an internalist view of justification that forces us to move from Accuracy (Local) and Accuracy (Global) to Accuracy (Expected local) and Accuracy (Expected global), as we mentioned above. And finally, the principle will play a crucial role in section 5 as well, where we argue for Local and Global Inaccuracy Measures, which specify the legitimate local and global measures of inaccuracy: it follows from Ought-Can that a normative system should not be such that it may lead an agent who obeys the norms of this system into an epistemic dilemma, that is, into a situation in which she ought to change her epistemic state in two or more ways that are jointly impossible. We show that an agent who employs an inaccuracy measure that is not permitted by Local and Global Inaccuracy Measures may face an epistemic dilemma akin to the so-called discursive dilemma in the theory of judgment aggregation. From this conclusion and Ought-Can, our characterizations of the legitimate inaccuracy measures as being the quadratic ones will follow.

3.2. The Geometrical Framework. So much for Ought-Can; we will return to it below. In the meantime, we turn to the other background assumptions of our approach.
Broadly speaking, we are taking a geometrical approach; to coin a slogan, we are interested in the ‘geometry of reason’. Of course, it is common procedure to go back and forth between probability measures as characterized by the Kolmogorov axioms and probability measures viewed as points in geometrical space, so it might seem that putting forward a geometrical account of belief dynamics is completely unproblematic. However, our task is not so much to presuppose probability theory and to exploit its mathematical features in applications of probabilistic reasoning but rather to justify probabilistic reasoning in a context in which probability theory is not yet in place. So we will have to take things more slowly.

Since W is finite, there is an n such that \(|W| = n\). We start by positioning the members of W in the n-dimensional Euclidean space \(\mathbb{R}^n\) by ‘identifying’ each world \(w_i \in W = \{w_1, \ldots, w_n\}\) with the ith unit vector in \(\mathbb{R}^n\), that is, with the vector \((\delta_{i,1}, \ldots, \delta_{i,n})\), where \(\delta_{i,j} = 1\) if \(j = i\) and \(\delta_{i,j} = 0\) if \(j \neq i\). If the real number 1 is taken to represent truth and the real number 0 is taken to stand for falsity, then each such vector \((\delta_{i,1}, \ldots, \delta_{i,n})\) can be considered as an assignment of the geometrical counterparts of truth and falsity to the singleton propositions \(\{w_1\}, \ldots, \{w_n\}\), such that the jth coordinate of the vector \((\delta_{i,1}, \ldots, \delta_{i,n})\) is identical to the real number 1 if and only if the proposition \(\{w_j\}\) is true in the world that is represented by this vector. Given this mapping from worlds to geometrical points or vectors, we can measure distances or ‘closeness’ between worlds with respect to each other in terms of the Euclidean distance between their geometric counterparts. Accordingly, regarding the truth values 1, 0 as points 1, 0, respectively, in the one-dimensional Euclidean space, that is, \(\mathbb{R}\), allows us to measure distances between truth values geometrically.
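The identification of worlds with unit vectors can be sketched in a few lines. The following is only an illustration under stated assumptions: worlds are referred to by their 1-based index i, vectors are plain tuples, and the helper names `world_vector`, `euclidean_distance`, and `singleton_is_true` are hypothetical, not taken from the article.

```python
import math

def world_vector(i, n):
    """The i-th unit vector in R^n (1-indexed): coordinate j is 1 if j == i, else 0."""
    if not 1 <= i <= n:
        raise ValueError("world index must lie between 1 and n")
    return tuple(1.0 if j == i else 0.0 for j in range(1, n + 1))

def euclidean_distance(u, v):
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def singleton_is_true(vector, j):
    """Whether the singleton proposition {w_j} is true at the world this vector represents."""
    return vector[j - 1] == 1.0

w1, w2 = world_vector(1, 3), world_vector(2, 3)
print(w2)                                   # coordinates of the second world
print(euclidean_distance(w1, w2))           # distance between two distinct worlds
print(euclidean_distance((1.0,), (0.0,)))   # distance between the truth values 1 and 0 in R
```

Reading coordinate j of a world vector as the truth value of the singleton proposition for w_j is what lets the same distance function compare worlds in n dimensions and truth values in one dimension.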
Now, there is certainly not just one geometrically ‘natural’ notion of distance in Euclidean space. In fact, there are lots of geometrically plausible metrics on \(\mathbb{R}^n\) and \(\mathbb{R}\) that are distinct from the Euclidean metric, and indeed we are going to argue later that inaccuracy has to be measured in terms of one of them. However, we will demand that every such notion of the geometrical distance between two points supervenes on (is functionally dependent on) the Euclidean distance between these points. In this sense, measuring closeness will always amount to a geometrical, and indeed Euclidean, procedure in the context of this article.

The next step is to locate degrees of belief within \(\mathbb{R}\) and to place belief functions into \(\mathbb{R}^n\). As pointed out at the beginning of this article, we consider belief functions to be mappings of the form \(b : P(W) \to \mathbb{R}^+_0\), so degrees of belief are assumed to be quantitative objects from the start.6 Following Accuracy from above, we regard rational agents to be aiming at distributing their degrees of belief in such a way that every such degree \(b(A)\) approximates the truth value of the proposition A (although it is not yet clear exactly in what sense). Hence, a rational agent’s degree of belief for a proposition is nothing but the agent’s best possible estimate or ‘simulation’ of the truth value of that proposition, given her present epistemic situation. Since truth and falsity have been represented by real numbers, too, degrees of belief and truth values are comparable: they occupy the same quantitative or geometrical scale.

6. We could have regarded belief functions as mappings into \(\mathbb{R}\) rather than \(\mathbb{R}^+_0\); some parts of our argumentation are in fact not going to hang on this. Our reason for not doing so from the start is quite simply that some readers might find negative degrees of belief odd.
So, for example, assigning a degree of belief 1 to a proposition A would mean that the agent believes that A is true rather than false, since the degree of belief 1 is closer (in fact, identical) to the real number 1 that represents truth than it is to the real number 0 representing falsity. In this way, closeness of a degree of belief to the truth can be measured again according to a metric on the one-dimensional Euclidean space.

In order to see how belief functions determine points in the n-dimensional Euclidean space, by which it then becomes possible to measure their distances from the point vectors that represent possible worlds, it is useful to introduce the following terminology: call any function \(b_{glo} : W \to \mathbb{R}^+_0\) a global belief function, and let \(\mathrm{Bel}_{glo}(W)\) be the set of all global belief functions. Obviously, every such global belief function may be regarded as a vector \((a_1, \ldots, a_n) \in \mathbb{R}^n\) with \(a_j = b_{glo}(w_j)\) for \(j = 1, \ldots, n\). Since every belief function b induces a global belief function \(b_{glo}\) by means of \(b_{glo}(w_j) = b(\{w_j\})\), we can position b in \(\mathbb{R}^n\) through its corresponding global belief function. Apart from supplying belief functions with a geometrical interpretation, the latter equation also yields a way of interpreting global belief functions: we call them global belief functions as they may be taken to summarize an agent’s attitude toward all the worlds w in W, that is, toward all of the singleton propositions \(\{w\}\) for \(w \in W\). For example, a global belief function with the vector \((1, 0, \ldots, 0)\) represents an agent who is, in the “degrees of belief being best possible estimates or ‘simulations’ of truth values” sense mentioned above, certain of the truth of the proposition \(\{w_1\}\) and certain that every other singleton proposition is false.
Geometrically, the distance between the vector that belongs to this global belief function and the vector that corresponds to the world \(w_1\) is 0, which is exactly what we want to be the case in such a situation. So, (global) belief functions and worlds have become comparable as well, again by means of their geometrical representations.7

7. We are not suggesting that this is the only ‘natural’ way of identifying belief functions with vectors or that this type of identification is free of presuppositions. In particular, representing belief functions geometrically in such a manner corresponds in some sense to a ‘bias’ toward worlds. It might well be that if a different form of representation were chosen, then we would not end up justifying Bayesianism but some different account of belief dynamics (say, the Dempster-Shafer approach or something else). At the same time, it is not as if this ‘bias’ toward worlds gives us anything like the Bayesian tenets in any obvious and immediate manner: it will need elaborate arguments, substantial postulates, and mathematical proofs until we will have finally derived these tenets (as we will in the sequel to this article). As we will point out later, there are various parameters in our theory that we set in a particular way and for which we investigate the consequences of

We have seen how belief functions determine corresponding global belief functions. Is there also a way of inverting this procedure? That is, given a global belief function \(b_{glo}\), is there a similarly salient way of determining a belief function b that assigns degrees of belief not just to singleton propositions (or worlds) but to all propositions whatsoever? If Bayesianism were taken for granted, the answer would of course be yes, by iterated application of finite additivity.
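Both directions of this correspondence can be sketched briefly. The passage from a belief function to its global belief function uses only the equation \(b_{glo}(w_j) = b(\{w_j\})\); the reverse passage shown here sums the values of the member worlds, which is exactly the additivity that a Bayesian takes for granted, so it should be read as an illustration of that presupposition rather than as something the argument is entitled to at this stage. The representation of propositions as frozensets and all function names are hypothetical choices made for this sketch.

```python
import math
from itertools import combinations

def induced_global(b, worlds):
    """Vector (a_1, ..., a_n) with a_j = b({w_j}) for a belief function b on propositions."""
    return tuple(b[frozenset([w])] for w in worlds)

def distance_to_world(vector, i):
    """Euclidean distance between a global belief vector and the i-th unit vector (1-indexed)."""
    return math.sqrt(sum((a - (1.0 if j == i else 0.0)) ** 2
                         for j, a in enumerate(vector, start=1)))

def additive_extension(vector, worlds):
    """Assumes finite additivity: each proposition gets the sum of its members' values."""
    value = dict(zip(worlds, vector))
    return {frozenset(A): sum(value[w] for w in A)
            for k in range(len(worlds) + 1)
            for A in combinations(worlds, k)}

worlds = ("w1", "w2", "w3")

# An agent who is certain of {w1}: only the singleton values matter for the global vector.
certain = {frozenset(["w1"]): 1.0, frozenset(["w2"]): 0.0, frozenset(["w3"]): 0.0}
vec = induced_global(certain, worlds)
print(vec)
print(distance_to_world(vec, 1))   # distance of this agent's vector from world w1

# Extending a hypothetical global vector to all propositions by additivity.
ext = additive_extension((0.5, 0.3, 0.2), worlds)
print(len(ext))                              # number of propositions over three worlds
print(ext[frozenset(["w1", "w2"])])
```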
But Bayesianism is exactly what is at issue here, so without further argument there does not seem to be any obvious way of determining a unique belief function from a given global belief function justifiedly or, for that matter, of determining jus- tifiedly any belief function from a given global belief function at all. For this reason, presupposing that an agent’s epistemic state at a time involves the acceptance of a global belief function is at least prima facie a weaker presupposition than assuming that the agent’s epistemic state at that time involves the acceptance of a belief function. 3.3. Internalism, Expected Inaccuracy, and Ought-Can Again. In this section, we return to Ought-Can. Above, we noted that this principle forces the shift from Accuracy (Local) and Accuracy (Global) to Accuracy (Expected local) and Accuracy (Expected global). Here we explain how. In section 5, we will argue that the only legitimate inaccuracy measures are quadratic inaccuracy measures. However, while this conclusion gives us an important auxiliary notion of inaccuracy, it does not yet by itself yield a notion of inaccuracy that an agent can actually make use of in order to determine the local inaccuracy of her degree of credence in a proposition or the global inaccuracy of her belief function. The reason, as we observed above, is simply that the agent cannot be assumed to know at which world she is evaluating inaccuracies. So, by Ought-Can, we require a conception of the minimization of inaccuracy toward which the agent is epistemically capable of aiming. On such a conception, an agent’s assessment of the local inaccuracy of a degree of belief in a prop- osition will be bound to take into account the local inaccuracies of that degree of belief at all possible worlds that are not excluded by the evidence available to her, that is, all worlds that are epistemically possible for her. 
Similarly, on this conception, an agent’s assessment of the global inac- curacy of a belief function will be bound to take into account the global inaccuracies of that belief function at all epistemically possible worlds. Thus, the Ought-Can principle gives rise to a form of internalism about justification, and this internalism leads in turn to the notions of expected local and global inaccuracy defined above (definitions 1 and 2) and to setting them as such. But we would be equally interested in studying alternative ways of setting these parameters and in determining the consequences of these alternative settings. 214 HANNES LEITGEB AND RICHARD PETTIGREW versions of the Accuracy norm that demand that an agent should minimize not her actual local or global inaccuracy but her expected local or global inaccuracy instead. And, as we described above, this move results in a further bifurcation of the norms: Accuracy (Expected local) splits to give Accuracy (Synchronic expected local) and Accuracy (Diachronic expected local), while Accuracy (Expected global) splits to give Accuracy (Syn- chronic expected global) and Accuracy (Diachronic expected global). The synchronic norms constrain the intrinsic nature of an agent’s belief func- tion at particular times in her epistemic life: at any such time t, the agent’s belief function bt must have minimal expected local (respectively, global) inaccuracy by the lights of bt itself. And the diachronic norms constrain the relation between an agent’s belief function at times t and t′ between which the agent obtains evidence that places constraints C on the legit- imate belief functions at t′ or on the epistemically possible worlds at t′ or on both: must satisfy C and must be minimal, among those beliefb ′t functions that satisfy C, with respect to expected local (respectively, global) inaccuracy by the lights of bt and over the possible worlds that are epi- stemically possible at t′ and that satisfy the constraint C. 
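The geometrical representation of section 3.2 and the synchronic expected global inaccuracy just described can be made concrete in a small numerical sketch. It assumes three worlds, a quadratic global measure with $\lambda = 1$ (the form argued for in section 5), and that definition 2, which is not restated in this section, is a credence-weighted sum over the epistemically possible worlds, $\sum_{w \in E} b(\{w\})\, G(w, b')$. The function names are illustrative only and are not the authors’.

```python
import math

def world_vector(i, n):
    """Unit vector representing world w_i (0-based index) in R^n."""
    return [1.0 if k == i else 0.0 for k in range(n)]

def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def global_inaccuracy(i, b_glo, lam=1.0):
    """Quadratic global inaccuracy of b_glo at world w_i (assumed form)."""
    return lam * distance(world_vector(i, len(b_glo)), b_glo) ** 2

def expected_global_inaccuracy(b_glo, candidate, possible):
    """Assumed form of definition 2: the sum, over epistemically possible
    worlds, of the agent's credence in the world times the candidate's
    global inaccuracy at that world."""
    return sum(b_glo[i] * global_inaccuracy(i, candidate) for i in possible)

certain = [1.0, 0.0, 0.0]   # certain that w_1 is the actual world
b = [0.5, 0.3, 0.2]         # a global belief function over three worlds
E = [0, 1, 2]               # all three worlds are epistemically possible

print(distance(world_vector(0, 3), certain))      # distance from w_1's vector
print(expected_global_inaccuracy(b, b, E))        # b assessed by its own lights
print(expected_global_inaccuracy(b, certain, E))  # the certain function assessed by b
```

Under these assumptions, b expects itself to be less inaccurate (roughly 0.62) than the belief function that is certain of $w_1$ (roughly 1.0), which is the kind of comparison the synchronic norm turns on.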
We should say a little more about the mathematical notion of expectation in this context. Usually, expectations are defined only for probability measures. However, in the absence of Probabilism (a claim that we wish to establish, not presuppose) there is no reason for thinking that belief functions are probability measures. Still, while probability theory is the usual context in which expectations are defined, there is no objection in principle to extending the definition to cover the case of belief functions that may not be probability measures. Of course, if the belief function is not additive, we will not be able to prove the equivalence of the definitions we have given (see definitions 1 and 2) with two alternative definitions that arise from definitions of expected values that are standard in probability theory: namely,

$$LExp_b(I, A, E, x) = \sum_{r \in Ran(I)} r \times b(\{w \in W : I(A, w, x) = r\}),$$

$$GExp_b(G, E, b') = \sum_{r \in Ran(G)} r \times b(\{w \in W : G(w, b') = r\}).$$

But we will never use the alternative definitions, and we take the definitions that we have given before to be the conceptually basic ones, so we do not regard this as a problem. It would be great to support this view by having to hand a general and abstract theory of the concept of expected value for which one could actually prove that, given the presuppositions of our approach, the resulting notion of expected value must be the one that definitions 1 and 2 employ.8 We will have to leave any investigations into that topic for a different occasion.

By using what we take to be the basic definitions of expectation, we avoid the objection that Joyce (1998, 589) raises against Rosenkrantz’s (1981) appeal to expected inaccuracy. Like us, Rosenkrantz favors quadratic global inaccuracy measures. He lists constraints on global inaccuracy measures and conjectures that they are satisfied uniquely by the quadratic functions.
However, he gives no proof of this claim, and, as Joyce notes, he gives no noncircular arguments in favor of his constraints. In particular, Rosenkrantz demands of an agent that she minimize her expected inaccuracy calculated over every possible partition of the space simultaneously. As Joyce points out, unless her belief function is already assumed to be a probability function, this will not be possible. Our precise versions of the Accuracy norms are not vulnerable to this objection since we only ever appeal to the most fine-grained partition of W that is conceptually available to the agent, that is, the set of singletons of worlds in W.

3.4. The Status of Our Presuppositions. Now that the underlying formal framework of our theory has been made more explicit, how are we going to argue for it? Short answer: we do not. Not that there is nothing at all to say in favor of it: among other things, one could point out that some aspects of our geometrical framework are purely conventional and thus do not need any further justification at all. For example, we could have represented truth by, say, the real number 2 and, accordingly, worlds by vectors that are like unit vectors but where the coordinate 1 is replaced by 2 and so forth.9 In other words, the absolute position of truth values, worlds, and belief functions in the Euclidean space is arbitrary. Fortunately, however, none of our results depend on it.

Furthermore, if a Kantian move helped at all, the choice of a Euclidean geometry rather than a non-Euclidean one could perhaps be grounded in our intuitions about space and distance in general, although the ‘geometry of reason’ we are after can hardly be called Kantian, and in a context in which information is represented geometrically, metrics other than the Euclidean one are often used.
In fact, one would be completely justified in wondering why degrees of belief should “live” in a Euclidean space at all: after all, physical space turned out to be non-Euclidean, and even from our internalist perspective it is perfectly reasonable to assume that there might be agents whose “cognitive spaces” are non-Euclidean.10

8. We thank Franz Dietrich for highlighting this in a personal communication.

9. Franz Huber (2006, 2) cites a corresponding worry raised by Colin Howson in an unpublished manuscript.

Other aspects of the framework do not even have a conventional or intuitive character at all: for example, why is the notion of expected local inaccuracy given by taking the sum of weighted local inaccuracies rather than their product or their maximum or whatever other function comes to mind? Again, one could surely do better than just leaving the discussion at that point: indeed, one should be able to defend some necessary conditions on expected inaccuracy on the basis of assuming Ought-Can and the rest of the geometrical framework; it might also be possible to prove representation theorems by which our expected local inaccuracy functions would turn out to have an equivalent, and intrinsically plausible, qualitative or comparative formulation and so forth. But it is hard to see how any such justification would be ultimate. Similarly, why does each world have the same distance from each other world according to their geometric representations? Is this because we have renormalized the scales of our given coordinate system in a way that leads to this result trivially, or do we commit ourselves to a substantial assumption that is at least implicitly pointing toward Objectivist Bayesianism?
We will have to say just slightly more about this last point in the sequel to this article, but otherwise we simply leave the status of the framework untouched; that is, we take this geometrical framework as a presupposition of our justification of (Objectivist) Bayesianism without giving it any further defense.

One final remark, though: we do regard the question of just how much work in our argumentation is done by its geometrical Euclidean background framework as a very important one; eventually, this article ought to be complemented by one that abstracts from the epistemic-geometrical models that we presuppose all and only the essential axioms that are needed in order for our arguments to go through, and only then will it be possible to see how much weight is carried by our Euclidean presuppositions. We will return to this as an open question to be formulated in the final section of this article. When we determine the quadratic inaccuracy measures as the legitimate ones in section 5, this ought to be understood in the way that they are the legitimate ones relative to the chosen Euclidean framework; other inaccuracy measures might be legitimate if given a different framework.11

10. We are grateful to Alan Hájek and Kenny Easwaran for highlighting this as an important open problem.

11. We thank Branden Fitelson for making this point in a discussion of our article.

4. The Argument for Local and Global Inaccuracy Measures: An Overview. Section 5 is devoted to justifying two claims: Local Inaccuracy Measures and Global Inaccuracy Measures. The former says that the legitimate local inaccuracy measures are the quadratic scoring rules. The latter says that the legitimate global inaccuracy measures are the quadratic functions of the Euclidean metric on $\mathbb{R}^n$. In this section, we give an overview of the argument.

In fact, we will state three arguments for identifying these characterizations of the legitimate local and global inaccuracy measures. Each begins in the same way by imposing four conditions that restrict the class of legitimate inaccuracy measures (both local and global) on the basis of the principles highlighted in section 3. And each continues by noting that these restrictions fail to exclude inaccuracy measures that give rise to a dilemma for the agent, that is, a situation in which the epistemic norms of the previous section, combined with these inaccuracy measures, entail two mutually exclusive prescriptions for the agent. In each case, when we restrict the class of legitimate inaccuracy measures to exclude those that lead to the dilemma in question, we are left only with the quadratic inaccuracy measures, and Local and Global Inaccuracy Measures follow immediately.

Thus, in section 5, we give three separate arguments, each of which shares its first four premises with the others. We set them out in table 1 for ease of reference. Each of the premises will be supported by our presuppositions, some strictly, some defeasibly. Furthermore, each of the arguments mentioned in this section will be seen to be strictly valid, that is, deductively valid given mathematics, as shown by the proofs given in the appendix.

TABLE 1. ARGUMENTS.

                                                     First     Second    Third
                                                     Argument  Argument  Argument
Shared premises (sec. 5.1):
  Local Normality and Dominance                         X         X         X
  Global Normality and Dominance                        X         X         X
  Local and Global Comparability                        X         X         X
  Local and Global Minimum Inaccuracy                   X         X         X
Dilemma 1 premise (sec. 5.2.1):
  Agreement on Inaccuracy                               X
Dilemma 2 premise (sec. 5.2.2):
  Separability of Global Inaccuracy                               X
Dilemma 3 premises (sec. 5.2.3):
  Continuous Differentiability and
  Agreement on Directed Urgency                                             X
Conclusion:
  Local Inaccuracy Measures                             X         X         X
  Global Inaccuracy Measures                            X         X         X
The conclusions of these three arguments will be our promised two characterizations of legitimate inaccuracy measures that we want to establish in section 5:

Local Inaccuracy Measures. If $I : P(W) \times W \times \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is a legitimate measure of the local inaccuracy of a degree of credence x in a proposition A at a possible world w, then there is $\lambda \in \mathbb{R}$ with $\lambda > 0$ such that

$$I(A, w, x) = \lambda (\chi_A(w) - x)^2,$$

where $\chi_A : W \to \{0, 1\}$ is the characteristic function of the proposition A.

Global Inaccuracy Measures. If $G : W \times \mathrm{Bel}(W) \to \mathbb{R}_0^+$ is a legitimate measure of the global inaccuracy of a belief function b at a possible world w, then there is $\lambda \in \mathbb{R}$ with $\lambda > 0$ such that

$$G(w, b) = \lambda \| w - b_{glo} \|^2,$$

where w and $b_{glo}$ are represented by vectors as in section 3.2 and $\| u - v \|$ is the Euclidean distance between vectors u and v: that is,

$$\| u - v \| = \sqrt{(u_1 - v_1)^2 + \ldots + (u_n - v_n)^2}.$$

The claim that quadratic inaccuracy measures yield the only legitimate scoring rules is similar to Selten’s central claim (1998). Thus, it might seem that we could easily adapt Selten’s ingenious argument to establish Local Inaccuracy Measures. However, at a number of points in his proof, Selten relies on the assumption that belief functions are probability functions. For his avowed purpose, this is perfectly legitimate. However, for the purposes of this article, we could not avail ourselves of a result premised on this assumption to establish Local Inaccuracy Measures, which we then in turn wish to use to derive Probabilism, among other things (and, to the best of our knowledge, similar points can be made about much of the excellent and highly evolved literature on scoring rules and decision theory). Thus, we must beat our own path to our conclusion. As pointed out, we beat three paths, which begin together and diverge only at the final premises of the arguments.
Thus, we begin with the shared premises in section 5.1; then, in section 5.2, we consider the three dilemmas that motivate the three different final premises. All three arguments turn on mathematical theorems; their proofs are given in the appendix. Among the three arguments, we consider the final one (sec. 5.2.3) to be the strongest and most convincing, but the first two arguments (secs. 5.2.1 and 5.2.2) are easier to state, which is why we will turn to them before we give the third, and philosophically central, argument.

5. Measuring Inaccuracy.

5.1. The Shared Premises. The first premise of each of our arguments combines local analogues of Joyce’s Normality and Dominance conditions (1998, 596 and 593, respectively): the local version of Joyce’s Normality condition says that the inaccuracy of degree of credence x in proposition A at world w ought to depend only on the difference between x and the value of the characteristic function of A at w (i.e., the truth value of A at w); the local analogue of Dominance merely states that local inaccuracy increases as this difference increases.

Local Normality and Dominance. If I is a legitimate inaccuracy measure, then there is a strictly increasing function $f : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ such that, for any $A \subseteq W$, $w \in W$, and $x \in \mathbb{R}_0^+$,

$$I(A, w, x) = f(|\chi_A(w) - x|).$$

Note that this also implies that distances from the truth ($\chi_A(w) = 1$) and distances from falsity ($\chi_A(w) = 0$) are measured in the same way, which is entailed by our geometrical take on truth and falsity as points in a space.

It is clear that once a Euclidean framework such as ours is in place, a condition analogous to Local Normality and Dominance ought to hold for global inaccuracy measures as well. Local Normality and Dominance asserts that the local inaccuracy of a degree of credence x in proposition A at world w ought to be a strictly increasing function only of the difference (i.e., the Euclidean distance) between x and $\chi_A(w)$. Its analogue, Global Normality and Dominance, asserts that the global inaccuracy of a global belief function b at a world w ought to be a strictly increasing function only of the Euclidean distance between the vector representation of b and the vector representation of w. That is,

Global Normality and Dominance. If G is a legitimate global inaccuracy measure, there is a strictly increasing function $g : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ such that, for all worlds w and belief functions $b \in \mathrm{Bel}(W)$,

$$G(w, b) = g(\| w - b_{glo} \|).$$

Global Normality and Dominance is a consequence of taking seriously the talk of inaccuracy as ‘distance’ from the truth, and it endorses the geometrical picture provided by Euclidean n-space as the correct clarification of this notion. As explained in section 3.2, the assumption of this geometrical picture is one of the presuppositions of our account, and we do not have much to offer in its defense, except for stressing that we would be equally interested in studying the consequences of minimizing expected inaccuracy in a non-Euclidean framework. But without a doubt, starting with the Euclidean case is a natural thing to do.

The third premise that is shared by each of our arguments for Local and Global Inaccuracy Measures says that any function on the real numbers that gives rise to a legitimate local inaccuracy measure also gives rise to a legitimate global inaccuracy measure and vice versa. Local and Global Comparability is as follows:

i) If $I(A, w, x) = f(|\chi_A(w) - x|)$ is a legitimate local inaccuracy measure, then $G(w, b) = f(\| w - b_{glo} \|)$ is a legitimate global inaccuracy measure.
ii) If $G(w, b) = g(\| w - b_{glo} \|)$ is a legitimate global inaccuracy measure, then $I(A, w, x) = g(|\chi_A(w) - x|)$ is a legitimate local inaccuracy measure.
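The way a single function f fixes both a local and a global measure under the conditions stated so far can be sketched as follows. The choice $f(d) = d^2$ is only an illustration (the conditions above admit any strictly increasing f on the non-negative reals), and the helper names `chi` and `make_measures` are hypothetical.

```python
import math

def chi(A, w):
    """Characteristic function of proposition A (a set of worlds) at world w."""
    return 1.0 if w in A else 0.0

def make_measures(f, worlds):
    """Build the local measure I and the global measure G induced by the same f."""
    def I(A, w, x):
        # Local Normality and Dominance: depends only on |chi_A(w) - x|.
        return f(abs(chi(A, w) - x))

    def G(w, b_glo):
        # Global Normality and Dominance: depends only on ||w - b_glo||,
        # where the world w is represented by its unit vector.
        dist = math.sqrt(sum((chi({v}, w) - b_glo[v]) ** 2 for v in worlds))
        return f(dist)

    return I, G

worlds = ["w1", "w2", "w3"]
I, G = make_measures(lambda d: d ** 2, worlds)

# Distance from truth and distance from falsity are scored in the same way:
print(I({"w1"}, "w1", 0.7), I({"w1"}, "w2", 0.3))
print(G("w1", {"w1": 0.5, "w2": 0.3, "w3": 0.2}))
```

Passing the same f to both measures is what Local and Global Comparability licenses; nothing in the sketch yet forces f to be quadratic.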
Again, this is a consequence of our geometrical interpretation of accuracy: we interpret inaccuracy as distance from the truth, and we interpret distance as being given by a strictly increasing function of the Euclidean metric. Since distances are independent of dimension, it should always be possible to use legitimate local inaccuracy measures in order to determine their global counterparts and also the other way round; it is simply not relevant on which dimensions Euclidean distances are measured.

The final premise shared by each of our three arguments for Local and Global Inaccuracy Measures does nothing more than to lay down a convention: we will permit only inaccuracy functions that take value zero when the distance between truth value and degree of belief or between world and global belief function is zero. Minimum Inaccuracy is as follows:

i) If $I(A, w, x) = f(|\chi_A(w) - x|)$ is a legitimate local inaccuracy measure, then $f(0) = 0$.
ii) If $G(w, b) = g(\| w - b_{glo} \|)$ is a legitimate global inaccuracy measure, then $g(0) = 0$.

In the presence of Local and Global Comparability, we can derive i from ii and ii from i. Thus, we need only impose one of these conditions. However, we state both, lest the reader be given the mistaken impression that one or the other is more fundamental.

Together, Local Normality and Dominance, Global Normality and Dominance, Local and Global Comparability, and Minimum Inaccuracy restrict the class of legitimate inaccuracy measures. However, as we shall see in the next section, they do not restrict them enough. There are functions that satisfy these restrictions but which have undesirable properties. We call attention to one particular sort of undesirable property, that is, the property of giving rise to a dilemma for an epistemic agent. In each case, the dilemmas in question concern possible discrepancies between measuring inaccuracy in a local and in a global fashion.
We show that, when we exclude the inaccuracy measures that give rise to these dilemmas, we are left with only the quadratic inaccuracy measures. This will complete our argument for Local and Global Inaccuracy Measures. Quadratic inaccuracy measures will be the ones that allow the local and the global perspective on belief functions to be compatible with each other.

5.2. Excluding Dilemmas: Completing the Three Arguments. An inaccuracy measure gives rise to a dilemma for an agent if the prescription to be as accurate as possible with respect to that inaccuracy measure entails two prescriptions for the agent such that she cannot satisfy both together. In sections 5.2.1–5.2.3, we consider three dilemmas to which an inaccuracy measure may give rise. In each case, we introduce a principle to exclude such inaccuracy measures and show that Local and Global Inaccuracy Measures follow from this stipulation, along with the four conditions enumerated in section 5.1.

The respective dilemmas are serious in the following sense: (i) they are about minimizing inaccuracy, the central goal of our epistemic agents; (ii) they involve an agent’s having to choose to follow either of two options or norms; (iii) there does not seem to be any principled way of ranking the two options or norms, such that one would become epistemically prior or superior to the other. Since the dilemmas below are serious in this sense, they have to be avoided, and the only manner in which this can be done is by making sure that the two options or norms never lead to different epistemic recommendations to the agent. This (defeasible) argumentation in favor of principles by which the dilemmas may be avoided can only be defeated in either of two ways: first of all, by attacking iii, that is, by showing that for each of the three dilemmas there is in fact a way of ranking one option or norm over the other.
For example, if someone were to put forward a sufficiently strong argument in favor of a form of epistemic holism according to which considerations of global inaccuracy always overrule considerations of local inaccuracy, then this would defeat the seriousness of each of our three dilemmas. Second, it can be defeated by presenting yet another serious dilemma in the sense of i–iii into which an agent is led by opting for our quadratic inaccuracy measures. In that case, no inaccuracy measure whatsoever could protect an agent from being confronted with some serious dilemma, and the best one could hope for would be a multitude of mutually exclusive and partially defective choices of inaccuracy measures, such that each one of them would avoid some epistemic dilemmas, but none of them would avoid all. We hope that at least as things stand, our arguments are undefeated as yet.

5.2.1. Agreement on Inaccuracy. By Local and Global Comparability from section 5.1, if the function f gives rise to a legitimate local inaccuracy measure, then it gives rise to a legitimate global inaccuracy measure as well, and vice versa. However, the four conditions enumerated in section 5.1 can be shown not to exclude functions f such that (1) the global inaccuracy measure $f(\| w - b_{glo} \|)$ determines the inaccuracy of a belief function at a world, (2) its counterpart local inaccuracy measure $f(|\chi_A(w) - b(A)|)$ yields an indirect way of also determining the inaccuracy of a belief function at a world by summing up the local inaccuracies of degrees of belief assigned to world propositions in the expected manner, and yet the outcomes of the two determination procedures differ. Such a disagreement would give rise to a dilemma: the agent who uses both the global inaccuracy measure and its local counterpart will come to two conflicting conclusions concerning the inaccuracy of her beliefs.
Of course, it might be the case that despite the numerical disagreement, some formal properties are still shared by the globally and the indirectly locally determined inaccuracies of belief functions at a world, for instance, the ordering of belief functions according to their inaccuracies. And searching for the belief function that minimizes expected inaccuracy first in a globally and then in a locally induced way might still yield one and the same output, even when the globally and the locally determined inaccuracies diverge in value for some or even all arguments. But the only way for the agent to have a guarantee that the global and the local procedure will never lead to any conflict whatsoever (and thus that the global and the local procedure always lead to the same epistemic recommendations, independently of how sensitive the agent is to the exact numerical inaccuracy values) is to postulate a convergence between the global and local way of determining the inaccuracy of any belief function at any world. Otherwise the agent’s situation would be analogous to that of a group of individuals faced with making a collective judgment that is vulnerable to the paradoxes of judgment aggregation. For instance, consider the stock example of the so-called discursive dilemma, the most vivid of these paradoxes (see, e.g., Pettit 2001).

Discursive Dilemma. Three judges must decide whether to convict a defendant. By law, the defendant may be convicted if and only if propositions P and Q hold. The judges’ judgments on P and Q and the consequence for conviction are recorded in the following table, along with the majority judgment in each case.

            P       Q       Conviction
Judge 1     True    True    Yes
Judge 2     True    False   No
Judge 3     False   True    No
Majority    True    True    Yes/no

Thus, while the majority of the individual judgments concerning P and Q leads to a conviction, the majority of the consequences of those judgments for conviction leads to acquittal.
Which conviction consequence reflects the aggregate of the judges’ judgments? This is the discursive dilemma. In the discursive dilemma, there is a tension between two algorithms by which to aggregate individual judgments on two propositions by three individuals into a single judgment. The first algorithm begins by deriving the conviction consequences from the individual judgments on P and Q and then takes the majority verdict; the other begins by taking the majority verdict on each of P and Q and then derives the conviction consequence. They lead to conflicting results, and it is not clear at all how to tell between them. The agent who uses global and local inaccuracy measures that give rise to different values for the inaccuracy of her belief function faces a similar problem. She faces the problem of aggregating her various degrees of belief in various propositions into a value for the inaccuracy of this belief function as a whole. She has at her disposal two obvious measures by which to obtain this value: one is just given by applying the global in- accuracy function itself, and the other by summing up the relevant local inaccuracies. If they disagree, the agent faces an irresolvable dilemma, analogous to that faced by the judges in the discursive dilemma.12 We exclude the possibility that gives rise to this dilemma by imposing the following condition on legitimate inaccuracy measures that yields a principled way of relating global and local inaccuracy judgments: Agreement on Inaccuracy. Suppose I is a legitimate local inaccuracy measure. Then, by Local Normality and Dominance, there is a strictly increasing function such that� �f : � r � I(A, w, x) p f (Fx (w) �0 0 A . Further, by Local and Global Comparability,xF) G(w, b) p f (FFw � is a legitimate global inaccuracy measure. Then, the followingb FF)glo 12. 
It would be great if we had some results showing that, even if the agent did not aggregate local inaccuracies by summing them but rather by applying some other numerical operation that satisfies certain natural constraints, a theorem similar to the one stated below could still be derived. Unfortunately, we do not have anything like that to offer at this point, so we have to leave this for future work.

must hold: if $b$ is a belief function and $w_i$ is a world,

$$G(w_i, b) = \sum_{j=1}^{n} I(\{w_j\}, w_i, b(\{w_j\})).$$

That is,

$$f(\|w_i - b_{\mathrm{glo}}\|) = \sum_{j=1}^{n} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|).$$

From this, along with the four conditions stated in section 5.1, local and global inaccuracy measures follow by the following theorem:

Theorem 3. The following two propositions are equivalent:

i) Function $f$ is strictly increasing, and, for all belief functions, $b$, and worlds $w_i$,

$$f(\|w_i - b_{\mathrm{glo}}\|) = \sum_{j=1}^{n} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|).$$

ii) There is $\lambda \in \mathbb{R}_{>0}$ such that, for all $x \in \mathbb{R}_0^+$, $f(x) = \lambda x^2$.

This theorem is proved in the appendix.

5.2.2. Separability of Global Inaccuracy. In the previous section, we drew attention to a possible dilemma that results from using the legitimate local and global inaccuracy measures given by the same function $f$, and we ruled out that possibility by introducing Agreement on Inaccuracy. In this section, we describe another way in which an inaccuracy measure could give rise to conflicting values for the inaccuracy of a belief function at a world. To state the problem, we introduce the following terminology: if $1 \le j \le n$ and $(a_1, \ldots, a_n) \in (\mathbb{R}_0^+)^n$, then $\mathrm{proj}_j((a_1, \ldots, a_n))$ is the projection of $(a_1, \ldots, a_n)$ onto the linear subspace that is spanned by the unit vectors that represent the worlds in $W \setminus \{w_j\}$: that is,

$$\mathrm{proj}_j((a_1, \ldots, a_n)) = (a_1, \ldots, a_{j-1}, 0, a_{j+1}, \ldots, a_n).$$

Hence, for $i \neq j$, $\mathrm{proj}_j(w_i) = w_i$, whereas $\mathrm{proj}_j(w_j) = (0, \ldots, 0)$.

Now, as in the previous section, suppose that $f$ is a function that gives rise to a local inaccuracy measure $I$ and a global inaccuracy measure $G$. And suppose further that our epistemic agent's global belief function is represented by the vector $(a_1, \ldots, a_n)$. Then, given a world $w_i$, there seem to be two ways to measure the inaccuracy of the agent's belief function at $w_i$ that arise from combining $I$ and $G$:

1. One might simply use $G$: that is, the inaccuracy of $(a_1, \ldots, a_n)$ at $w_i$ is $G(w_i, (a_1, \ldots, a_n))$.

2. Or, for any world $w_j$ with $i \neq j$, one might take the inaccuracy of $(a_1, \ldots, a_n)$ at $w_i$ to be

$$I(\{w_j\}, w_i, a_j) + G(\mathrm{proj}_j(w_i), \mathrm{proj}_j((a_1, \ldots, a_n))).$$

That is, one might take the local inaccuracy of the degree of credence in proposition $\{w_j\}$ at world $w_i$ and add it to the global inaccuracy at $w_i$ of the 'remainder' of $(a_1, \ldots, a_n)$ when world $w_j$ is not considered: that is, geometrically speaking, one adds the global inaccuracy at world $w_i$ of the belief function represented by the projection of $(a_1, \ldots, a_n)$ onto the subspace spanned by $W \setminus \{w_j\}$.

As in the previous section, the conditions listed in section 5.1 do not rule out the possibility that these two ways of measuring the inaccuracy of the agent's belief function at $w_i$ disagree, and, if they do, a dilemma might arise for the agent. As before, we rule out the functions $f$ that give rise to this dilemma by laying down a further principle:

Separability of Global Inaccuracy. Suppose $I$ is a legitimate local inaccuracy measure. Then, by Local Normality and Dominance, there is a strictly increasing function $f : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ such that $I(A, w, x) = f(|\chi_A(w) - x|)$. Further, by Local and Global Comparability, $G(w, b) = f(\|w - b_{\mathrm{glo}}\|)$ is a legitimate global inaccuracy measure. Then, what follows must hold: for all $w_i, w_j \in W$ with $i \neq j$,

$$G(w_i, (a_1, \ldots, a_n)) = f(\|w_i - (a_1, \ldots, a_n)\|) = f(|\chi_{\{w_j\}}(w_i) - a_j|) + f(\|\mathrm{proj}_j(w_i) - \mathrm{proj}_j((a_1, \ldots, a_n))\|).$$

As in the case of Agreement on Inaccuracy, this condition may be justified by noting that, if it were to fail, two legitimate ways by which an agent may determine her inaccuracy would lead to different results in at least one situation; such disagreement would lead to a situation analogous to that described in the discursive dilemma.

From Separability of Global Inaccuracy along with the four conditions stated in section 5.1, local and global inaccuracy measures follow by the following theorem:

Theorem 4. Separability of Global Inaccuracy and Minimum Inaccuracy entail Agreement on Inaccuracy (which, in combination with the assumptions of sec. 5.1, yields local and global inaccuracy measures, by theorem 3).

This theorem, too, is proved in the appendix.

5.2.3. Agreement on Directed Urgency. Before we can state our final dilemma (saving the best for last), we must restrict the class of legitimate inaccuracy measures a little more than is done by the four conditions of section 5.1. In particular, we demand

Continuous Differentiability:

i) If $I(A, w, x) = f(|\chi_A(w) - x|)$ is a legitimate local inaccuracy measure, then $f$ is continuously differentiable on $\mathbb{R}_0^+$.

ii) If $G(w, b) = g(\|w - b_{\mathrm{glo}}\|)$ is a legitimate global inaccuracy measure, then $g$ is continuously differentiable on $\mathbb{R}_0^+$.

Again, in the presence of Local and Global Comparability, we can derive i from ii and ii from i. However, again, we state both conditions to avoid the mistaken impression that one is more fundamental than the other. Having said this, we will give our argument for Continuous Differentiability only in terms of i. This is not because we derive ii by inferring it from i; rather, it is because the argument for ii is exactly analogous and may be easily reconstructed from the argument for i.
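Before turning to that argument, the two agreement conditions of sections 5.2.1 and 5.2.2 can be illustrated numerically. The following is only a minimal sketch: the function names, the choice of lambda = 2 for the quadratic measure, the linear comparison function f(x) = x, and the sample credences are all illustrative assumptions, not part of the article. Worlds are indexed from 0, and each world is represented by its unit vector, as in the text.

```python
import math

def unit(i, n):
    """Unit vector representing world w_i in an n-world space (hypothetical helper)."""
    return [1.0 if k == i else 0.0 for k in range(n)]

def proj(j, v):
    """Projection that zeroes coordinate j, i.e. drops world w_j."""
    return [0.0 if k == j else c for k, c in enumerate(v)]

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def global_inaccuracy(f, i, a):
    """G(w_i, a) = f(||w_i - a||)."""
    return f(dist(unit(i, len(a)), a))

def summed_local_inaccuracy(f, i, a):
    """Sum over j of I({w_j}, w_i, a_j) = f(|chi_{w_j}(w_i) - a_j|)."""
    w = unit(i, len(a))
    return sum(f(abs(w[j] - a[j])) for j in range(len(a)))

def separated_inaccuracy(f, i, j, a):
    """I({w_j}, w_i, a_j) + G(proj_j(w_i), proj_j(a)), for i != j."""
    w = unit(i, len(a))
    return f(abs(w[j] - a[j])) + f(dist(proj(j, w), proj(j, a)))

def quadratic(x):
    return 2.0 * x ** 2   # f(x) = lambda * x^2 with lambda = 2 (arbitrary choice)

def linear(x):
    return x              # strictly increasing, but not quadratic

a = [0.2, 0.5, 0.3]       # illustrative credences in {w_0}, {w_1}, {w_2}
i, j = 0, 1

for name, f in [("quadratic", quadratic), ("linear", linear)]:
    print(name,
          global_inaccuracy(f, i, a),
          summed_local_inaccuracy(f, i, a),
          separated_inaccuracy(f, i, j, a))
```

For the quadratic function the three values coincide (up to floating-point rounding), whereas for the linear function they differ, which is the kind of disagreement the two principles are meant to exclude.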
Suppose that $f$ is a function that gives rise to a local inaccuracy measure. By Local Normality and Dominance, $f$ is a strictly increasing function. But it is clear that $f(|\chi_A(w) - x|)$ should also be a continuous function of $x$, and thus $f$ should be a continuous function. After all, if $f(|\chi_A(w) - x|)$ were discontinuous as a function of $x$ at some particular $x_0 \in \mathbb{R}_0^+$, an agent's accuracy could improve or deteriorate dramatically by an arbitrarily small change to her degree of credence in the neighborhood of $x_0$. Thus, $f$ must be continuous.

However, Continuous Differentiability demands something further. It demands that $f$ be continuously differentiable on $\mathbb{R}_0^+$. To justify this claim, consider again the notion of expected local inaccuracy, which we introduced in section 2. Given a belief function $b$, propositions $A, E \subseteq W$, a degree of credence $x$ in $A$, and a local inaccuracy measure $I$, we have interpreted $\mathrm{LExp}_b(I, A, E, x)$ as the expected value of the inaccuracy of $x$ by the lights of $b$, with respect to $I$ and over the epistemically possible worlds $w \in E$.

Now, if $I$ were a legitimate local inaccuracy measure, the function $\mathrm{LExp}_b(I, A, E, x)$ would provide not only a measure of the expected inaccuracy of $x$ in $A$ by the lights of $b$, with respect to $I$ and over $E$. It would also provide a means by which to measure the urgency and direction (in short, the directed urgency) with which an agent for whom $E$ is the set of epistemically possible worlds ought to change her degree of credence in $A$ by the lights of some belief function $b$. Clearly, this measure would be provided by the derivative of $\mathrm{LExp}_b(I, A, E, x)$ with respect to $x$, were this derivative to exist.
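The expected local inaccuracy just described, and its derivative with respect to x, can be sketched numerically. The sketch below assumes that the expectation has the form LExp_b(I, A, E, x) = sum over w in E of b({w}) * I(A, w, x) (the definition itself is given in section 2 and is not restated here), and it assumes a quadratic local measure with an arbitrary lambda = 1. All function names and numbers are hypothetical, and the derivative is approximated by a central finite difference.

```python
def local_quadratic(A, w, x, lam=1.0):
    """Assumed local measure I(A, w, x) = lam * (chi_A(w) - x)^2."""
    chi = 1.0 if w in A else 0.0
    return lam * (chi - x) ** 2

def lexp(b, A, E, x, lam=1.0):
    """Assumed form of expected local inaccuracy: sum of b({w}) * I(A, w, x) over w in E."""
    return sum(b[w] * local_quadratic(A, w, x, lam) for w in E)

def directed_urgency(b, A, E, x, h=1e-6, lam=1.0):
    """Central-difference approximation of d/dx LExp_b(I, A, E, x)."""
    return (lexp(b, A, E, x + h, lam) - lexp(b, A, E, x - h, lam)) / (2.0 * h)

# Illustrative belief function on four worlds (credences in the singletons).
b = [0.1, 0.4, 0.3, 0.2]
A = {1, 2}        # proposition whose credence x is under review
E = {0, 1, 2}     # epistemically possible worlds

for x in (0.5, 0.95):
    d = directed_urgency(b, A, E, x)
    # A negative derivative means expected inaccuracy falls as x is raised;
    # a positive derivative means it falls as x is lowered.
    print(f"x = {x}: d/dx LExp = {d:.4f}")
```

With these illustrative numbers the approximate derivative is negative at x = 0.5 and positive at x = 0.95, so the two credences would be revised in opposite directions, and its absolute value is larger at 0.5 than at 0.95.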
Wherever it is defined, the absolute value of the function $\frac{d}{dx}\mathrm{LExp}_b(I, A, E, x)$ measures the rate at which the expected inaccuracy of $x$ by the lights of $b$ is changing (the slope of the tangent): thus, by the lights of $b$, it would be more urgent to change the degree of credence $r$ in proposition $A$ than to change the degree of credence $s$ in the same proposition, just in case the absolute value of $\frac{d}{dx}\mathrm{LExp}_b(I, A, E, x)$ evaluated at $r$ were greater than the absolute value of the same derivative evaluated at $s$.$^{13}$ Furthermore, if the sign of $\frac{d}{dx}\mathrm{LExp}_b(I, A, E, x)$ evaluated at $r$ differed from the sign of the same derivative evaluated at $s$, then the degree of credence $r$ in proposition $A$ ought to be increased when the degree of credence $s$ in the same proposition ought to be decreased, or vice versa. Indeed, only the derivative of $\mathrm{LExp}_b(I, A, E, x)$ could supply the agent with this sort of information. Thus, if $I$ is to be a legitimate local inaccuracy measure, then $\frac{d}{dx}\mathrm{LExp}_b(I, A, E, x)$ should be defined on $\mathbb{R}_0^+$ since there ought to be a measure of directed urgency that an agent can use to determine a local recommendation of where to go epistemically and, as it were, how quickly she should move. If Ought-Can by itself does not support this claim sufficiently, then one hopes it does so in conjunction with the geometrical framework that we presuppose. In any case, it is straightforward to show that, if $\frac{d}{dx}\mathrm{LExp}_b(I, A, E, x)$ is defined on $\mathbb{R}_0^+$, for every belief function $b$, propositions $A, E \subseteq W$, and degree of credence $x$, then $I(A, w, x)$ must be differentiable on this domain as well. Thus, if $I(A, w, x) = f(|\chi_A(w) - x|)$, then $f$ must be differentiable on $\mathbb{R}_0^+$. Of course, this argument requires some amount of idealization since for all "real world" cases in which an agent ought to determine the directed urgency of a belief change, computing small real-valued differences rather than infinitesimal ones should be sufficient.$^{14}$ But if the agent wants to be certain about this, whatever the level of precision required, then $f$ ought to be differentiable.

13. Gibbard (2008), too, interprets the derivative of the inaccuracy measure as a measure of the urgency of updating.

14. Alan Hájek and Kenny Easwaran pointed this out to us.

What's more, just as we wish our local inaccuracy measure to be a continuous function of the degree of credence whose inaccuracy it is measuring, we would like our directed urgency measure to be a continuous function of the degree of credence for which it measures the directed urgency of change: we would not wish the urgency and direction by which an agent should change her degree of credence to change drastically after an arbitrarily small shift in her degree of credence. From this, the local part of Continuous Differentiability (namely, i from above) follows.$^{15}$

15. If this little transcendental argument is not convincing enough, here is a much more mundane thought: let us restrict ourselves just to 'geometrically nice' local and global inaccuracy measures. But in order to be 'geometrically nice', these measures will have to be given by continuously differentiable functions.

As we mentioned above, our argument in favor of the global part of Continuous Differentiability (namely, ii) is analogous to the argument just given in favor of the local part. Focusing just on propositions $A$ that are singleton propositions of the form $\{w_j\}$ for $j = 1, \ldots, n$, we say that if $G(w, b) = g(\|w - b_{\mathrm{glo}}\|)$ is a legitimate global inaccuracy measure, then it also ought to give rise to a measure of the urgency with which an agent must change her degree of belief in such a singleton proposition. Given a belief function $b$, the urgency to change the degree of belief $x$ in proposition $\{w_j\}$ for an agent with belief function $b'$ by the lights of $b$ would be given by the absolute value of

$$\frac{d}{dx}\mathrm{GExp}_b(G, E, (b'(\{w_1\}), \ldots, b'(\{w_{j-1}\}), x, b'(\{w_{j+1}\}), \ldots, b'(\{w_n\}))),$$

if this derivative were to exist (as, given Continuous Differentiability, it indeed does). And, as in the local case, its sign would indicate the direction in which the change must occur. Thus, we interpret the derivatives of both expected global inaccuracy and expected local inaccuracy as measures of directed urgency. Granted this, we claim that these two measures ought to agree on singleton propositions whenever the global and local inaccuracy measures from which they arise are based on the same strictly increasing and continuously differentiable function, $f$. This is the content of Agreement on Directed Urgency.

Agreement on Directed Urgency. If $I(A, w, x) = f(|\chi_A(w) - x|)$ and $G(w, b) = f(\|w - b_{\mathrm{glo}}\|)$ are legitimate local and global inaccuracy measures, respectively, and if $f$ is differentiable, then, for all belief functions $b$ and $b'$ and all worlds $w_j \in W$,

$$\frac{d}{dx}\mathrm{LExp}_b(I, \{w_j\}, E, x) = \frac{d}{dx}\mathrm{GExp}_b(G, E, (b'(\{w_1\}), \ldots, b'(\{w_{j-1}\}), x, b'(\{w_{j+1}\}), \ldots, b'(\{w_n\}))).$$

Suppose this condition were not to hold. Then the agent who employed these measures of inaccuracy, and the measures of directed urgency to which they give rise, would be left with a dilemma. Where (1) the local measure of directed urgency for $\{w_j\}$ differed from (2) the global measure of directed urgency for the coordinate $w_j$, she would be unable to determine the urgency with which she must update her belief in order to minimize her expected inaccuracy and perhaps even the direction in which her update should proceed; the two measures would give conflicting values between which she could not choose in a principled way.

Together with Continuous Differentiability and the four conditions stated in section 5.1, Agreement on Directed Urgency entails local and global inaccuracy measures by means of the following theorem:

Theorem 5. The following two propositions are equivalent:

i) Function $f : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is strictly increasing and continuously differentiable, $f(0) = 0$, and, for all belief functions $b \in \mathrm{Bel}(W)$, all $w_j \in W$, and all $a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n \in \mathbb{R}_0^+$:

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, f(|\chi_{\{w_j\}}(w_i) - x|) = \frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, f(\|w_i - (a_1, \ldots, a_{j-1}, x, a_{j+1}, \ldots, a_n)\|).$$

ii) There is $\lambda \in \mathbb{R}_{>0}$ such that, for all $x \in \mathbb{R}_0^+$, $f(x) = \lambda x^2$.

The proof is given in the appendix.

6. A Look Ahead to the Sequel and to Future Work. This concludes our argument for local and global inaccuracy measures and, with it, our defense of the four mathematically precise versions of the Accuracy norm introduced in section 2, that is, the synchronic local and global versions and the diachronic local and global versions, now supplied with the right inaccuracy measures. In the sequel to this article (Leitgeb and Pettigrew 2010), we investigate the consequences of these norms. Before considering some open questions about our approach, we report the results of that investigation:

1. From the synchronic local version of Accuracy, we derive Probabilism.

2. From the diachronic local version of Accuracy, we derive Conditionalization.

3. From a related, but much stronger norm, we derive Uniform Distribution.
4. We show that, in the situations normally assumed to be covered by Jeffrey's updating rule, there is no updating rule that satisfies the diachronic local version of Accuracy. However, the diachronic global version can be satisfied. We show that Jeffrey's updating rule sometimes violates this diachronic version of the norm unless the so-called postulate of rigidity is required by fiat, and we describe the alternative updating rule that satisfies it.

We finish with six open questions that point toward future research:

• How can the approach taken in this article be extended to the case of an infinite set of worlds; in particular, how can it be extended to the case of a nondenumerable set of worlds? What role does countable additivity play in such extensions?

• Which of our conclusions depend essentially on our geometrical background machinery being Euclidean? Which conclusions can be drawn in a non-Euclidean setting?

• How can the theory be translated into a more abstract system of axiomatic constraints on both belief update and the geometrical background system (along the lines of Joyce but also Greaves and Wallace [2006])? How robust are our results if, at the relevant places, summing up of inaccuracies gets replaced by any numerical operation that satisfies some set of plausible constraints?

• How does the theory relate to theories of verisimilitude in which relational truthlikeness is analyzed in terms of the expected degree of truthlikeness (as in Niiniluoto's [1987] theory of estimated truthlikeness)?

• How does our theory of expected inaccuracy work in a framework that permits partial beliefs in self-locating propositions? Such an application would cover the well-known Sleeping Beauty problem. See Kierland and Monton (1999) for a related attempt in which it is assumed that quadratic inaccuracy measures provide the only legitimate scoring rule.
• Could a variant of our theory of expected inaccuracy justify probabilistic methods of judgment aggregation or amalgamation?

Appendix: Proofs of Theorems 3–5. Here, we prove the three theorems used to argue for local and global inaccuracy measures in section 5.

Proof of Theorem 3. It will suffice to show that the following two propositions are equivalent:

i′) Function $g$ is strictly increasing and, for all belief functions, $b$, and worlds $w_i \in W$,

$$g(\|w_i - b_{\mathrm{glo}}\|^2) = \sum_{j=1}^{n} g(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|^2).$$

ii′) There is $\lambda \in \mathbb{R}_{>0}$ such that, for all $x \in \mathbb{R}_0^+$, $g(x) = \lambda x$.

Suppose i′ and ii′ are equivalent. Then, if i, then $g(x) = f(x^{1/2})$ satisfies i′ and thus ii′, so $f(x^{1/2}) = \lambda x$, which gives ii. Similarly, if ii, then $g(x) = f(x^{1/2})$ satisfies ii′ and thus i′, so $f$ satisfies i. Thus, we will prove the equivalence of i′ and ii′.

First, we show that ii′ implies i′. Thus, suppose $g(x) = \lambda x$. Clearly, $g$ is strictly increasing. Now suppose that $b$ is a belief function and $w_i$ a world; then,

$$\lambda\|w_i - b_{\mathrm{glo}}\|^2 = \lambda[b(\{w_1\})^2 + \ldots + (1 - b(\{w_i\}))^2 + \ldots + b(\{w_n\})^2] = \sum_{j=1}^{n} \lambda\,|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|^2.$$

Thus, ii′ implies i′.

Now, we show that i′ implies ii′. Our strategy is to show that, from i′, it follows that, for any $x, y \in \mathbb{R}_0^+$, $g(x + y) = g(x) + g(y)$. Since, by i′, $g$ is also strictly increasing, ii′ follows by Cauchy's classical result that all monotone additive functions on $\mathbb{R}$ are linear on $\mathbb{R}$.$^{16}$

Thus, suppose $x, y \in \mathbb{R}_0^+$. Then let $b(\{w_1\}) = \ldots = b(\{w_{n-2}\}) = 0$, $b(\{w_{n-1}\}) = x^{1/2}$, and $b(\{w_n\}) = 1 - y^{1/2}$. Then, by i′,

$$g(\|w_n - (0, \ldots, 0, \sqrt{x}, 1 - \sqrt{y})\|^2) = g(|0 - \sqrt{x}|^2) + g(|1 - (1 - \sqrt{y})|^2).$$

From this, we have $g(x + y) = g(x) + g(y)$, as required.

16. In fact, we need a slightly different version, which states that all monotone additive functions on $\mathbb{R}_0^+$ are linear on $\mathbb{R}_0^+$. Suppose $g : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is additive on $\mathbb{R}_0^+$. Then define $g' : \mathbb{R} \to \mathbb{R}$ as follows: $g'(x) = g(x)$ if $0 \le x$ and $g'(x) = -g(-x)$ if $x < 0$. Then it is easy to show that $g'$ is additive and monotone on $\mathbb{R}$ if $g$ is additive and monotone on $\mathbb{R}_0^+$. Thus, $g'$ satisfies the hypotheses of Cauchy's result. For the proof of Cauchy's result, see Aczél and Dhombres (1989).

Proof of Theorem 4. Consider the following statements:

i) Function $f$ is strictly increasing, $f(0) = 0$, and, for all belief functions $b$ and all worlds $w_i$ and $w_j$ such that $i \neq j$,

$$f(\|w_i - b_{\mathrm{glo}}\|) = f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|) + f(\|\mathrm{proj}_j(w_i) - \mathrm{proj}_j((a_1, \ldots, a_n))\|),$$

where $(a_1, \ldots, a_n)$ is the vector $b_{\mathrm{glo}}$ that represents $b$.

ii) Function $f$ is strictly increasing and, for all belief functions, $b$, and worlds $w_i$,

$$f(\|w_i - b_{\mathrm{glo}}\|) = \sum_{j=1}^{n} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|).$$

We must show that i entails ii. Suppose i holds. Then, by repeatedly separating the local inaccuracy measure from the global one, we obtain:

$$f(\|w_i - b_{\mathrm{glo}}\|) = \Big[\sum_{j \neq i} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|)\Big] + f(\|w_i - (0, \ldots, 0, a_i, 0, \ldots, 0)\|)$$

$$= \Big[\sum_{j \neq i} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|)\Big] + f(|\chi_{\{w_i\}}(w_i) - b(\{w_i\})|)$$

$$= \sum_{j=1}^{n} f(|\chi_{\{w_j\}}(w_i) - b(\{w_j\})|).$$

Thus, ii, as required.

Proof of Theorem 5. For reasons analogous to those given in the proof of theorem 3, it will suffice to show that the following two statements are equivalent:

i′) Function $g : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ is strictly increasing and continuously differentiable, $g(0) = 0$, and, for all belief functions $b \in \mathrm{Bel}(W)$ and $w_j \in W$:

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(|\chi_{\{w_j\}}(w_i) - x|^2) = \frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(\|w_i - (a_1, \ldots, a_{j-1}, x, a_{j+1}, \ldots, a_n)\|^2).$$

ii′) There is $\lambda \in \mathbb{R}_{>0}$ such that, for all $x \in \mathbb{R}_0^+$, $g(x) = \lambda x$.

First, we prove that ii′ implies i′. If $\lambda \in \mathbb{R}_{>0}$ and $g(x) = \lambda x$, then $g$ is certainly strictly increasing and continuously differentiable: indeed, $g'(x) = \lambda$. Thus, by straightforward differentiation and direct calculation, if $a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n \in \mathbb{R}_0^+$, then

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(|\chi_{\{w_j\}}(w_i) - x|^2) = \sum_{w_i \in W} b(\{w_i\})\,[-2\lambda\chi_{\{w_j\}}(w_i) + 2\lambda x]$$

$$= \frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(\|w_i - (a_1, \ldots, a_{j-1}, x, a_{j+1}, \ldots, a_n)\|^2),$$

as required.

Next, we prove that i′ implies ii′. Our strategy will be to show that i′ implies that $g'$ is constant. This will suffice since, by i′, $g(0) = 0$ and $g$ is strictly increasing, so $g(x) = \lambda x$ with $\lambda > 0$. We do this in two stages. First, we prove that, on the assumption of i′, $g'(1 + x) = g'(1)$ on $\mathbb{R}_0^+$, and then we prove that, on the assumption of i′, $g'(x) = g'(x + 1)$ on $\mathbb{R}_0^+$.

Thus, suppose i′ holds. We wish to show that $g'(x + 1) = g'(1)$ on $\mathbb{R}_0^+$. So, suppose $a \in \mathbb{R}_0^+$. Then let $j = 1$, $a_2 = a^{1/2}$, and $a_3 = \ldots = a_n = 0$. Then, by i′,

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(|\chi_{\{w_1\}}(w_i) - x|^2) = \frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(\|w_i - (x, \sqrt{a}, 0, \ldots, 0)\|^2).$$

But

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(|\chi_{\{w_1\}}(w_i) - x|^2) = \frac{d}{dx}\big[b(\{w_1\})g((1 - x)^2) + b(\{w_2\})g(x^2) + \ldots + b(\{w_n\})g(x^2)\big]$$

$$= -2(1 - x)\,b(\{w_1\})\,g'((1 - x)^2) + 2x\,b(\{w_2\})\,g'(x^2) + \ldots + 2x\,b(\{w_n\})\,g'(x^2),$$

and

$$\frac{d}{dx}\sum_{w_i \in W} b(\{w_i\})\, g(\|w_i - (x, \sqrt{a}, 0, \ldots, 0)\|^2)$$

$$= \frac{d}{dx}\big[b(\{w_1\})g((1 - x)^2 + a) + b(\{w_2\})g(x^2 + (1 - \sqrt{a})^2) + b(\{w_3\})g(x^2 + a + 1) + \ldots + b(\{w_n\})g(x^2 + a + 1)\big]$$

$$= -2(1 - x)\,b(\{w_1\})\,g'((1 - x)^2 + a) + 2x\,b(\{w_2\})\,g'(x^2 + (1 - \sqrt{a})^2) + 2x\,b(\{w_3\})\,g'(x^2 + a + 1) + \ldots + 2x\,b(\{w_n\})\,g'(x^2 + a + 1).$$

Thus, taking $x = 0$, we have

$$-2b(\{w_1\})\,g'(1) = -2b(\{w_1\})\,g'(1 + a),$$

which gives $g'(1) = g'(1 + a)$, as required.

Now we wish to show that $g'(x) = g'(x + 1)$ on $\mathbb{R}_0^+$. Thus, suppose $a \in \mathbb{R}_0^+$. Then let $b$ be a belief function in $\mathrm{Bel}(W)$ such that $b(\{w_2\}) = 1$ and $b(\{w_1\}) = b(\{w_3\}) = \ldots = b(\{w_n\}) = 0$, and let $a_2 = \ldots = a_n = 0$. Then, by i′,

$$\frac{d}{dx}\,g(|\chi_{\{w_1\}}(w_2) - x|^2) = \frac{d}{dx}\,g(\|w_2 - (x, 0, 0, \ldots, 0)\|^2),$$

which gives $g'(x^2) = g'(x^2 + 1)$ for $x \in \mathbb{R}_0^+$ such that $x \neq 0$, and thus $g'(x) = g'(x + 1)$ for $x \in \mathbb{R}_0^+$ such that $x \neq 0$ since $x^2$ is bijective between $\mathbb{R}_0^+$ and $\mathbb{R}_0^+$. Thus, since $g'$ is continuous on $\mathbb{R}_0^+$, $g'(x) = g'(x + 1)$ for $x \in \mathbb{R}_0^+$. Thus, for all $x \in \mathbb{R}_0^+$, $g'(x) = g'(x + 1) = g'(1)$, so $g'$ is constant on $\mathbb{R}_0^+$. This completes our proof.

REFERENCES

Aczél, J., and J. Dhombres. 1989. Functional Equations in Several Variables. Cambridge: Cambridge University Press.

Brier, G. W. 1950. "Verification of Forecasts Expressed in Terms of Probability." Monthly Weather Review 78:1–3.

Gibbard, A. 2008. "Rational Credence and the Value of Truth." In Oxford Studies in Epistemology, vol. 2, ed. T. Gendler and J. Hawthorne, 143–64. Oxford: Oxford University Press.

Greaves, H., and D. Wallace. 2006. "Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility." Mind 115 (459): 607–32.

Huber, F. 2006. "The Consistency Argument for Ranking Functions." Studia Logica 82:1–28.

Joyce, J. M. 1998. "A Nonpragmatic Vindication of Probabilism." Philosophy of Science 65 (4): 575–603.

———. 2009. "Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief." In Degrees of Belief, ed. F. Huber and C. Schmidt-Petri, 263–97. Synthese Library 342. Dordrecht: Springer.

Kierland, B., and B. Monton. 1999. "Minimizing Inaccuracy for Self-Locating Beliefs." Philosophy and Phenomenological Research 70 (2): 384–95.

Leitgeb, H., and R. Pettigrew. 2010. "An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy." Philosophy of Science, in this issue.

Miller, D. W. 1974. "Popper's Qualitative Theory of Verisimilitude." British Journal for the Philosophy of Science 25:166–77.

Niiniluoto, I. 1987. Truthlikeness. Dordrecht: Reidel.

Pettit, P. 2001. "Deliberative Democracy and the Discursive Dilemma." Philosophical Issues (Supp. Noûs) 11:268–99.

Popper, K. R. 1968. The Logic of Scientific Discovery, rev. ed. London: Hutchinson.

———. 1972. Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon.

Rosenkrantz, R. D. 1981. Foundations and Applications of Inductive Probability. Atascadero, CA: Ridgeview.

Selten, R. 1998. "Axiomatic Characterization of the Quadratic Scoring Rule." Experimental Economics 1 (1): 43–61.

Tichý, P. 1974. "On Popper's Definition of Verisimilitude." British Journal for the Philosophy of Science 25:155–60.

Williamson, J. Forthcoming. "Objective Bayesianism, Bayesian Conditionalization, and Voluntarism." Synthese.