A Pragmatist's Guide to Epistemic Utility

Benjamin Anders Levinstein[*][†]

We use a theorem from M. J. Schervish to explore the relationship between accuracy and practical success. If an agent is pragmatically rational, she will quantify the expected loss of her credence with a strictly proper scoring rule. Which scoring rule is right for her will depend on the sorts of decisions she expects to face. We relate this pragmatic conception of inaccuracy to the purely epistemic one popular among epistemic utility theorists.

[*] To contact the author, please write to: Department of Philosophy, 1 Seminary Place, Rutgers University, New Brunswick, NJ 08901; e-mail: balevinstein@gmail.com.

[†] Thanks to Seamus Bradley, Catrin Campbell-Moore, Greg Gandenberger, James Joyce, Richard Pettigrew, Patricia Rich, and audiences in Bristol and Munich. I was supported by the European Research Council starting grant Epistemic Utility Theory: Foundations and Applications during some of the work on this article.

1. Introduction. Accuracy is an important epistemic good. Indeed, according to accuracy-first epistemology, accuracy is the only epistemic good. The higher your credences in truths and the lower your credences in falsehoods, the better off you are, all epistemic things considered. Given this alethic monism, recent proponents of accuracy-first epistemology argue for a variety of epistemic norms by co-opting the resources of practical decision theory, with inaccuracy playing the role of epistemic disutility.[1] For instance, Joyce (1998, 2009) argues that agents should have credences that obey the axioms of the probability calculus by appeal to the decision-theoretic norm of dominance avoidance. On Joyce's favored measures of inaccuracy, any credence function that is not probabilistically coherent will be less accurate than some fixed probabilistically coherent alternative function at every world.[2]

[1] Recent examples of the epistemic utility approach include Joyce (1998, 2009), Leitgeb and Pettigrew (2010a, 2010b), Pettigrew (2016a), and Konek and Levinstein (2017).

[2] Other decision-theoretic norms appealed to include minimizing expected inaccuracy to establish conditionalization (Greaves and Wallace 2006; Leitgeb and Pettigrew 2010b), minimax to establish the principle of indifference (Pettigrew 2016b), Hurwicz criteria to make sense of Jamesian epistemology (Pettigrew 2016c), and chance-dominance avoidance to establish the Principal Principle (Pettigrew 2013).

Unlike with traditional Dutch-book arguments, the appeal to accuracy considerations appears to be nonpragmatic. Joyce claims that his argument for probabilism brings in no practical considerations whatever and is instead purely epistemic. Indeed, epistemic utility theory (i.e., this decision-theoretic, accuracy-first approach to epistemology) tries to eschew pragmatic considerations entirely.

Such philosophical scruples lead to two difficult problems. First, as we will see, Joyce's argument and the arguments of other epistemic utility theorists only work for a certain class of measures called strictly proper scoring rules.[3] This class excludes some extremely natural measures, and it is hard to see why only those measures of inaccuracy are legitimate.

[3] In fact, a few other structural restrictions are needed as well. For details, see Pettigrew (2016a).
Second, if inaccuracy measures are to play the role of epistemic disutility functions for rational agents, it is not clear how to determine which particular measure is right for which agent.[4] It is doubtful that any intuitive notion of accuracy could render one measure objectively correct for all agents, and it is also hard to see what reasons an agent would have to choose one measure over another.

[4] Some epistemic utility theorists will see this issue as less important than the first, but others see it as necessary at least for the argument for probabilism. See sec. 2.2.

We will provide an answer to both these questions below but from a starting point anathema to a pure epistemic utility theorist. Avoiding appeal to intrinsic epistemic goodness entirely, we will assume that all value is ultimately grounded in practical value. In particular, credences have value based on their connection to practical success. For us, the first question is how a practically rational agent goes about assigning value to her own credences and the credences of others (i.e., how does she assign value to doxastic states?).

One initial advantage we have over the pure epistemic utility theorist is that we can assume such an agent will be probabilistically coherent, for otherwise she's vulnerable to Dutch books. Indeed, we assume such an agent will be an expected utility maximizer.

From this starting point of expected utility maximization, we can understand accuracy's practical role by repurposing a representation theorem from Schervish (1989). Here is the idea in brief. Suppose you have a credence of .3 that it will rain. You may end up having to make a decision at some point on the basis of this credence, such as whether to bring an umbrella, whether to drive instead of walk, or whether to accept a monetary bet that pays off just in case it in fact rains. You do not yet know for sure which particular decisions you will have to make, but you do know that the less accurate your credence is, the more likely it is that you will make what turns out to be the wrong decision (relative to your desires). So, you can assign your credence an expected loss (i.e., negative expected utility) by averaging over the values of the possible good and bad decisions you might make based on it. As we will see from Schervish's theorem, under some natural assumptions, this method generates exactly the sort of measures of inaccuracy that epistemic utility theorists find acceptable. That is, the expected loss function a rational agent uses to assign practical value to her own credence or to evaluate another agent's credence simply is a proper scoring rule. Moreover, Schervish's theorem will allow us to represent an agent with a single measure of inaccuracy that reflects her expectations of the kinds of practical decisions she will make.
Although our measures of inaccuracy will ultimately be generated from practical considerations, they are nonetheless in a derivative sense epistemic. They reflect an agent's valuation of her credence before she has any particular purpose for it in mind (i.e., before she knows which decisions she will end up making). This allows us to treat epistemic value as quasi-separable from practical value, since we do not need to reference any specific practical decision when evaluating how well-off an agent is epistemically. We are in agreement with the pure epistemic utility theorist that the only epistemic good is accuracy as measured in accord with a proper scoring rule. We simply disagree about the ultimate source of this value.

This practical approach to epistemic utility will give us a further advantage over the pure epistemic utility theorist. Because inaccuracy measures function, for us, as summary statistics of expected practical disutility, we can use them to explain an agent's practico-epistemic behavior: practical actions that are performed for the sake of epistemic gain, such as evidence gathering, paying for information, and conducting experiments. Understanding this sort of behavior is extremely important to epistemology but nonetheless falls outside of the domain of the purely epistemic.[5]

[5] See Gibbard (2007) for another approach that aims to make sense of accuracy in terms of its consequences for practical success.

To be clear, this practical approach does not entail that the project of the pure epistemic utility theorist is doomed. Despite the current difficulties, there may well be a satisfactory account of why proper scoring rules are the only reasonable measures of epistemic utility that does not invoke any practical considerations whatever. The point instead is to investigate the valuation of doxastic states from a practical perspective and to see why and how pragmatically rational agents will use proper scoring rules for such a valuation. Indeed, epistemic utility theorists themselves may still find this discussion of interest even if they reject the practical foundations. In addition to the independent usefulness of the technical methods used for generating measures of inaccuracy, the relationship between what epistemic utility theorists claim is accuracy's purely epistemic value and its ultimate practical value should concern them, especially when it comes to practico-epistemic behavior.

Moreover, many philosophers will be drawn to the claim that the practical value of accuracy is the primary or even sole value of accuracy and that there is no such thing as purely epistemic utility. For instance, functionalists think that we simply cannot divorce doxastic states entirely from their effects on our behavior. Beliefs only make sense insofar as they interact with desires to produce action. Such philosophers will then generally prefer pragmatic arguments for epistemic norms (such as Dutch books) and will likewise prefer a pragmatic basis for valuing accuracy. Like the epistemic utility theorists, however, they too should be interested in understanding the notion of accuracy and how it relates to practical success.

So, although we here appeal ultimately only to pragmatic instead of pure epistemic value, our approach will also have significant payoffs. These, in brief, include:

1. A new justification of the standard measures of inaccuracy.
2. A new explanation of why and when to use one measure over another.
3. A better understanding of the connection among accuracy, practical success, epistemic evaluation, and practico-epistemic behavior such as evidence gathering.
Section 2 introduces the basic tools for measuring inaccuracy and the difficulties of epistemic utility theory. Section 3 explains how to determine the practical value of a credence, presents Schervish's theorem, and discusses its significance. Section 4 briefly relates Schervish's theorem to the value of information, evidence gathering, and evaluation of other agents. Section 5 wraps up.

2. Inaccuracy and Scoring Rules. In this section, we look at the two important questions identified in the introduction: what are the general constraints on plausible candidate measures of inaccuracy, and which measure in particular is right in a given context? We will approach these questions for now from the point of view of the epistemic utility theorist. That is, we will see how we might try to answer them if we want a purely epistemic notion of inaccuracy.

Let us start with credences in individual propositions. A measure of inaccuracy, or scoring rule, is meant to quantify how close a credence in a proposition is to its truth-value at a world. At the very least, a higher credence in a true proposition should not count as more inaccurate than a lower credence in that same proposition. We can use this minimal constraint to define the class of functions of interest:

Definition 1. A function $G : [0,1] \times \{0,1\} \to [0,\infty]$ is a (local) scoring rule if $G(x,1)$ is monotonically decreasing and $G(x,0)$ is monotonically increasing.

Often we will write a scoring rule $G$ as $(g_1, g_0)$, where $g_i(x) = G(x,i)$. By requiring $g_1(x)$ and $g_0(x)$ to be monotonically decreasing and increasing, respectively, we guarantee that as a credence gets closer to a truth-value, its score will not get worse. Note that for now, we do not even require the monotonicity to be strict.

We can generalize this idea to constrain good measures of inaccuracy for entire credence functions. Let $\Omega$ be a finite set of worlds and $\mathcal{F}$ be a subset of the power set of $\Omega$; $\mathrm{bel}(\mathcal{F})$ is the set of belief functions over $\mathcal{F}$, where a belief function assigns some number $x \in [0,1]$ to each proposition in $\mathcal{F}$. Note that probability functions form a subset of the belief functions. For $w \in \Omega$ and $X \in \mathcal{F}$, let $w(X) = 1$ if $w \in X$ and $w(X) = 0$ otherwise. To constrain the class of relevant functions, we first define an analogous weak monotonicity constraint:

Definition 2. A function $G : \mathrm{bel}(\mathcal{F}) \times \Omega \to [0,\infty]$ is weakly truth-directed if for any $b, c \in \mathrm{bel}(\mathcal{F})$, if $|w(X) - b(X)| \le |w(X) - c(X)|$ for every $X \in \mathcal{F}$, then $G(b,w) \le G(c,w)$.

Weak truth-directedness says that if $b$'s credence is always at least as close to the truth as $c$'s credence, then $b$ is no more inaccurate than $c$. We then say:

Definition 3. A function $G : \mathrm{bel}(\mathcal{F}) \times \Omega \to [0,\infty]$ is a (global) scoring rule if it is weakly truth-directed.

These definitions of scoring rules are too weak to carve out a good class of inaccuracy measures, but they will be useful below. The most obvious way to strengthen them is to require stronger monotonicity conditions. We say:

Definition 4. A function $G : [0,1] \times \{0,1\} \to [0,\infty]$ is a (strict local) scoring rule if $G(x,1)$ is strictly decreasing and $G(x,0)$ is strictly increasing.

Likewise, we define a stronger notion of truth-directedness:

Definition 5. A function $G : \mathrm{bel}(\mathcal{F}) \times \Omega \to [0,\infty]$ is truth-directed if for any $b, c \in \mathrm{bel}(\mathcal{F})$, if

1. $|w(X) - b(X)| \le |w(X) - c(X)|$ for every $X \in \mathcal{F}$ and
2. $|w(X) - b(X)| < |w(X) - c(X)|$ for some $X \in \mathcal{F}$,

then $G(b,w) < G(c,w)$.

In turn:

Definition 6. A function $G : \mathrm{bel}(\mathcal{F}) \times \Omega \to [0,\infty]$ is a (strict global) scoring rule if it is truth-directed.
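To fix ideas, here is a minimal computational sketch (my own, not from the paper) of Definitions 1 and 4 for local rules; the function names are mine, and monotonicity is checked on a finite grid rather than proved.

```python
# Illustrative sketch (not from the paper): encoding Definitions 1 and 4
# for local scoring rules and checking monotonicity on a grid.
import numpy as np

def absolute(x, i):   # G(x, i) = |i - x|
    return abs(i - x)

def brier(x, i):      # G(x, i) = (i - x)^2
    return (i - x) ** 2

def is_strict_local_rule(G, n=1001):
    """Definition 4: G(x,1) strictly decreasing, G(x,0) strictly increasing."""
    xs = np.linspace(0.0, 1.0, n)
    g1 = np.array([G(x, 1) for x in xs])
    g0 = np.array([G(x, 0) for x in xs])
    return bool(np.all(np.diff(g1) < 0) and np.all(np.diff(g0) > 0))

print(is_strict_local_rule(absolute))  # True
print(is_strict_local_rule(brier))     # True
```

As the text goes on to show, strict monotonicity alone is far too permissive: both rules above pass this test, yet they will fare very differently under propriety.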
2.1. Propriety. Any strict scoring rule is in some sense a measure of inaccuracy. However, from the point of view of the epistemic utility theorist, even some strict scoring rules fail to generate the results she wants.

Joyce (1998, 2009), for instance, argues that epistemic agents should be probabilistically coherent. In schematic terms, Joyce's argument runs as follows. There is some class $I$ of reasonable measures of inaccuracy. For any scoring rule $G \in I$ and any nonprobabilistic belief function $b$, there exists an alternative probability function $c$ that is less inaccurate than $b$ according to $G$ at every possible world. Furthermore, according to $G$, for any probabilistically coherent function $c$ and any belief function $b$, $c$ is less inaccurate than $b$ at some world. In other words, all and only the nonprobability functions are dominated according to every reasonable measure of inaccuracy.

The major weakness of this argument is that some measures of inaccuracy that seem perfectly reasonable do not yield this result. Consider the absolute-value measure, for instance:

$$\mathrm{abs}(b, w) = \sum_{X \in \mathcal{F}} |b(X) - w(X)|.$$

Here, abs is clearly truth-directed and at least seems natural. Nevertheless, it yields absurd verdicts about the relative accuracy of two belief functions. Imagine an urn containing a red, a green, and a blue ball, one of which will be drawn at random (i.e., with a 1/3 chance). According to abs, an agent with a credence of 0 in red, green, and blue counts as less inaccurate than an agent with a credence of 1/3 in each proposition, regardless of which ball is actually drawn.[6]

[6] Note that the agent with a credence of 0 in each proposition will receive a total score of 1, since her credence in two of the propositions will be perfectly accurate, while her credence in one proposition will be off by 1. The agent with a credence of 1/3 in each proposition will be off by 2/3 in one proposition and by 1/3 in the remaining two, for a total score of 4/3.

Surprisingly, it is relatively easy to identify exactly what further major restriction on measures of inaccuracy is needed to generate Joyce's results: every probability function must assign itself minimum expected inaccuracy.[7] That is, the argument requires scoring rules to be strictly proper according to the following definition:

[7] Joyce (2009) himself derives propriety from truth-directedness along with the weaker principle of Coherent Admissibility, which requires every probability function to be nondominated.

Definition 7. A scoring rule $G$ is a proper scoring rule if for all probability functions $b$ and belief functions $c$, $E_b(G(c))$ is minimized at $b = c$, where $E_b$ denotes $b$'s expectation function. If this minimum is unique, then $G$ is a strictly proper scoring rule. If $G$ is proper but not strictly proper, then we say that $G$ is a merely proper scoring rule.

Note that if $G$ is a local scoring rule, then $G$ is (strictly) proper if for all $x, y \in [0,1]$, $y\,g_1(x) + (1 - y)\,g_0(x)$ is (uniquely) minimized at $x = y$.
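The local reformulation of Definition 7 invites a direct check. The sketch below (mine, not the paper's) computes the expected score $y\,g_1(x) + (1-y)\,g_0(x)$ on a grid and locates the minimizing report $x$: the absolute-value rule is minimized at an extreme report rather than at the truthful $x = y$ (the source of the urn verdict above), while the Brier rule is minimized exactly at $x = y$.

```python
# Sketch (mine): Definition 7 for local rules says y*g1(x) + (1-y)*g0(x)
# should be minimized at x = y.  The absolute-value rule fails this; the
# Brier rule satisfies it.
import numpy as np

def expected_score(g1, g0, y, x):
    return y * g1(x) + (1 - y) * g0(x)

abs_rule   = (lambda x: 1 - x,        lambda x: x)        # g1, g0 for abs
brier_rule = (lambda x: (1 - x) ** 2, lambda x: x ** 2)

xs = np.linspace(0, 1, 1001)
y = 1/3  # true probability, as in the urn example
for name, (g1, g0) in [("abs", abs_rule), ("brier", brier_rule)]:
    losses = [expected_score(g1, g0, y, x) for x in xs]
    print(name, "best report for y = 1/3:", round(xs[int(np.argmin(losses))], 3))
# abs   best report for y = 1/3: 0.0    <- improper: honesty penalized
# brier best report for y = 1/3: 0.333  <- minimized at the truth
```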
Propriety is a curious property. On the one hand, it is a crucial constraint necessary for the success of the epistemic utility program. In addition to Joyce's argument, nearly every other argument in the epistemic utility literature requires this restriction as well.[8] Without it, probabilistically coherent credence functions would be self-undermining. That is, they would face a kind of Moorean paradox: 'I assign credence x to X, but I think a credence of x′ in X would be more, or at least as, accurate.'

[8] See, e.g., n. 2.

On the other hand, propriety seems hard to justify on the basis of reflection on the notion of inaccuracy alone. It simply does not seem to stem from alethic monism on its own.[9] Furthermore, propriety rules out two of the most obvious measures of inaccuracy right off the bat, namely abs and the euclidean measure:

$$\mathrm{euc}(b, w) = \left( \sum_{X \in \mathcal{F}} (b(X) - w(X))^2 \right)^{1/2}.$$

Two of the most common measures of distance are abs and euc, and inaccuracy is supposed to be a measure of proximity to truth. Both are truth-directed, yet neither is proper.[10]

[9] There are a number of arguments that try to independently motivate restrictions on the class of reasonable inaccuracy measures that entail propriety. Discussing each would substantially lengthen this article, but I refer the interested reader to Joyce (1998), D'Agostino and Sinigaglia (2010), Leitgeb and Pettigrew (2010a), and Pettigrew (2016a). For further doubts about the plausibility of propriety stemming from alethic monism, see Gibbard (2007).

[10] We have already seen that abs is improper. To see that euc is improper, suppose an agent assigns credence .9 to X and .1 to ¬X. She expects the credence function that assigns 1 to X and 0 to ¬X to be less inaccurate than she is according to euc.

Fortunately for the epistemic utility theorist, other scoring rules are relatively natural as well and do turn out to be proper. Three common strictly proper global rules include:

Brier Score. $BS(b, w) = \frac{1}{|\mathcal{F}|} \sum_{X \in \mathcal{F}} (w(X) - b(X))^2.$

Log Score. $Log(b, w) = -\frac{1}{|\mathcal{F}|} \sum_{X \in \mathcal{F}} \ln\big(1 - |w(X) - b(X)|\big).$

Spherical Score. $Sph(b, w) = \frac{1}{|\mathcal{F}|} \sum_{X \in \mathcal{F}} \left( 1 - \frac{1 - |w(X) - b(X)|}{(b(X)^2 + (1 - b(X))^2)^{1/2}} \right).$

Each of these rules is additive. That is, each is simply the (normalized) sum of a local strictly proper rule:

Local Brier. $BS(x, i) = (i - x)^2.$

Local Log. $Log(x, i) = -\ln(|(1 - i) - x|).$

Local Spherical. $Sph(x, i) = 1 - |1 - i - x| / (x^2 + (1 - x)^2)^{1/2}.$

Later on, we primarily focus on local rules and then see how they relate to additive global rules.[11]

[11] The Spherical Score looks odd at first, but it is more natural when understood geometrically. For a given credence function $c$, proposition $X$, and world $w$, let $\|c_X\|$ be the length of the vector $c_X = \langle c(X), 1 - c(X) \rangle$. Let $\theta_{X,w}$ be the angle between $c_X$ and $\langle w(X), w(\neg X) \rangle$. The local spherical score of a credence $c(X)$ is then determined by the length of the vector $c_X$ and the angle between $c_X$ and the actual truth-value of $X$ at $w$, via $\|c_X\| \cos \theta_{X,w}$. For a more thorough discussion, see Jose (2007).

So, despite its theoretical importance, propriety itself is in need of some additional explanation. We provide one below: when we walk through Schervish's theorem, we will gain a new understanding of what makes proper scoring rules so special. Our solution will not satisfy the austere scruples of those who want inaccuracy to be a purely epistemic notion with no appeal to pragmatic considerations, but it will instead explain why pragmatically rational agents use proper scoring rules to determine the value of their own credences.
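Before turning to the choice among these rules, here is a numerical confirmation (my own sketch, not part of the paper) that all three local rules just listed are strictly proper in the sense of Definition 7: for each true probability $y$, the expected score is uniquely minimized at the truthful report $x = y$.

```python
# Sketch (mine): the three local rules above, with a grid check of strict
# propriety -- y*g1(x) + (1-y)*g0(x) is minimized at x = y for each y.
import numpy as np

rules = {
    "brier":     (lambda x: (1 - x) ** 2, lambda x: x ** 2),
    "log":       (lambda x: -np.log(x),   lambda x: -np.log(1 - x)),
    "spherical": (lambda x: 1 - x / np.sqrt(x**2 + (1 - x)**2),
                  lambda x: 1 - (1 - x) / np.sqrt(x**2 + (1 - x)**2)),
}

xs = np.linspace(0.001, 0.999, 999)   # avoid the Log Score's infinities
for name, (g1, g0) in rules.items():
    ok = True
    for y in np.linspace(0.05, 0.95, 19):
        exp_loss = y * g1(xs) + (1 - y) * g0(xs)
        ok &= abs(xs[np.argmin(exp_loss)] - y) < 2e-3
    print(name, "expected loss minimized at x = y:", bool(ok))  # True
```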
2.2. Which Scoring Rule to Use? A second question is which scoring rule serves as the best measure of inaccuracy in a given context. Even if we insist on strict propriety, we have infinitely many rules left to choose from. It is not immediately clear why someone would opt for the Brier or the Log or the Spherical rule.

Some epistemic utility theorists may regard this question as less pressing. As we saw, Joyce establishes an accuracy-dominance argument for probabilism. As long as the scoring rule in question is strictly proper (and meets a few other structural assumptions[12]), all and only probability functions are undominated, so it appears in this case that there is no need to choose any single measure. However, as Bronfman (2009) and Pettigrew (2016a, chap. 5) point out, which functions dominate which others is scoring-rule dependent. In particular, if b is not a probability function, then there may be no probability function that dominates it on every rule that is considered legitimate. So, if an agent adopts b as her credence function but does not adopt any particular measure of inaccuracy, then any probability function c will do worse than b at some world according to some measure. Both authors argue that if we do not choose a single rule with which to measure an agent's inaccuracy, then the normative force of accuracy-dominance arguments for probabilism is undermined.[13]

[12] Namely, as long as the rule is truth-directed, continuous, strictly proper, and additive (i.e., the sum of local scoring rules), the result that all and only probability functions are undominated goes through.

[13] I harbor doubts as to whether this objection is actually successful, but I mention it here to note that epistemic utility theorists themselves consider this issue an important problem. It is worth acknowledging as well that the class of admissible measures of inaccuracy need not be narrowed all the way down to a singleton to avoid the Bronfman objection.

In response, one may be a subjectivist and claim that the scoring rule merely reflects an agent's subjective epistemic values, just as in practical contexts rational agents may adopt alternative credence functions.[14] One may also be an objectivist and claim that a single rule is correct.[15]

[14] Joyce (2009) at least leans in this direction.

[15] This position is perhaps the most popular among epistemic utility theorists, with the Brier usually being the rule of choice (Rosenkrantz 1981; Leitgeb and Pettigrew 2010a; Pettigrew 2016a).

Schervish's theorem will enable us to provide a new kind of answer. An agent's scoring rule will not reflect her epistemic values, but instead it will represent the kinds of decision problems she expects to face. In full generality, any proper scoring rule could be correct in a given context. Furthermore, an agent's global scoring rule will usually be built out of different local scoring rules for different propositions.

3. The Pragmatic Evaluation of Credences. Let us now put aside this notion of pure epistemic utility unsullied by practical value and return to the world of hard-nosed pragmatism. We wish now to understand how a practically rational agent will evaluate her own doxastic state and possible alternative doxastic states. For instance, we will try to determine how much expected utility an agent assigns her credence of .6 that it will rain. Our task is a bit easier than the epistemic utility theorist's, as we already have some understanding of practical rationality.
We will assume that practically rational agents are expected utility maximizers.[16] In particular, they have probabilistically coherent credence functions, since otherwise they would be subject to Dutch books. This starting point will give us some initial traction.

[16] In particular, I assume that an agent's doxastic state is (or is representable by) a unique probability function and that she has a utility function that is unique up to positive affine transformation. Both of these idealizations are necessary for Schervish's theorem to generate a unique scoring rule. An important question that I will not explore here is what happens when these assumptions are relaxed.

We make a few additional assumptions. First, unlike in causal or evidential decision theory, we will only look at situations in which acts and states are independent. That is, in the situations we consider below, whether an agent performs an action has no bearing on whether an event of interest occurs. For instance, whether you bring an umbrella does not by itself (at least normally) affect your credence that it will rain. Because the actions do not affect outcomes, we will often refer to actions as 'bets'. This may seem unduly restrictive, but we are interested in evaluating credences in propositions, not credences conditionalized on or imaged on the performance of an action. Your credence that you will get a promotion is different from your credence that you will get a promotion supposing you bribe your boss, and the two in turn have different values.

Second, we will assume that credences and states are independent. That is, the probability of events of interest does not depend on an agent's credences. For example, the chance a coin will land heads will not be affected by your belief that the coin will land heads. Third, we assume that the value of an outcome is not itself affected by an agent's credence. In other words, agents do not themselves assign direct value to the beliefs they hold. For example, we will not try to account for the utility you gain from your high credence that your colleagues are fond of you. These last two assumptions are for the sake of simplification.

3.1. The Practical Value of a Credence. With this background out of the way, let us now see how an agent may evaluate her own credence in terms of expected practical value. Let R be the proposition that it will rain today. There are a number of different bets on R that an agent, let us call her Alice, might take that will affect her utility. Suppose the possible actions are bringing an umbrella (u), wearing a raincoat (w), or staying home (s).

Suppose we want to know whether Alice will bring an umbrella. Given that she is an expected utility maximizer, she will do so only if

$$EU(u) \ge \max\big(EU(w), EU(s)\big) = EU(\neg u),$$

where EU is Alice's expected utility function. In other words, we can see whether she will bring an umbrella by looking at her decision between two actions, bringing an umbrella or not bringing an umbrella, even though the action space itself is more fine-grained. This will allow us to treat each of Alice's decision problems as if there were only two options from now on: whether to u or ¬u.[17]

[17] An important issue in decision theory is the relationship between small- and grand-world decision problems. In small-world problems, an agent does not partition the space of outcomes, states, and acts maximally finely. In the current (small-world) decision problem, for instance, Alice does not distinguish between outcomes in which her umbrella breaks and outcomes in which her umbrella remains intact, even though those clearly result in different rewards. Ideally, an agent would always deliberate using a maximally fine-grained partition (if such there be), but that requirement is so unrealistic that it would render decision theory of little guiding value. I agree, then, with Joyce (1999) that when an agent deliberates using a small-world partition and selects action a from that set of actions, she is committed to the view that her "fully considered beliefs and desires would sanction the choice of a from among the alternatives listed" (74). In other words, "we can think of a rational agent's attitudes toward the states, outcomes, and acts in a small-world decision problem as her best estimates of the attitudes that she would hold regarding those states, outcomes, and acts in the grand-world context" (75, emphasis mine).

Imagine Alice's payoff matrix is as given in table 1, which shows how much utility Alice gets depending on whether she brings an umbrella when it rains or does not rain.

TABLE 1. PAYOFF MATRIX

        R     ¬R
u      -1    -2
¬u     -4     0
By a simple calculation of Alice's expected utility, we can determine how high her credence x in R must be before she decides to bring an umbrella:

$$EU(u) = x(-1) + (1 - x)(-2) = x - 2, \quad (1)$$

$$EU(\neg u) = x(-4) + (1 - x)(0) = -4x. \quad (2)$$

We then have (1) > (2) if and only if x > 2/5. So, Alice will bring an umbrella if x > 2/5 and not bring an umbrella if x < 2/5. For ease, we will conventionally decide that Alice will bring the umbrella if and only if x > 2/5. All that matters for how much utility Alice ends up getting is (i) whether R and (ii) whether her credence is greater than 2/5.

3.2. Reformulation. We need to reformulate this problem to suit our aims of constructing proper scoring rules. Since scoring rules are loss functions, we will describe Alice as an expected loss minimizer instead of as an expected utility maximizer. This choice is purely conventional. Table 2 reexpresses table 1 in terms of losses instead of gains.

TABLE 2. LOSS MATRIX

        R     ¬R
u       1     2
¬u      4     0

Notice that if it rains, Alice is sure to incur a loss of at least 1 no matter what she does. As far as her decision is concerned, this loss is irrelevant, since it is merely a result of the state of the world. So, we can normalize table 2 by subtracting the minimum loss that is sure to result at each state. We then arrive at table 3.

TABLE 3. LOSS MATRIX, FIRST NORMALIZATION

        R            ¬R
u       0 = 1 - 1    2 = 2 - 0
¬u      3 = 4 - 1    0 = 0 - 0

Finally, we rewrite table 3 as table 4 by dividing the loss in each cell by the sum of the losses across the cells. In this case, the sum is 5, since Alice will lose 2 if she performs u and ¬R obtains and will lose 3 if she performs ¬u and R obtains.

TABLE 4. LOSS MATRIX, SECOND NORMALIZATION

        R             ¬R
u       0             (2/5) · 5
¬u      (3/5) · 5     0

When considering this problem in isolation, we can forget about the sum of the losses. It represents the "stakes" of the problem, but it will not affect Alice's decision. So, for now, we can rewrite the payoff matrix as table 5, which captures this single decision problem conspicuously.

TABLE 5. LOSS MATRIX, STAKE-FREE VERSION

        R      ¬R
u       0      2/5
¬u      3/5    0

As before, all that matters for how much (dis)utility Alice ends up getting is (i) whether R and (ii) whether her credence is greater than 2/5.
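The arithmetic of tables 1 through 5 compresses into a few lines. The sketch below (mine, not the paper's) reproduces the normalization pipeline and the 2/5 cutoff.

```python
# Sketch (mine): the normalization pipeline from tables 1-5.  Rows are
# acts (u, not-u); columns are states (R, not-R).
import numpy as np

payoff = np.array([[-1.0, -2.0],
                   [-4.0,  0.0]])     # table 1

loss = -payoff                        # table 2: losses instead of gains
loss -= loss.min(axis=0)              # table 3: drop each state's
                                      #   unavoidable minimum loss
stakes = loss.sum()                   # W = 5, the "stakes"
stake_free = loss / stakes            # table 5: the q = 2/5 problem
print(stake_free)                     # [[0.  0.4]
                                      #  [0.6 0. ]]

# The cutoff from equations (1) and (2): bring the umbrella iff x > 2/5.
x = np.linspace(0.05, 0.95, 10)       # sample points away from the cutoff
EU_u, EU_not_u = x - 2, -4 * x
print(np.all((EU_u > EU_not_u) == (x > 2/5)))   # True
```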
3.3. Scoring Rules and the Problem of the Umbrella. We can now design a scoring rule that will track how much of a loss (under our normalization) a credence of x in R will bring Alice. That is, given that her credence is currently x, we determine how much she expects to lose from her bet on rain.

First, let us determine how much she would lose if R and if ¬R, respectively. Per table 5, if R and x ≤ 2/5, Alice will lose 3/5, since she will not bring an umbrella. If x > 2/5 and ¬R, she will lose 2/5, since she will bring an umbrella. Otherwise, she loses nothing. Consider $G = (g_1, g_0)$, where

$$g_1(x) = \begin{cases} 3/5 & \text{if } x \le 2/5 \\ 0 & \text{if } x > 2/5, \end{cases} \qquad g_0(x) = \begin{cases} 0 & \text{if } x \le 2/5 \\ 2/5 & \text{if } x > 2/5. \end{cases}$$

Given Alice's credence and R's truth-value, G returns the amount Alice will lose.

Now suppose Alice wishes to evaluate a credence of y given her credence x. That is, she wants to determine how much she would expect to lose if she had decided whether to bring an umbrella on the basis of a credence of y in R. We then have

$$E_x(G(y)) = x\,g_1(y) + (1 - x)\,g_0(y) = \begin{cases} x \cdot 3/5 & \text{if } y \le 2/5 \\ (1 - x) \cdot 2/5 & \text{if } y > 2/5, \end{cases}$$

where $E_x(G(y))$ is minimized exactly when x, y ≤ 2/5 or x, y > 2/5. So, G is a merely proper scoring rule.

3.4. The General Case. More schematically, we can represent the agent as choosing between two options, $d_1$ and $d_0$, where $d_1$ is better to perform if P and $d_0$ is better to perform if ¬P. That is,

$$L(d_1, P) \le L(d_0, P), \qquad L(d_1, \neg P) \ge L(d_0, \neg P).$$

We again normalize losses in the same way we did in table 3 by setting

$$L(d_1, P) = L(d_0, \neg P) = 0.$$

Then, in accord with table 4, we can now construct a loss matrix as expressed in table 6, where $q \in [0,1]$ and $W \in (0, \infty]$.

TABLE 6. LOSS MATRIX, GENERAL CASE

        P              ¬P
d1      0              q · W
d0      (1 - q) · W    0

Alice's cutoff for performing $d_1$ is represented by q. That is, Alice will perform $d_1$ if and only if her credence in P exceeds q. Otherwise, she will perform $d_0$. The weight or stakes of the problem is represented by W. Again, we ignore W for now and focus only on the cutoff points for deciding whether to $d_1$.

Definition 8. A q-problem with respect to P is a two-decision problem such that $L(d_1, \neg P) = W \cdot q$.

For any particular q, Alice sees no difference in expected value between two forecasts on the same side of q, since those forecasts will lead to the exact same action. In this more general case, we can determine Alice's valuation of a credence x with the merely proper scoring rule $G = (g_1, g_0)$:

$$g_1(x) = \begin{cases} 1 - q & \text{if } x \le q \\ 0 & \text{if } x > q, \end{cases} \qquad g_0(x) = \begin{cases} 0 & \text{if } x \le q \\ q & \text{if } x > q. \end{cases}$$
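Here is a minimal implementation (mine) of the q-problem rule just defined, with q = 2/5 as in the umbrella problem. It exhibits the mere propriety described above: forecasts on the same side of q as Alice's credence are scored alike, and forecasts on the far side do worse.

```python
# Sketch (mine): the scoring rule generated by a single q-problem
# (section 3.4), and a check that it is merely proper.
def q_problem_rule(q):
    g1 = lambda x: (1 - q) if x <= q else 0.0   # loss given P
    g0 = lambda x: 0.0 if x <= q else q         # loss given not-P
    return g1, g0

q = 2 / 5                   # the umbrella problem
g1, g0 = q_problem_rule(q)

def exp_loss(x, y):         # E_x(G(y))
    return x * g1(y) + (1 - x) * g0(y)

x = 0.3                     # Alice's credence, on the same side of q as 0.1
print(exp_loss(x, 0.3), exp_loss(x, 0.1))   # equal: 0.18 and 0.18
print(exp_loss(x, 0.7))                     # worse: 0.28
```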
3.5. Uncertainty about the Bet. We are often uncertain what bets we are actually going to face in the future. Supposing you are going to bet on P, you may still be uncertain whether you will face a q- or q′-problem. For instance, Alice may know she will be offered some bet that will return $1 if it rains and $0 otherwise but not yet know what price the bookie will offer her.

We will handle the more general case in a moment, but for now assume that there is some finite set $Q \subset [0,1]$ such that Alice has credence 1 that she will face some q-problem, where $q \in Q$. We will treat Q as a random variable representing the q-value of the decision problem Alice faces and use $\Pr(q')$ as an abbreviation for $\Pr(Q = q')$. So, if Alice is uncertain about the value of Q, what expected loss does she assign her credence x in P? Ignoring the stakes, we find

$$h_1(x) := EL(x \mid P) = \sum_{\substack{q \in Q \\ x \le q}} (1 - q) \cdot \Pr(q), \quad (3)$$

$$h_0(x) := EL(x \mid \neg P) = \sum_{\substack{q \in Q \\ q < x}} q \cdot \Pr(q), \quad (4)$$

where $h_1$ represents Alice's expected loss of having credence x conditional on P, and $h_0$ represents Alice's expected loss of having credence x conditional on ¬P. That is, $h_i$ represents Alice's expected loss given the truth-value of P but with the q-value of the bet still undetermined.

We calculated $h_1$ as follows: if P is the case and Alice's credence x turns out to be less than or equal to Q, she will lose 1 − Q. If her credence turns out to be greater than Q, she will not lose anything. So, $h_1$ just is the sum of 1 − q discounted by Alice's credence Pr(Q = q) for every q ≥ x. The formula for $h_0$ is determined similarly.

We can then determine Alice's unconditional expected loss of credence x in P by

$$EL_x(x) = x \cdot h_1(x) + (1 - x)\,h_0(x),$$

where $EL_x(x)$ is the loss Alice currently expects to suffer from her credence of x before she learns whether P is true or false. More generally, we can calculate Alice's unconditional expected loss of an alternative forecast y in P by

$$EL_x(y) = x \cdot h_1(y) + (1 - x)\,h_0(y),$$

where $EL_x(y)$ is the expected loss Alice assigns to using another forecast instead of her own, but with her utility function held fixed. It is easy to check that $H = (h_1, h_0)$ is in fact a merely proper scoring rule.

3.6. Letting the Stakes Count. As we noticed, two q-problems can have very different stakes. The same hand in blackjack matters a lot more at the $1,000 table than at the $1 table, even though the probabilities remain the same. Furthermore, an agent may expect that if she faces a q-problem, it will present her with stakes different from a q′-problem. Suppose, for instance, she is 50% confident she will face the low-stakes problem in the top panel of table 7 and 50% confident she will face the high-stakes problem in the bottom panel of table 7.

TABLE 7. q-PROBLEMS WITH DIFFERENT STAKES

Low stakes:
         P             ¬P
d1       0             1/2
d0       1/2           0

High stakes:
         P             ¬P
d1*      0             (2/3) · 15
d0*      (1/3) · 15    0

Because the latter problem has much higher stakes, having her credence on the right side of 2/3 is significantly more important than having her credence on the right side of 1/2.

In general, for any given q, the expected value of the stakes W can vary. Let $E(W \mid q)$ be the expected value of W given that Alice faces a q-problem. Letting $g_1$ represent the expected loss of credence x conditional on P and $g_0$ be the expected loss of credence x conditional on ¬P, we have

$$g_1(x) = \sum_{\substack{q \in Q \\ x \le q}} (1 - q) \cdot E(W \mid q) \cdot \Pr(q), \quad (5)$$

$$g_0(x) = \sum_{\substack{q \in Q \\ q < x}} q \cdot E(W \mid q) \cdot \Pr(q). \quad (6)$$

The overall expected loss Alice assigns forecast y if her credence is x is then

$$EL_x(y) = x \cdot g_1(y) + (1 - x) \cdot g_0(y). \quad (7)$$

Again, G is a merely proper scoring rule. Let $Q = \{q_1, \ldots, q_n\}$. If Alice's credence is x where $q_i < x \le q_{i+1}$, then she assigns any y in the same interval the same expected loss she assigns herself. Otherwise, she assigns y a greater expected loss.
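The sketch below (mine, not the paper's) implements equations (5)-(7) for the two problems of table 7 and displays the step structure just described: forecasts between the two cutoffs are scored alike, and crossing the high-stakes cutoff costs far more than crossing the low-stakes one.

```python
# Sketch (mine): equations (5)-(7) for finitely many anticipated
# q-problems.  Each entry maps q to (Pr(q), E[W | q]); the values below
# encode the two problems of table 7.
problems = {1/2: (0.5, 1.0),     # low stakes
            2/3: (0.5, 15.0)}    # high stakes

def g1(x):   # equation (5): expected loss conditional on P
    return sum((1 - q) * w * p for q, (p, w) in problems.items() if x <= q)

def g0(x):   # equation (6): expected loss conditional on not-P
    return sum(q * w * p for q, (p, w) in problems.items() if q < x)

def EL(x, y):  # equation (7): the expected loss x assigns to forecast y
    return x * g1(y) + (1 - x) * g0(y)

# 0.55 and 0.60 lie between the cutoffs and are scored alike; 0.40 crosses
# the low-stakes cutoff at small cost; 0.70 crosses the high-stakes cutoff
# and does much worse.
for y in (0.40, 0.55, 0.60, 0.70):
    print(y, round(EL(0.55, y), 4))   # 1.5125, 1.4875, 1.4875, 2.3625
```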
To determine G, what matters then is both how high the stakes are expected to be given that a q-problem is faced and how likely Alice thinks it is that she will face a q-problem. That is, what matters is the quantity

$$M(q) := E(W \mid q) \cdot \Pr(q).$$

We can then reformulate equations (5) and (6) above as

$$g_1(x) = \sum_{\substack{q \in Q \\ x \le q}} (1 - q) \cdot M(q), \quad (8)$$

$$g_0(x) = \sum_{\substack{q \in Q \\ q < x}} q \cdot M(q), \quad (9)$$

where M(q) measures how important Alice thinks q-problems are to get right. Note that (i) M(q) ≥ 0 and (ii) M(q) = 0 if and only if Pr(q) = 0. Neither of these facts is surprising: it is never good to get a q-problem wrong, and if you think you might face a q-problem, then getting it right matters at least a little bit.

3.7. The Continuous Case. So far, we have assumed that q is in some finite set $Q \subset [0,1]$ to keep things discrete. However, Alice could potentially face a q-problem for any $q \in [0,1]$. The main difficulty is that once we make this generalization, the probability of any particular q-problem is 0.[18]

[18] With the exception of at most countably many values of Q.

The natural way to handle this problem is to trade out M(q) for a function m(q) that measures the probability density of facing a q-bet factored by the expected stakes of that bet. That is,

$$m(q) := p(q)\,E(W \mid q),$$

where p(q) is the probability density Alice assigns to the claim that she will face a q-problem. We call such an m Alice's support function. The support function measures how much relative importance is assigned to each possible value of Q.[19]

[19] For theories that allow $E(W \mid q)$ to be defined even when $\Pr(q) = 0$, see Rényi (1955) and Popper (1959).

To get a handle on m, note that (i) m(q) ≥ 0 and (ii) m is constantly 0 over some region [a, b] just in case Alice is certain that she will not face a q-problem for any $q \in [a, b]$. As with M, the first condition reflects the fact that it is never good to get a q-problem wrong. The second condition means that if Alice thinks it is possible she will face a q-problem for q in some region, then getting such a problem right matters at least a little bit.

By switching out M for m and the sums for integrals in equations (8) and (9), we can generate scoring rules when $Q = [0,1]$. With some reformulation and loss of generality, Schervish's theorem is then:

Theorem 1 (Schervish). Let m(q) be a support function, and let

$$g_1(x) = \int_x^1 (1 - q)\,m(q)\,dq, \qquad g_0(x) = \int_0^x q\,m(q)\,dq.$$

Then $G = (g_1, g_0)$ is a proper scoring rule. If m is strictly positive almost everywhere, then G is a strictly proper scoring rule.

With some generalization, this method gets us every proper and strictly proper scoring rule, aside from uninteresting ones that, for example, assign every region infinite importance.

Note that the condition that m be positive almost everywhere is a kind of regularity condition. Alice will set m(q) = 0 if and only if she is certain she will not face a q-problem for that q-value. So, if she leaves open the possibility (however remote) of facing any q-problem whatever, then she will use a strictly proper scoring rule to evaluate her credence.
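Theorem 1 is easy to check numerically. The sketch below (mine) builds $(g_1, g_0)$ from a support function by quadrature and confirms that a flat support function recovers the local Brier Score, anticipating the first example of the next section.

```python
# Sketch (mine): Theorem 1 by numerical integration.  With a flat support
# function m(q) = c, the construction returns the Brier Score times c/2.
from scipy.integrate import quad

def schervish_rule(m):
    g1 = lambda x: quad(lambda q: (1 - q) * m(q), x, 1)[0]
    g0 = lambda x: quad(lambda q: q * m(q), 0, x)[0]
    return g1, g0

c = 2.0
g1, g0 = schervish_rule(lambda q: c)      # constant support function
for x in (0.2, 0.5, 0.9):
    # compare with (c/2)*(1-x)^2 and (c/2)*x^2, i.e., the local Brier Score
    print(round(g1(x), 6), round((c / 2) * (1 - x) ** 2, 6))
    print(round(g0(x), 6), round((c / 2) * x ** 2, 6))
```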
3.8. Examples. Above, we defined the local versions of the Brier Score, Log Score, and Spherical Score. Let us see how each of these encodes very different expectations about the bets Alice may face.

If Alice assigns constant weight to every point, we have

$$m(q) = c,$$

which generates the Brier Score (times c/2).

If Alice cares about points in [0,1] closer to 0 or 1 more than she cares about other points, she might set

$$m(q) = \frac{1}{q(1 - q)},$$

which generates the Log Score.

If Alice weights points near .5 more than she weights other points, she may set

$$m(q) = \frac{1}{(2q^2 - 2q + 1)^{3/2}},$$

which generates the Spherical Score.

Figure 1 provides a visual representation of each of these scoring rules along with their corresponding support functions.

[Figure 1. Brier, Log, and Spherical Scores along with their support functions. Left panels: ascending curves represent g0(x) and descending curves represent g1(x) for the respective scoring rules.]

The Brier Score is the most egalitarian of all scoring rules in terms of the decisions the user expects to make. Imagine you knew you were going to be offered a bet on X that paid $2 if X and $0 otherwise. The price of the bet will be chosen at random (i.e., by the uniform distribution on [0,2]). In this case, the Brier Score is the right scoring rule. The stakes of the bet are constant; that is, $E(W \mid q) = 2$ for all q. Furthermore, since the value of Q follows a uniform distribution, p(q) = 1 for all q. So, m(q) = 2.

The Logarithmic Score is approximately right when, in expectation, bets will be concentrated near the end points of the unit interval and when the stakes are high near those points. The Spherical Rule is best when the success of your decisions will likely depend on correct unbiased binary classification, that is, guessing whether X or ¬X is true depending on whether your credence in X is greater or less than .5.

In real life, of course, our views over the bets we will face on propositions are a lot messier. Suppose I am wondering whether my house will burn down in the next year. I do not yet know exactly which bet I will face on that proposition, but in expectation a credence of .01 is very different from a credence of .0001. In the former case, I will likely pay quite a bit for an insurance policy. So, relatively close to 0, my loss function will behave a lot like the Log Rule. However, a credence of 10⁻⁶ and a credence of 10⁻⁷ in this same proposition are roughly equivalent as far as my real-world success is concerned. Despite the value of my house, if my credence that it will burn down is low enough, I am effectively morally certain that it will not burn down. I assign, at some point, negligible practical weight to the possibility that I will wrongly bet that it will remain intact for the next year. So the Log Rule will not be a perfect fit.

Similarly, the Spherical Score is a good approximation of the right score for my credence in whether it will rain. Whether I make the right decision about bringing an umbrella, buying baseball tickets in advance, or canceling my picnic hinges on whether my credence is above or below (approximately) .5. I doubt any decision I make will depend on whether my credence is on the correct side of .05 or .95. At that point, I am effectively certain that it will not (will) rain as far as my decision making goes. Therefore, the correct support function will place more weight on middling regions of the unit interval and less weight on extreme regions.
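One can also run the construction in reverse. Differentiating the integrals in Theorem 1 gives $g_0'(x) = x \cdot m(x)$ and $g_1'(x) = -(1-x) \cdot m(x)$, so a smooth rule's support function can be read off as $m(x) = g_0'(x)/x$. The sketch below (mine) confirms by finite differences that the three support functions just listed match the three local rules.

```python
# Sketch (mine): differentiating Theorem 1's integrals gives
# g0'(x) = x * m(x), so m(x) = g0'(x) / x for a smooth rule.
import numpy as np

g0 = {
    "brier":     lambda x: x ** 2,                        # m(q) = 2
    "log":       lambda x: -np.log(1 - x),                # m(q) = 1/(q(1-q))
    "spherical": lambda x: 1 - (1 - x) / np.sqrt(x**2 + (1 - x)**2),
}                                                         # m(q) = (2q^2-2q+1)^(-3/2)
m = {
    "brier":     lambda q: 2.0,
    "log":       lambda q: 1 / (q * (1 - q)),
    "spherical": lambda q: (2 * q**2 - 2 * q + 1) ** -1.5,
}

eps = 1e-6
for name in g0:
    for x in (0.2, 0.5, 0.8):
        deriv = (g0[name](x + eps) - g0[name](x - eps)) / (2 * eps)
        assert abs(deriv / x - m[name](x)) < 1e-4, (name, x)
print("g0'(x)/x matches the stated support functions")
```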
3.9. Global Scoring Rules. Let us now briefly turn our attention to scoring global credence functions. The idea is the same: the expected score of an entire credence function is the expected loss an agent would incur from using that credence function to make bets. The score of an individual credence x in a proposition X at a world is the expected loss. That is, the score measures the expected loss given the actual truth-value of X before it is known exactly which bet on X the agent will face. Likewise, when we score an entire credence function b at a world, we plug in the actual truth-values of the propositions in $\mathcal{F}$, but we do not yet plug in the actual bets the agent faces. Indeed, she may not face a bet on some propositions at all.

The easiest way to determine this global score is simply to identify it with the (normalized) sum of the local scores:

$$G(b, w) = a \cdot \sum_{X \in \mathcal{F}} G_X\big(b(X), w(X)\big),$$

where a is positive and $G_X$ is the scoring rule used for proposition X. In general, as we have seen, the scoring rule for the individual propositions will vary heavily, depending on the expected bets the agent may face. Furthermore, because we are much more likely to face bets on some propositions than on others, and because the stakes of those bets will vary heavily, some propositions will count much more toward b's overall score at w. For instance, it is much more likely that we will face a bet on whether it will rain than on whether there are an odd number of grains of sand in the Sahara. So, G(b, w) will give extra weight to the former proposition.

For this to work (i.e., for us to be able to simply add up a bunch of local scores), we need a way of localizing bets. That is, given your entire credence function, we need a way to determine which single proposition your decision to perform act u or ¬u is a bet on. You might, for instance, want to bring an umbrella if your credence in rain is high enough. But you may also decide to bring an umbrella if your credence that it will snow is high. So, your decision is not really a bet on rain, nor is it a bet on snow; instead, it is a bet on (rain or snow).

To do so, the best method seems to be to partition the set of states of the world into those in which you would rather perform u and those in which you would rather perform ¬u. With your utility assignments fixed, the proposition u is taken to be a bet on the disjunction of all the states in which performing u is in expectation better than performing ¬u.[20] In this way, Schervish's method can be used to generate an additive scoring rule that represents the expected loss of an agent's entire doxastic state.

[20] As a referee points out, this introduces another important grand-world/small-world problem. To see this, suppose Alice would rather bring an umbrella if she learned it was going to rain but would want to leave it home otherwise. That is, on the partition {R, ¬R}, Alice considers u a bet on R. However, suppose that if it will rain only very lightly (L), she would still rather leave her umbrella at home. So, when she considers {R & L, R & ¬L, ¬R}, she takes u to be a bet on just R & ¬L. Even if all of Alice's preferences are ultimately partition invariant, then, what actions are a bet on which states of the world will vary depending on how the set of states is carved up. In turn, Schervish's method will yield different results depending on this partition. I do not have a general answer to the question of how finely agents need to partition the set of states of the world for the representative scoring rules to reflect their expected losses from their credence function in the best way. However, I lean toward a subjectivist solution: your global score should reflect the finest partition that is cognitively available.
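A schematic sketch (mine; the propositions, rules, and weights are invented purely for illustration) of the additive global score just defined, with a different local rule and importance weight per proposition:

```python
# Sketch (mine): an additive global score built from different local rules
# for different propositions, weighted by how much each proposition's bets
# are expected to matter.
import numpy as np

brier = (lambda x: (1 - x) ** 2, lambda x: x ** 2)
log_r = (lambda x: -np.log(x),   lambda x: -np.log(1 - x))

# Per-proposition local rules: rain gets a much larger weight than the
# grains-of-sand proposition (10.0 vs. 0.1 are illustrative numbers).
local_rules = {"rain": (brier, 10.0), "odd_sand": (log_r, 0.1)}

def global_score(b, w, a=1.0):
    total = 0.0
    for X, ((g1, g0), weight) in local_rules.items():
        total += weight * (g1(b[X]) if w[X] else g0(b[X]))
    return a * total

b = {"rain": 0.7, "odd_sand": 0.5}      # credence function
w = {"rain": True, "odd_sand": False}   # actual world
print(global_score(b, w))               # the rain term dominates the total
```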
4. The Value of Information. One useful application of scoring rules is that they allow us to quantify the value of information for various propositions. We will now see how this may work in some detail.

4.1. Experiments. Suppose Alice is a scientist interested in whether X. She decides to spend some resources to perform an experiment that will provide her with new evidence. She cannot perform every potential experiment, so she will have to make some choices on the basis of the expected informational value of the results. How might she go about deciding?

Following Greaves and Wallace (2006), we call a partition E of $\Omega$ an experiment if the agent will learn that some element of that partition is true. We say that an experiment E is performed once the agent learns which element of E in fact obtains.

For example, suppose Alice is wondering whether her favorite team will win tonight. She knows that if Jake is pitching, there is a .75 chance they will win. Otherwise, there is a .25 chance. If she looks at the team's website, she can find out who is pitching. So, in this case, there are four relevant states, $\{W \wedge P,\ W \wedge \neg P,\ \neg W \wedge P,\ \neg W \wedge \neg P\}$, where W stands for the proposition that her team wins, and P stands for the proposition that Jake is pitching. She can perform the experiment E of finding out who is pitching and thereby partition $\Omega$ into $P = \{W \wedge P, \neg W \wedge P\}$ and $\neg P = \{W \wedge \neg P, \neg W \wedge \neg P\}$.

If Alice performs E, she will learn either P or ¬P. So, Alice's posterior credence in W, denoted $b_E(W)$, will either be $b(W \mid P) = .75$ or $b(W \mid \neg P) = .25$, although she does not yet know which. Suppose Alice's current credence in W is .5. How valuable is the experiment E to her in expectation?

Before performing E, Alice assigns her credence an expected loss of $EG(.5) = .5\,(g_1(.5) + g_0(.5))$. If she learns P, she will either get a score of $g_1(.75)$ or $g_0(.75)$. So, her expected score given that she learns P is

$$b(W \mid P)\,g_1(.75) + b(\neg W \mid P)\,g_0(.75) = .75\,g_1(.75) + .25\,g_0(.75).$$

Likewise, her expected score given that she learns ¬P is $.25\,g_1(.25) + .75\,g_0(.25)$. Given the constraints we have established, she must also have a .5 credence in P. So, the expected score she currently assigns to her as-yet-unknown posterior credence after performing E is

$$EG(b_E(W)) = .375\,\big(g_1(.75) + g_0(.25)\big) + .125\,\big(g_0(.75) + g_1(.25)\big). \quad (10)$$

Greaves and Wallace show that updating by conditionalization is the policy that minimizes expected loss for all proper scoring rules. So, (10) is less than or equal to Alice's current score of EG(.5), and the inequality is strict if G is strictly proper. From a practical point of view, this result follows from Good (1967), who shows that in any decision context, free information never has negative value in expectation.[21] Indeed, we can quantify the expected value of performing E as the difference between Alice's current expected loss and her expected posterior loss, that is, as $\mathrm{Val}(E) = EG(.5) - EG(b_E(W))$. If we identify utility with dollars, Alice should pay up to $\mathrm{Val}(E)$ dollars to learn the result of experiment E.[22]

[21] See Myrvold (2012) for an in-depth discussion of Good's theorem and scoring rules.

[22] For simplicity, we restrict attention here just to the gain in value with respect to the proposition W instead of the whole credence function.
Note that the precise value of an experiment depends on which scoring rule Alice uses for the proposition under investigation. Suppose, for instance, that Alice had the option of performing either E or an alternative experiment E′ that would reveal who won the game with 20% probability and would reveal no relevant information otherwise. That is, if Alice performs E′ instead of E, then there is a 20% chance she will learn either that W or that ¬W for sure and an 80% chance her credence in W will remain at .5. If Alice can only perform one of the two experiments, she will prefer to perform E if she uses the Brier Score, but she will prefer to perform E′ if she uses the Log Score.[23]

[23] From the setup, if Alice performs E′, she has credence .1 she will learn W and credence .1 she will learn ¬W. In either case, she is guaranteed a perfect score of 0, since she will end up with credence 1 (0) in W just if W is true (false). Conditional on learning nothing relevant, she has credence .5 she will receive a score of $g_1(.5)$ and credence .5 she will receive a score of $g_0(.5)$. She assigns credence .8 to learning nothing relevant, so her total expected score should she perform experiment E′ is $EG(b_{E'}(W)) = .4\,(g_1(.5) + g_0(.5))$. According to the Brier Score, she expects to have disutility .1875 after performing E and disutility .2 after performing E′. On the Log Score, she expects disutility of approximately .56 after performing E and disutility of approximately .55 after performing E′.

Put slightly differently, the evidence an agent chooses to gather will depend on her scoring rule, which in turn depends on what sorts of decisions she expects she will make on the basis of her credence in the proposition in question. The decision of which evidence to collect is, as we might call it, practico-epistemic. The evidence itself will determine her credal state, but her process of investigation is a practical one.

Finally, we observe that some experiments are absolutely preferable to others. That is, sometimes Alice will prefer to perform E₁ over E₂ regardless of her scoring rule. For instance, suppose Alice is interested in whether Bob or Carol will win the upcoming election. One polling company asks 5,000 people whom they plan to vote for. A second company asks only 50 (distinct) people. If we stipulate that each company uses a reasonable method of selecting participants, Alice will always prefer to learn the results of the first poll to the results of the second, although she would prefer to learn the results of both polls to the results of either one.

When performing E₁ is preferable to performing E₂ according to every strictly proper rule, E₁ first-order stochastically dominates E₂. That is, regardless of the kinds of bets Alice expects to make, she is in expectation better off if she performs E₁.[24]

[24] See DeGroot and Fienberg (1982, 1983) and Schervish (1989) for more on the value of experiments and first-order dominance.
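The arithmetic in note 23 can be verified directly. The sketch below (mine) computes Alice's expected posterior inaccuracy for E and E′ under the Brier and Log Scores; the tiny epsilons stand in for the certainty she attains when E′ reveals the winner, since the Log Score blows up at credence 0.

```python
# Sketch (mine): reproducing the numbers in note 23.  E reveals the
# pitcher (posterior .75 or .25, equally likely); E' reveals the winner
# with probability .2 and nothing otherwise.
import numpy as np

rules = {"brier": (lambda x: (1 - x) ** 2, lambda x: x ** 2),
         "log":   (lambda x: -np.log(x),   lambda x: -np.log(1 - x))}

def expected_posterior_loss(g1, g0, outcomes):
    # outcomes: list of (branch probability, posterior credence in W)
    return sum(p * (post * g1(post) + (1 - post) * g0(post))
               for p, post in outcomes)

E       = [(0.5, 0.75), (0.5, 0.25)]
E_prime = [(0.1, 1 - 1e-12), (0.1, 1e-12), (0.8, 0.5)]  # epsilons dodge log(0)

for name, (g1, g0) in rules.items():
    print(name,
          round(expected_posterior_loss(g1, g0, E), 4),
          round(expected_posterior_loss(g1, g0, E_prime), 4))
# brier 0.1875 0.2    -> prefer E  under the Brier Score
# log   0.5623 0.5545 -> prefer E' under the Log Score
```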
4.2. Evaluation of Agents. One special kind of experiment we often conduct is to ask other agents what their credence is in some proposition. Instead of going to her favorite team's website, for instance, Alice might ask Bob how confident he is that the team won. Ex ante, Alice does not know what Bob thinks.

We can determine Alice's assessment of the value of asking Bob for his credence the same way we determined the value of other experiments above. Let B be the experiment of asking Bob. Then $\mathrm{Val}(B) = EG(.5) - EG(b_B(W))$. That is, the value of asking Bob is the expected difference in inaccuracy (according to Alice's scoring rule) between her own credence after asking Bob and her current credence.

Note that $EG(b_B(W))$ is not Alice's assessment of Bob's inaccuracy but is instead her assessment of what her own inaccuracy will be after talking to Bob. For instance, suppose Alice knows that Bob always has credence 0 in truths and credence 1 in falsehoods. If Bob tells her he is certain that her team lost, then she will become certain that they won, and vice versa. So, she is ex ante certain that Bob will be perfectly inaccurate and also certain that she herself will be perfectly accurate after talking to him.

More formally, we can capture this distinction between Alice's assessment of her own expected posterior inaccuracy and her assessment of Bob's expected inaccuracy as follows. Let $B = x$ refer to the proposition that Bob's credence in W is x, and let $b_x(W) := b(W \mid B = x)$. We then have[25]

$$EG(b_B(W)) = \sum_x b(B = x)\,\big(b_x(W)\,g_1(b_x(W)) + (1 - b_x(W))\,g_0(b_x(W))\big), \quad (11)$$

$$EG(B) = \sum_x b(B = x)\,\big(b_x(W)\,g_1(x) + (1 - b_x(W))\,g_0(x)\big). \quad (12)$$

[25] More generally, if Bob's credence could take on any value in [0,1], we can straightforwardly replace the sums in both equations (11) and (12) with integrals.

As before, then, the value of the experiment of asking Bob (i.e., eq. [11]) is determined by what Alice expects of the disutility of her own future credence, which of course depends on her expectations about what she will use her credence for. That is why we here look at $G(b_x(W))$, that is, the inaccuracy of Alice's credences after talking to Bob.

Equation (12) measures Alice's assessment of Bob's inaccuracy before she learns what he actually thinks. Practically, her view of Bob's inaccuracy measures how much Alice expects to lose if she were to switch over from her own credences to Bob's to make decisions while retaining her preferences over outcomes. So, if she expects that Bob is more accurate than she is, she would prefer (ex ante) to use his credences rather than hers.[26] Note that if Bob is an epistemic expert for Alice, then equations (11) and (12) coincide. That is, if for any x, $b(W \mid B = x) = x$, then $g_i(b_x(W)) = g_i(x)$.

[26] Because Alice does not yet know what Bob's credences are, G can be strictly proper even though she still expects Bob to be more accurate than she is.

As with experiments, which agents are expected to be more accurate than which others is scoring-rule dependent. For instance, Alice might expect Bob to be more accurate with respect to the Brier Score but less accurate with respect to the Log Score than Carol is.[27] If Alice uses the Brier Score, then she would prefer Bob's credences to her own, given the decisions she actually expects to make.

[27] To see why, imagine that Alice knows Bob has either credence .25 or .75 in W and that Carol has credence 1, 0, or .5. If she treats both agents as experts with respect to W, then, with the right numbers filled in, asking Bob for his credence can be made equivalent to experiment E, and asking Carol can be made equivalent to E′ above.

Alice may also sometimes expect one agent to be more accurate than another regardless of which scoring rule she uses. For example, suppose Alice treats Bob and Carol both as epistemic experts, but she knows that Carol will have either credence .8 or .2 in W, while Bob will have either credence .6 or .4 in W. It is easy to check, by equation (12), that for any G that is strictly proper, Carol is in expectation more accurate than Bob. So, regardless of the bets Alice expects she will make, she thinks she would be better off using Carol's credences than Bob's credences.[28] In other words, one agent is judged absolutely more accurate than another if she is thought to be doing better regardless of the purposes of inquiry. One agent is judged better relative to a particular rule if she is thought to be doing better relative to the particular purposes of inquiry for the proposition(s) in question.

[28] See DeGroot and Fienberg (1982, 1983) and DeGroot and Eriksson (1985) for a detailed characterization of when one credence function is more accurate than another on every strictly proper rule.
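Finally, a sketch (mine) of the Bob-Carol comparison via equation (12), using the expert condition $b_x(W) = x$; with Alice's credence in W at .5, each expert's two possible announcements are equally likely.

```python
# Sketch (mine): equation (12) in the expert case b(W | B=x) = x, applied
# to Carol (.8 or .2) and Bob (.6 or .4), each announcement having
# probability .5 given Alice's credence of .5 in W.
import numpy as np

rules = {"brier": (lambda x: (1 - x) ** 2, lambda x: x ** 2),
         "log":   (lambda x: -np.log(x),   lambda x: -np.log(1 - x))}

def expected_inaccuracy(g1, g0, announcements):
    # equation (12) with b_x(W) = x: sum_x b(B=x)*(x*g1(x) + (1-x)*g0(x))
    return sum(p * (x * g1(x) + (1 - x) * g0(x)) for p, x in announcements)

carol = [(0.5, 0.8), (0.5, 0.2)]
bob   = [(0.5, 0.6), (0.5, 0.4)]

for name, (g1, g0) in rules.items():
    print(name, "Carol:", round(expected_inaccuracy(g1, g0, carol), 3),
                "Bob:",   round(expected_inaccuracy(g1, g0, bob), 3))
# Carol comes out more accurate on both rules (and, per the text, on every
# strictly proper rule).
```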
5. Conclusion. We began with two questions: (1) Which measures of inaccuracy are legitimate? and (2) When should we use a particular measure over another? Schervish's theorem provided an answer to both. Strictly proper scoring rules are the right tools for measuring an agent's inaccuracy when she is broadly uncertain what sorts of bets she might face. The exact nature of her expectation of future decision problems will determine which proper scoring rule in particular is right for her. Although Schervish's theorem identifies inaccuracy with expected practical loss, it retains some claim to being a measure of epistemic utility. After all, it measures the value of a doxastic state and the value of truth.

REFERENCES

Bronfman, A. 2009. "A Gap in Joyce's Argument for Probabilism." Unpublished manuscript, University of Michigan.

D'Agostino, M., and C. Sinigaglia. 2010. "Epistemic Accuracy and Subjective Probability." In EPSA Epistemology and Methodology of Science: Launch of the European Philosophy of Science Association, ed. M. Suárez, M. Dorato, and M. Rédei, 95–105. Dordrecht: Springer.

DeGroot, M. H., and E. Eriksson. 1985. "Probability Forecasting, Stochastic Dominance, and the Lorenz Curve." In Bayesian Statistics 2: Proceedings of the Second Valencia International Meeting, ed. J. Bernardo, M. DeGroot, D. Lindley, and A. Smith, 99–118. Amsterdam: Elsevier.

DeGroot, M. H., and S. E. Fienberg. 1982. "Assessing Probability Assessors: Calibration and Refinement." In Statistical Decision Theory and Related Topics III, vol. 1, ed. S. S. Gupta and J. O. Berger. New York: Academic Press.

———. 1983. "The Comparison and Evaluation of Forecasters." In "Proceedings of the 1982 I.O.S. Annual Conference on Practical Bayesian Statistics," special issue, Statistician 32 (1–2): 12–22.

Gibbard, A. 2007. "Rational Credence and the Value of Truth." In Oxford Studies in Epistemology, vol. 2, ed. T. Gendler and J. Hawthorne, 143–64. Oxford: Oxford University Press.

Good, I. 1967. "On the Principle of Total Evidence." British Journal for the Philosophy of Science 17:319–22.

Greaves, H., and D. Wallace. 2006. "Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility." Mind 115 (632): 607–32.

Jose, V. R. 2007. "A Characterization for the Spherical Scoring Rule." Theory and Decision 66:263–81.

Joyce, J. M. 1998. "A Nonpragmatic Vindication of Probabilism." Philosophy of Science 65:575–603.

———. 1999. The Foundations of Causal Decision Theory. Cambridge Studies in Probability, Induction, and Decision Theory. Cambridge: Cambridge University Press.

———. 2009. "Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief." In Degrees of Belief, vol. 342, ed. F. Huber and C. Schmidt-Petri, 263–97. Dordrecht: Springer.

Konek, J., and B. A. Levinstein. 2017. "The Foundations of Epistemic Decision Theory." Mind, forthcoming.

Leitgeb, H., and R. Pettigrew. 2010a. "An Objective Justification of Bayesianism I: Measuring Inaccuracy." Philosophy of Science 77:201–35.
———. 2010b. "An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy." Philosophy of Science 77:236–72.

Myrvold, W. C. 2012. "Epistemic Value and the Value of Learning." Synthese 187 (2): 547–68.

Pettigrew, R. 2013. "A New Epistemic Utility Argument for the Principal Principle." Episteme 10 (1): 19–35.

———. 2016a. Accuracy and the Laws of Credence. Oxford: Oxford University Press.

———. 2016b. "Accuracy, Risk, and the Principle of Indifference." Philosophy and Phenomenological Research 92 (1): 35–59.

———. 2016c. "Jamesian Epistemology Formalised: An Explication of 'The Will to Believe.'" Episteme 13 (3): 253–68.

Popper, K. 1959. The Logic of Scientific Discovery. New York: Basic.

Rényi, A. 1955. "On a New Axiomatic Theory of Probability." Acta Mathematica Academiae Hungarica 6 (3): 286–335.

Rosenkrantz, R. 1981. Foundations and Applications of Inductive Probability. Atascadero, CA: Ridgeview.

Schervish, M. 1989. "A General Method for Comparing Probability Assessors." Annals of Statistics 17:1856–79.