Confirmation Measures and Sensitivity

November 3, 2014

Abstract

Stevens (1946) draws a useful distinction between ordinal scales, interval scales, and ratio scales. Most recent discussions of confirmation measures have proceeded on the ordinal level of analysis. In this paper, I give a more quantitative analysis. In particular, I show that the requirement that our desired confirmation measure be at least an interval measure naturally yields necessary conditions that jointly entail the log-likelihood measure. Thus I conclude that the log-likelihood measure is the only good candidate for an interval measure.

1 Introduction

Suppose our preferred confirmation measure, c, outputs the numbers c(H1,E) = 0.1, c(H2,E) = 0.2, c(H3,E) = 0.3, and c(H4,E) = 50 for hypotheses H1, H2, H3, and H4, given evidence E. It is natural to want to say that H1 and H2 are confirmed to roughly the same (low) degree by E, and that H4 is confirmed by E to a much higher degree than either H1 or H2. We might also want to say that the difference in confirmation conferred by E on H1 as opposed to H2 is the same as the difference in confirmation conferred by E on H2 as opposed to H3.

If we make any of the preceding assertions, we are implicitly relying on the assumption that it is legitimate to interpret the differences between the numbers output by the measure c. In other words, we are assuming that c is at least an interval measure in the terminology of Stevens (1946). In this paper I will show how the preceding assumption, when properly spelled out, places stringent requirements on c that considerably narrow down the field of potential confirmation measures. In fact, I will show that only the log-likelihood measure meets the requirements. My argument does not, however, establish that the log-likelihood measure is an interval measure, nor that it is the true measure of confirmation; the argument only shows that the log-likelihood measure is the only candidate interval measure. This leaves it open that there is no adequate confirmation measure that is an interval measure.

I start by laying out my background assumptions in Section 2. In Section 3, I make the requirements on c more precise. In Section 4, I show how these requirements entail that c is the log-likelihood measure. In Section 5, I discuss the implications of the argument and consider a couple of objections.

2 Background Assumptions

According to a criterion of confirmation universally agreed upon among Bayesians, E confirms H just in case Pr(H|E) > Pr(H). [Footnote 1: Disconfirmation occurs when the inequality is reversed; when the two sides are equal, we have confirmation neutrality.] Although this criterion suffices to answer the binary question of whether or not E confirms H, it does not answer the quantitative question of whether E confirms H to a high degree, nor does it answer the comparative question of which of two hypotheses is confirmed more by E. [Footnote 2: Carnap (1962) was the first philosopher to draw the distinction between these three questions.] In order to answer either of the latter two types of question, one needs a confirmation measure that quantifies the degree to which E confirms (or disconfirms) H.
The following is a small sample of the measures that have been offered in the literature:

The plain ratio measure: r(H,E) = Pr(H|E) / Pr(H)
The log-ratio measure: lr(H,E) = log r(H,E)
The difference measure: d(H,E) = Pr(H|E) − Pr(H)
The log-likelihood measure: l(H,E) = log[ Pr(E|H) / Pr(E|¬H) ]
The alternative difference measure: s(H,E) = Pr(H|E) − Pr(H|¬E) [Footnote 3: This measure is also sometimes called the "Joyce-Christensen measure," after Joyce (1999) and Christensen (1999).]

Since Bayesians analyze confirmation in terms of probability, and since the probability distribution over the algebra generated by H and E is determined by Pr(H|E), Pr(H), and Pr(E), it has become standard to assume that any confirmation measure can be expressed as a function of Pr(H|E), Pr(H), and Pr(E). The preceding assumption is essentially the requirement that Crupi et al. (2013) call "formality."

A strong case can, however, be made for not allowing our measure of confirmation to depend on Pr(E). As Atkinson (2009) points out, if we let c(H,E) be a function of Pr(E), then c(H,E) can change even if we add to E a piece of irrelevant "evidence" E′ that is probabilistically independent of H, of E, and of their conjunction. To see this, suppose that c(H,E) = f(Pr(H), Pr(H|E), Pr(E)). Let E′ be any proposition whatsoever that is independent of H, E, and H&E. [Footnote 4: I am of course assuming here that H and E are fixed.] Then c(H,E&E′) = f(Pr(H), Pr(H|E&E′), Pr(E&E′)) = f(Pr(H), Pr(H|E), Pr(E)Pr(E′)). If f depends non-trivially on its third argument, we can find some probability function Pr such that f(Pr(H), Pr(H|E), Pr(E)Pr(E′)) ≠ f(Pr(H), Pr(H|E), Pr(E)), and thus such that c(H,E&E′) ≠ c(H,E). This is clearly counterintuitive, since E′ is probabilistically independent of H and E and therefore should not have any impact on the confirmation of H. So we conclude that f should not depend on Pr(E).

Since I find the preceding argument convincing, I will assume that the confirmation measure we are looking for has the form c(H,E) = f(Pr(H), Pr(H|E)). Since there is no a priori restriction on what credences an agent may have, except that these credences must lie somewhere in the interval [0, 1], I will assume that f is defined on all of [0,1] × [0,1]. Note that, as Huber (2008) points out, this is not the same as assuming that any particular probability distribution Pr(·) is continuous. The preceding two assumptions are summed up in the following requirement:

Strong Formality (SF). Any confirmation measure is of the form c(H,E) = f(Pr(H), Pr(H|E)), where f is a function defined on all of [0,1] × [0,1].

It should be noted that (SF) excludes some of the confirmation measures that have been offered in the literature. [Footnote 5: In particular, the alternative difference measure.] I briefly address lingering objections to (SF) in Section 5.

Finally, I will also adopt the following convention:

Confirmation Convention (CC). c(H,E) is > 0 if Pr(H|E) > Pr(H); = 0 if Pr(H|E) = Pr(H); and < 0 if Pr(H|E) < Pr(H).

(CC) is sometimes taken to be part of the definition of what a confirmation measure is (e.g. by Fitelson (2001)). Although I think it is a mistake to think of (CC) in this way, I will adopt (CC) in this paper for convenience. (CC) has the role of fixing 0 as the number that signifies confirmation neutrality.
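To make the menu of measures above concrete, here is a minimal Python sketch (mine, not the paper's) that computes each listed measure from Pr(H), Pr(H|E), and Pr(E). The function name and the particular input numbers are purely illustrative assumptions, chosen only so that all of the derived probabilities are well defined.

```python
import math

def measures(pH, pH_E, pE):
    """Compute the listed confirmation measures from Pr(H), Pr(H|E), Pr(E).

    These three numbers fix the distribution over the algebra generated by
    H and E, so the remaining conditional probabilities can be recovered.
    """
    pE_H    = pH_E * pE / pH                  # Pr(E|H), by Bayes' theorem
    pE_notH = (1 - pH_E) * pE / (1 - pH)      # Pr(E|¬H)
    pH_notE = (pH - pH_E * pE) / (1 - pE)     # Pr(H|¬E), by total probability

    return {
        "r  (plain ratio)":     pH_E / pH,
        "lr (log-ratio)":       math.log(pH_E / pH),
        "d  (difference)":      pH_E - pH,
        "l  (log-likelihood)":  math.log(pE_H / pE_notH),
        "s  (alt. difference)": pH_E - pH_notE,
    }

# Illustrative numbers only: a prior of 0.2 raised to a posterior of 0.5
# by evidence with marginal probability 0.3.
for name, value in measures(pH=0.2, pH_E=0.5, pE=0.3).items():
    print(f"{name}: {value:+.4f}")
```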
3 The Main Requirement on c

Suppose we witness a coin being flipped 10 times, and our task is to assign a credence to the proposition that the coin comes up heads on the 11th flip. If we do not know anything in advance about the coin's bias, it is reasonable to guess that the coin will come up heads with probability k/10 on the 11th flip, where k is the number of times the coin came up heads in the 10 initial flips. [Footnote 6: This assumes 0 < k < 10.] In making this guess, we are setting our credence in the coin landing heads equal to the observed frequency of heads. This move is reasonable since the law of large numbers guarantees that the observed frequency of heads converges in probability to the coin's actual bias.

The observed frequency of heads does not necessarily equal the coin's bias after just 10 flips, however. In fact, statistics tells us that the confidence interval around the observed frequency can be approximated by

p̂ ± z·sqrt(p̂(1 − p̂)/n),

where p̂ is the observed frequency, n is the sample size (in this case, 10 coin flips), and z is determined by our desired confidence level. For example, suppose we witness 4 heads in 10 coin flips and we set our confidence level to 95%. In that case, z = 1.96, p̂ = 0.4, and the calculated confidence interval is approximately [0.1, 0.7]. Clearly, the confidence interval in this case is rather wide. Given our evidence, we can do no better than to estimate the coin's bias as 0.4. However, we also need to realize that if the 10 flips were repeated, we would probably end up with a slightly different value for p̂: we should acknowledge that credences are bound to vary with our varying evidence.

The above example illustrates one way that variability can sneak into our credences: if our credence is calibrated to frequency data, then our credence inherits the variability intrinsic to the frequency data. However, even if we set our credence by means other than frequency data, we must admit that rational credences are intrinsically somewhat variable. For example, if the sky looks ominous and I guess that there is a 75% chance that it is going to rain (or perhaps my betting behavior reveals that this is my credence that it is going to rain), I must concede that another agent whose credence (or revealed credence) is 74% or 76% is just as rational as I am: I have neither the evidence nor the expertise to discriminate between these credences. And even if I do have good evidence as well as expertise, I must admit that I am almost never in a position where I have all the evidence, and had I been provided with somewhat different evidence, I would have ended up with a somewhat different credence.

The fact that our credences are variable is a fact of life that any rational agent must face squarely. It is not hard to see that this fact also affects Bayesian confirmation theory. Bayesian confirmation measures are defined in terms of credences, and are therefore infected by the variability inherent in credences. Since Bayesian confirmation measures are necessarily affected by variable credences, I contend that we should want a confirmation measure that is affected by such variability in a systematic and predictable way. We should want this even if we only care about the ordinal properties of confirmation measures.
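Returning briefly to the coin example above, the following is a minimal Python sketch (not part of the paper) of the normal-approximation confidence interval used there; the function name is hypothetical, and the only inputs are the 4-heads-in-10-flips data and the conventional z = 1.96 for the 95% level.

```python
import math

def wald_interval(heads, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a binomial proportion."""
    p_hat = heads / n                                  # observed frequency of heads
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lo, hi = wald_interval(heads=4, n=10)                  # 4 heads in 10 flips, 95% level
print(f"estimate 0.4, 95% CI ≈ [{lo:.2f}, {hi:.2f}]")  # prints approximately [0.10, 0.70]
```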
Suppose, for instance, that our confirmation measure is very sensitive to minor variations in the prior or the posterior. In that case, if we find out that c(H,E) > c(H′,E′), we cannot necessarily be confident that H truly is better confirmed by E than H′ is by E′, because a small variation in our credence in H or H′ might well flip the inequality sign so that we instead have c(H,E) < c(H′,E′). In order to be confident that H really is better confirmed by E than H′ is by E′, we need to be assured that the inequality sign is stable. Now, we can be assured that the inequality is stable as long as c(H,E) − c(H′,E′) is of "significant size." But in order for us to be able to determine that c(H,E) − c(H′,E′) is of "significant size," we need to be able to draw meaningful and robust conclusions from this difference.

Thus, even if we are primarily interested in the ordinal ranking of evidence-hypothesis pairs provided by c, we still want to be able to draw conclusions from the difference c(H,E) − c(H′,E′). However, if c is very sensitive to small variations in the priors or posteriors of H and H′, then the quantity c(H,E) − c(H′,E′) is unstable: it could easily have been different, since our priors or posteriors could easily have been slightly different (for instance, if we calibrated our priors to frequency data). We are therefore only justified in interpreting the difference c(H,E) − c(H′,E′) if c is relatively insensitive to small variations in the priors and posteriors.

Suppose, moreover, that slight variations in small priors (or posteriors) have a larger effect on c's output than do slight variations in larger priors. Then we cannot compare the quantity c(H,E) − c(H′,E) to the quantity c(H′′,E) − c(H′,E) unless our prior credences in H′′ and H are approximately the same. In order for us to be able to compare c(H,E) − c(H′,E) to c(H′′,E) − c(H′,E) in cases where our prior credences in H′′ and H are very different, we need c to be uniformly insensitive to small variations in the prior (and the posterior).

We can sum up the preceding two remarks as follows:

Main Requirement (MR). We are justified in interpreting and drawing conclusions from the quantity c(H,E) − c(H′,E′) only if c is uniformly insensitive to small variations in Pr(H) and Pr(H|E).

As it stands, (MR) is vague. What counts as a small variation in a credence? Moreover, what does it mean, concretely, for c to be uniformly insensitive to such variations? To get a better handle on these questions, let us formalize the important quantities that occur in (MR). Following (SF), we are assuming that c is of the form c(H,E) = f(Pr(H), Pr(H|E)). For simplicity, let us put Pr(H) = x and Pr(H|E) = y, so that c = f(x,y). According to (MR), we require that f be uniformly insensitive to small variations in x and y.

I will use v(p,ε) to capture the notion of a small variation in the probability p, where ε is a parameter denoting the size of the variation. Moreover, I will use Δ^x_ε c(x,y) to denote the variation in c that results from a variation of size ε about x. That is to say,

Δ^x_ε c(x,y) = f(v(x,ε), y) − f(x,y)   (3.1)

Similarly, I will use Δ^y_ε c(x,y) to denote the variation in c that results from a variation of size ε about y. Thus,

Δ^y_ε c(x,y) = f(x, v(y,ε)) − f(x,y)   (3.2)

The next step is to get a better grip on (MR) by investigating the terms that occur in (3.1) and (3.2). That is what I do in Sections 3.1 through 3.3.
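Before making these notions precise, here is a minimal Python sketch (mine, not the paper's) of the instability worry raised at the start of this section: under a perturbation of the prior that an agent could hardly claim to be able to discriminate, an ordinal verdict can flip. The hypotheses and numbers are purely hypothetical, and the plain ratio measure r is used only because its sensitivity to small priors is easy to display.

```python
def r(prior, posterior):
    """Plain ratio measure: r(H,E) = Pr(H|E) / Pr(H)."""
    return posterior / prior

# Hypothetical credences: H has a tiny prior, H' a moderate one.
print(r(0.001, 0.002), r(0.4, 0.76))   # 2.0 > 1.9: H looks better confirmed than H'

# Nudge the prior in H by one ten-thousandth, an absolute change the agent
# could hardly rule out, and the ordinal verdict flips.
print(r(0.0011, 0.002), r(0.4, 0.76))  # about 1.82 < 1.9: now H' looks better confirmed
```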
3.1 What is uniform insensitivity?

First, the demand that c be uniformly insensitive to variations in the prior and the posterior now has an easy formal counterpart: it is simply the demand that, for different values x1, x2, y1, and y2 of x and y, we have

Δ^x_ε c(x1,y1) = Δ^x_ε c(x2,y2) = Δ^x_ε c(x2,y1) = etc.

and

Δ^y_ε c(x1,y1) = Δ^y_ε c(x2,y2) = Δ^y_ε c(x1,y2) = etc.

Thus, across different values of x and y, a small variation of a given size changes c by the same amount. More importantly, this means we can consider Δ^x_ε c(x,y) as purely a function of ε, and likewise for Δ^y_ε c(x,y). From now on, I will therefore write:

g(ε) := Δ^x_ε c(x,y)   (3.3)
h(ε) := Δ^y_ε c(x,y)   (3.4)

In order to figure out what the requirement that c be insensitive to small variations amounts to, we need to figure out how to quantify variations in credences. It is to this question that I now turn.

3.2 What is a small variation in a credence?

Given a credence x, what counts as a small variation in x? This question turns out to have a more subtle answer than one might expect. Using the notation from equations (3.1) and (3.2), what we are looking for is the form of the function v(x,ε).

Perhaps the most natural functional form to consider is the following one: v(x,ε) = x + ε. On this model, a small (positive) variation in the probability x is modeled as the addition of a (small) number to x. However, if we consider specific examples, we see that this model is too crude. For example, supposing that x = 0.5, we might consider 0.05 a small variation relative to x. But if we consider x = 0.00001 instead, then 0.05 is no longer small relative to x; instead it is now several orders of magnitude bigger.

The above example shows that the additive model cannot be right. An easy fix is to scale the size of the variation with the size of x. In other words, we might suggest the following form for v: v(x,ε) = x + xε. This adjustment solves the problem mentioned in the previous paragraph. According to the new v, a variation of size 0.025 about 0.5 is "equal" to a variation of size 0.0000005 about 0.00001. In contrast to the previous additive model, v(x,ε) = x + xε is a "multiplicative" model of variability, as we can see by instead writing it in the following form: v(x,ε) = x(1 + ε).

However, the multiplicative model, though much better than the additive model, is still insufficient. One problem is purely mathematical. Since v(x,ε) is supposed to correspond to a small positive shift in probability, we should require that 0 ≤ v(x,ε) ≤ 1 for all values of x and ε. However, x + xε can easily be larger than 1, for example if x = 0.9 and ε = 0.2. [Footnote 7: This is also a problem for the additive model.] The other problem is that v(x,ε) treats values of x close to 0 very differently from values of x close to 1. For instance, a variation where ε = 0.1 will be scaled to 0.001 when x = 0.01; but when x = 0.99, the same ε will be scaled to 0.099, nearly a hundred times larger. This is very problematic, since to every hypothesis H in which we have a credence of 0.99 there corresponds a hypothesis in which we have a credence of 0.01, namely ¬H. But a small variation in our credence in H is necessarily also a small variation in our credence in ¬H, simply because Pr(¬H) = 1 − Pr(H): H and ¬H should therefore be treated symmetrically by v.

There is an easy fix to both of the preceding problems: if we scale ε by x(1 − x) instead, then first of all we have 0 ≤ x + εx(1 − x) ≤ 1 (at least for |ε| ≤ 1, which covers the small variations we care about), and thus 0 ≤ v(x,ε) ≤ 1. Second, H and ¬H are now treated symmetrically. From the preceding considerations, we therefore end up with the following as our functional form for v: v(x,ε) = x + x(1 − x)ε.
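Here is a minimal Python sketch (mine, not the paper's) comparing how the three candidate forms of v behave; the particular credences and the value ε = 0.1 are illustrative assumptions only.

```python
def v_additive(x, eps):        # v(x, eps) = x + eps
    return x + eps

def v_multiplicative(x, eps):  # v(x, eps) = x(1 + eps)
    return x * (1 + eps)

def v_symmetric(x, eps):       # v(x, eps) = x + x(1 - x) * eps
    return x + x * (1 - x) * eps

eps = 0.1
for x in (0.00001, 0.01, 0.5, 0.9, 0.99):
    shifts = [v(x, eps) - x for v in (v_additive, v_multiplicative, v_symmetric)]
    print(x, ["%.6f" % s for s in shifts])
# The additive shift (0.1) ignores x entirely; the multiplicative shift treats
# x = 0.01 and x = 0.99 very differently (0.001 vs 0.099).

# The multiplicative model can also push a probability above 1:
print(v_multiplicative(0.9, 0.2))        # 1.08 > 1

# The x(1 - x) model treats a credence and its complement symmetrically:
print(v_symmetric(0.01, eps) - 0.01,     # 0.00099
      0.99 - v_symmetric(0.99, -eps))    # 0.00099 as well
```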
There is a completely different argument by which we can arrive at the same functional form for v. As I mentioned in the example at the beginning of Section 3, credences are sometimes calibrated to frequency data. This is, for example, usually the case if H is a medical hypothesis. Suppose H represents the hypothesis that a person P has disease X, for instance. The rational prior credence in H (before a medical examination has taken place) is then the frequency of observed cases of X in the population from which P is drawn. The observed frequency of cases of X can be modeled as the outcome of a binomial process with mean Pr(H) and variance proportional to Pr(H)(1 − Pr(H)). Suppose we observe the frequency f̂r(H). Then the estimated variance is proportional to f̂r(H)(1 − f̂r(H)). The variance is maximal at f̂r(H) = 0.5 and decreases as f̂r(H) moves closer to 0 or to 1. Arguably, it makes a lot of sense in this case for the variability in one's credence to vary with the variance of the frequency data. But that is exactly what v(x,ε) = x + x(1 − x)ε does: it scales credence variability by data variance. Thus, according to v, a variation of size Var(X)·ε about a credence X is equal to a variation of size Var(Y)·ε about a credence Y.

From all the preceding considerations, I conclude that the following is the best functional form for v:

v(x,ε) = x + x(1 − x)ε   (3.5)

3.3 Uniform insensitivity to small variations in the prior and posterior

The next step is to understand what insensitivity amounts to. To say that c is insensitive to small variations in the prior or posterior is to say that such variations have a small effect on confirmation; the most natural way to formalize this requirement is in terms of continuity. Since g(ε) represents the change in confirmation resulting from a change (by ε) in probability, a natural continuity requirement for c would be that g and h should be continuous at 0.

However, continuity is too weak a requirement. Even if a function is continuous, it is still possible for it to be very sensitive to small variations. For instance, the linear function x ↦ 1,000,000x is continuous (everywhere), but is at the same time very sensitive to small perturbations of x. Sensitivity to perturbations is most naturally measured by looking at how the derivative behaves. Minimally, we should therefore require that g and h be differentiable at 0. The next natural requirement would be to demand that the derivatives of both g and h be bounded by some "small" number. Of course, pursuing such a requirement would require a discussion of what is to count as a "small" number in this context. Since I do not actually need a requirement of this sort in my argument in the next section, I will not pursue these issues here. The only upshot from this section is therefore that g and h should be differentiable at 0.

4 The Main Result

Let me summarize where we are. Our desire to be able to draw conclusions from differences in confirmation, i.e. from expressions of the form c(H,E) − c(H′,E′), led us to the requirement that c be uniformly insensitive to small variations in Pr(H) and Pr(H|E). In Sections 3.1 through 3.3, I made the various components of this requirement more precise. Putting all these components together, we have the following:
Formal Version of the Main Requirement (MR) 4.1. We are justified in drawing conclusions from the difference c(H,E) − c(H′,E′) only if the following conditions are all met:

1. f(v(x,ε), y) − f(x,y) = g(ε), where:
2. g(ε) does not depend on either x or y;
3. g(ε) is differentiable at 0;
4. v(x,ε) = x + x(1 − x)ε;
5. f(x, v(y,ε)) − f(x,y) = h(ε), where:
6. h(ε) does not depend on either x or y;
7. h(ε) is differentiable at 0.

Note that (5)-(7) are just (1)-(3), except that they hold for h instead of for g (with the variation taken in y rather than in x). Note also that (MR) is essentially epistemic. It says that "we" (i.e. agents interested in confirmation) are only justified in drawing conclusions (of any kind) from c(H,E) − c(H′,E′) if certain formal conditions are met. These conditions ensure that c(H,E) behaves reasonably well. Together with (SF) and (CC), the conditions in (MR) entail the log-likelihood measure, as I show next.

Main Result 4.1. If (MR) is true, (SF) is assumed, and (CC) is adopted as a convention, then

c(H,E) = log[ Pr(E|H) / Pr(E|¬H) ],

where the identity holds up to positive linear transformations with constant term 0 (i.e., up to multiplication by a positive constant).

Proof. Starting with (1) from (MR), we have

f(v(x,ε), y) − f(x,y) = g(ε)   (4.1)

If we divide each side by x(1 − x)ε, we get:

[f(v(x,ε), y) − f(x,y)] / [x(1 − x)ε] = g(ε) / [x(1 − x)ε]   (4.2)

Next, we let ε → 0:

lim_{ε→0} [f(v(x,ε), y) − f(x,y)] / [x(1 − x)ε] = lim_{ε→0} g(ε) / [x(1 − x)ε]   (4.3)

Since g is differentiable at 0 (from part (3) of (MR)), and since g(0) = 0 (because v(x,0) = x), the right-hand side of the above equation is just g′(0) / (x(1 − x)). Since the limit exists on the right-hand side of the equation, it must exist on the left-hand side as well. But the left-hand side is just ∂f/∂x. We therefore have

∂f/∂x (x,y) = g′(0) / (x(1 − x))   (4.4)

Next, we take the antiderivative of each side of (4.4) with respect to x. Since g, and hence g′(0), does not depend on x (from part (2) of (MR)), we have:

f(x,y) = g′(0)(log x − log(1 − x)) + C   (4.5)

Here, C is a quantity that may depend on y but not on x. If we perform the above calculations again, starting instead with f(x, v(y,ε)) − f(x,y) = h(ε), we find that:

C = h′(0)(log y − log(1 − y)) + K   (4.6)

Here, K is just a constant, i.e. it depends on neither x nor y. We therefore have:

f(x,y) = g′(0)(log x − log(1 − x)) + h′(0)(log y − log(1 − y)) + K   (4.7)

Now set x = y = 0.5. The second part of (CC) then entails that K = 0. Next, set x = y. Then (CC) entails:

g′(0)(log x − log(1 − x)) + h′(0)(log x − log(1 − x)) = 0   (4.8)

This in turn entails that g′(0) = −h′(0). Thus we have

f(x,y) = −h′(0)(log x − log(1 − x)) + h′(0)(log y − log(1 − y))   (4.9)
       = h′(0) log[ (y / (1 − y)) · ((1 − x) / x) ]   (4.10)

Remembering that x = Pr(H) and y = Pr(H|E), (4.9)-(4.10) together with (SF) entail:

c(H,E) = f(Pr(H), Pr(H|E))   (4.11)
       = h′(0) log[ (Pr(H|E) / (1 − Pr(H|E))) · ((1 − Pr(H)) / Pr(H)) ]   (4.12)
       = h′(0) log[ (Pr(H|E) / Pr(H)) · (Pr(¬H) / Pr(¬H|E)) ]   (4.13)
       = h′(0) log[ (Pr(H|E)·Pr(E) / Pr(H)) · (Pr(¬H) / (Pr(¬H|E)·Pr(E))) ]   (4.14)
       = h′(0) log[ Pr(E|H) / Pr(E|¬H) ]   (4.15)

Finally, (CC) entails that h′(0) must be a positive number. Thus c(H,E) = l, up to positive linear transformations with constant term 0.
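As a numerical sanity check on the result (not part of the proof), the following Python sketch evaluates f(x,y) = log[(y/(1−y))·((1−x)/x)] and verifies that, under the variation model v(x,ε) = x + x(1−x)ε of Section 3.2, the induced change in f is, to first order in ε, the same at very different priors, in contrast with the difference measure d. The chosen values of x, y, and ε are illustrative assumptions.

```python
import math

def v(x, eps):
    """The variation model argued for in Section 3.2."""
    return x + x * (1 - x) * eps

def l(x, y):
    """Log-likelihood measure written as a function of x = Pr(H), y = Pr(H|E)."""
    return math.log((y / (1 - y)) * ((1 - x) / x))

def d(x, y):
    """Difference measure, for contrast."""
    return y - x

eps = 1e-4
y = 0.6
for x in (0.001, 0.1, 0.5, 0.9):
    dl = l(v(x, eps), y) - l(x, y)   # change in l under a size-eps variation of the prior
    dd = d(v(x, eps), y) - d(x, y)   # change in d under the same variation
    print(f"x={x}: change in l ≈ {dl:.6f}, change in d ≈ {dd:.6f}")

# The change in l is approximately -eps at every prior (uniform insensitivity),
# whereas the change in d equals x(1 - x)*eps and so depends on where the prior sits.
```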
5 Discussion and Objections

In the previous section, I showed that (MR), (SF), and (CC) jointly entail the log-likelihood confirmation measure, l. The proof yields l up to strictly positive linear transformations with constant term 0. That is to say, if log[Pr(E|H)/Pr(E|¬H)] is a legitimate confirmation measure, then so is a·log[Pr(E|H)/Pr(E|¬H)] for any a > 0; in particular, the argument does not establish that any particular logarithmic base is better than another. In the terminology of Stevens (1946), our measure is apparently a ratio measure, meaning that we are justified in interpreting both intervals and ratios between outputs of the measure. Analogously, mass is a ratio measure, since it makes sense to say both that the difference between 2 kg and 4 kg is the same as the difference between 4 kg and 6 kg, and that 4 kg is twice as much as 2 kg.

It therefore appears that my conclusion is stronger than what I set out to establish: in the introduction, I said that the goal was to find a confirmation measure that can be interpreted as an interval measure, but the proof in the previous section apparently establishes that l is a ratio measure. Contrary to appearances, however, I think it is illegitimate to interpret l as a ratio measure. The difference between interval measures and ratio measures is that ratio measures have a non-arbitrary 0. In our case, it is (CC) that establishes our measure's 0, and (CC) is (as the name suggests) just a convention. We could just as easily have chosen a convention on which 1 signified confirmation neutrality. The 0 is therefore arguably arbitrary, and it is not legitimate to interpret our measure as anything more than an interval measure.

The second thing to notice about my argument is that it does not actually establish that the log-likelihood measure is the true confirmation measure. This is because (MR) gives only necessary conditions, not sufficient ones. Thus, what my argument shows is really a conditional statement: if there is any interval confirmation measure, then that measure is l. The preceding conditional is, of course, equivalent to the following disjunction: either there is no interval confirmation measure, or the only interval confirmation measure is l.

The third and final observation I will make about the argument is that it clearly depends very much on the choice of v. In Section 3.2 I considered and rejected two other models of variability: the additive model, v(x,ε) = x + ε, and the multiplicative model, v(x,ε) = x + xε. It is natural to ask what confirmation measures we end up with if we instead use these alternative models of credence variability. The answer, although I will not show this here, is that the additive model yields the difference measure, d, whereas the multiplicative model yields the log-ratio measure, lr. We can therefore see that d and lr "embody" defective models of credence variability: arguably, that is a strike against these measures.

Next, I will consider a couple of objections to the argument. First, my argument is obviously only sound if the assumptions in (MR) are correct. The assumptions in (MR) might remind the reader of assumptions made in Good (1960, 1984) and Milne (1996), which have been criticized by Fitelson (2001, 2006) as being "strong and implausible" (2001, 28-29n43) and as having "no intuitive connection to material desiderata for inductive logic" (2006, 7n13). Why does my argument escape Fitelson's criticisms? How is my argument different from the arguments made by Good and Milne? The answer is that, whereas Good and Milne are not interested in the interval properties of their confirmation measures, so that the various mathematical assumptions they make can seem unmotivated, all the properties listed in (MR) arise naturally out of our wish to have a confirmation measure that is at least an interval measure.

Finally, one may object to some of the other background assumptions I make in Section 2. In particular, Strong Formality may be accused of being too strong, since it excludes the alternative difference measure right off the bat.
My reply to this objection is as follows: the argument in Section 4 can be carried out without Strong Formality, but the resulting analysis does not yield the alternative difference measure, nor any other recognizable confirmation measure. Thus, even if one rejects (SF), one cannot use the type of argument I have given in this paper to argue for the alternative difference measure or for other standard measures that depend non-trivially on Pr(E). [Footnote 8: Such as Carnap's measure, c(H,E) = Pr(H&E) − Pr(H)Pr(E).]

6 Conclusion

I have argued that there is a set of conditions that any confirmation measure must meet in order to justifiably be interpreted as an interval measure. Furthermore, I have shown that these necessary conditions, together with an additional plausible assumption and a widely accepted convention, jointly entail the log-likelihood measure. My argument does not show that l is an interval measure, but it does show that l is the only measure that stands a chance of being one. Nor does the argument in this paper show that l is the "true" confirmation measure. However, to the extent that we care about our measure's being an interval measure, we should regard the conclusion of this paper as favoring l as our preferred measure.

References

Atkinson, D. (2009). Confirmation and justification. A commentary on Shogenji's measure. Synthese, 184(1):49–61.

Carnap, R. (1962). Logical Foundations of Probability. Chicago: University of Chicago Press, second edition.

Christensen, D. (1999). Measuring confirmation. Journal of Philosophy, XCVI:437–61.

Crupi, V., Chater, N., and Tentori, K. (2013). New axioms for probability and likelihood ratio measures. British Journal for the Philosophy of Science, 64:189–204.

Fitelson, B. (2001). Studies in Bayesian Confirmation Theory. PhD thesis, University of Wisconsin–Madison.

Fitelson, B. (2006). Logical foundations of evidential support. Philosophy of Science, 73:500–12.

Good, I. J. (1960). Weight of evidence, corroboration, explanatory power, information and the utility of experiments. Journal of the Royal Statistical Society: Series B, 22:319–31.

Good, I. J. (1984). The best explicatum for weight of evidence. Journal of Statistical Computation and Simulation, 19:294–299.

Huber, F. (2008). Milne's argument for the log-ratio measure. Philosophy of Science, 75:413–20.

Joyce, J. (1999). The Foundations of Causal Decision Theory. Cambridge: Cambridge University Press.

Milne, P. (1996). Log[P(h|eb)/P(h|b)] is the one true measure of confirmation. Philosophy of Science, 63:21–6.

Stevens, S. (1946). On the theory of scales of measurement. Science, 103(2684):577–80.