The Independence Condition in the Variety-of-Evidence Thesis


Author(s): François Claveau
Source: Philosophy of Science, Vol. 80, No. 1 (January 2013), pp. 94-118
Published by: The University of Chicago Press on behalf of the Philosophy of Science Association
Stable URL: http://www.jstor.org/stable/10.1086/668877 .
François Claveau*†
The variety-of-evidence thesis has been criticized by Bovens and Hartmann. This article
points to two limitations of their Bayesian model: the conceptualization of unreliable
evidential sources as randomizing and the restriction to comparing full independence to
full dependence. It is shown that the variety-of-evidence thesis is rehabilitated when
unreliable sources are reconceptualized as systematically biased. However, it turns out
that allowing for degrees of independence leads to a qualification of the variety-of-
evidence thesis: as Bovens and Hartmann claimed, more independence does not always
imply stronger confirmation.

1. Introduction. Seeking a variety of evidence for a hypothesis is standard
practice in science, as well as in normal life. The members of the OPERA
Collaboration, for instance, appealed to the value of evidential variety when
they disclosed their measurement of neutrinos apparently traveling faster
than light: “While OPERA researchers will continue their studies, we are
also looking forward to independent measurements to fully assess the nature
of this observation” (Istituto Nazionale di Fisica Nucleare 2011).

Evidential variety is also prized in economics. For example, it is commonplace
in labor economics for a causal hypothesis to be seen as more strongly
supported if it can rely not only on macrodata evidence but also on
microdata evidence. If the hypothesis under consideration is ‘the relatively
long duration of unemployment benefits in France is a cause of its relatively
high unemployment rate’, the proposition ‘there is a positive statistical
association between average duration of benefits and unemployment rates
among industrial countries’ (macrodata evidence) will be interpreted as
supporting the hypothesis, but the support will be even stronger if the
evidential elements also include the proposition ‘the average length of an
unemployment spell increased in Austria for the category of job seekers
affected by the 1989 reform of benefits duration’ (microdata evidence).1

Received April 2012; revised July 2012.
*To contact the author, please write to: Faculty of Philosophy, Erasmus University, Rotterdam;
e-mail: claveau@fwb.eur.nl.
†I thank Stephan Hartmann, Kevin Hoover, Conor Mayo-Wilson, Luca Moretti, and Julian
Reiss for their helpful comments, James Kelleher for English proofreading, and the SSHRC
(767-2009-0001) for financial support.

Philosophy of Science, 80 (January 2013) pp. 94–118. 0031-8248/2013/8001-0008$10.00
Copyright 2013 by the Philosophy of Science Association. All rights reserved.

The widespread quest for evidential variety can be justified by what
Bayesians call the variety-of-evidence thesis.

Variety-of-evidence thesis. Ceteris paribus, the strength of confirmation
of a hypothesis by an evidential set increases with the diversity of the
evidential elements in that set.

Some Bayesians maintain that this thesis could be given a formal proof once
its key terms (the ceteris paribus clause, confirmation, variety) are properly
defined. In seeking this proof, the most popular interpretation of variety
has been to equate it to a measure of independence among evidential
elements.2 The intuitive idea behind the proposals of Earman (1992) and
Howson and Urbach (1993) is that an evidential set is varied to the extent that
each element $e_i$ is not made significantly more likely by learning other
elements in the set, the extreme case being full probabilistic independence
between $e_i$ and any conjunct of the other elements.

It turns out that one runs into problems in trying to prove the
variety-of-evidence thesis using such a measure of independence. It is indeed clear
from a measure introduced by Myrvold (1996) and recently labeled “focused
correlation” by Wheeler (2009) that, in order to prove the variety-of-evidence
thesis using the most popular interpretation of variety, one must
either assume that the hypothesis entails the evidence, which would neglect
the role of auxiliary hypotheses, or smuggle into the ceteris paribus
clause the measure of independence conditional on the hypothesis, which
seems unwarranted.

In parallel to these developments, Bovens and Hartmann introduced another
characterization of variety as reliability independence. Using this notion
of independence, they challenged the belief that Bayesianism can prove
the variety-of-evidence thesis. According to their model, “less varied
evidence may indeed provide more confirmation to the hypothesis” (Bovens
and Hartmann 2002, 47; 2003, 106).

1. These evidential propositions summarize, respectively, the results from OECD (2006,
table 3.3) and Lalive, van Ours, and Zweimüller (2006).
2. In contrast, Horwich (1982, 1998) connected variety with the capacity of an evidential
set to disconfirm alternative hypotheses. For discussions and criticisms, see Wayne
(1995), Fitelson (1996), and Bovens and Hartmann (2003, 107).

Bovens and Hartmann use what seems to be a plausible understanding of
variety: evidential elements for a given hypothesis are varied to the extent
that they do not share potential reasons for being unreliable. For example,
the ICARUS Collaboration (2012) was in a position to produce evidence
for or against the hypothesis of faster-than-light neutrinos, evidence that
was partially independent of the OPERA experiment. The independence
is partial here because, while on the one hand, the two groups shared the
same neutrino beams from CERN (making them share some potential
reasons to be unreliable), on the other hand, they used different detectors,
opening up the possibility that the ICARUS measurement is unbiased while
the OPERA measurement is systematically biased due to a defect in the
OPERA detector.3

This article takes a second look at Bovens and Hartmann’s result. My
primary concern is to assess whether their result should affect the status of
the variety-of-evidence thesis as a guide to scientific practice. Endorsing
their (plausible) interpretation of variety as reliability independence, I argue
that two aspects of their model cast doubt on the relevance of their
result for actual science. First, the unreliable sources in their model are not
like unreliable sources in actual science (i.e., their unreliable sources are
randomly biased while systematic bias is far more likely to be the issue). I
show, in section 4, that the variety-of-evidence thesis is rescued when the
model is slightly modified to capture unreliability as systematic bias. Sec-
ond, their model, and my first modification to it, contrasts full independence
to full dependence, while variety in the variety-of-evidence thesis is more a
question of degrees of independence. In section 5, I extend the model to
consider degrees of independence. I then show that the variety-of-evidence
thesis, as Bovens and Hartmann initially claimed, is false.

2. Bovens and Hartmann’s Result. In this section, I present a simplified
version of Bovens and Hartmann’s model and reproduce one of their re-
sults against the variety-of-evidence thesis.4 The model uses three types of
propositional variables:

• The hypothesis variable $H = \{h, \neg h\}$, where $h$ stands for the
proposition that the hypothesis of interest is true (e.g., ‘some neutrinos can
travel faster than light’) and $\neg h$ stands for its negation.

• The evidential variable $E_i = \{e_i, \neg e_i\}$, where $e_i$ stands for a positive
report regarding $h$, that is, a report to the effect that a testable consequence
of $h$ holds (e.g., ‘the measured velocity of the neutrinos in
this experiment is higher than the speed of light’) and $\neg e_i$ stands for a
negative report.

• The reliability variable $R_i = \{r_i, \neg r_i\}$, where $r_i$ stands for the
proposition that the evidential source $i$ (the one having as output $E_i$) is
reliable and $\neg r_i$ stands for the proposition that the source is unreliable.

Figure 1. Two cases of partially reliable evidential sources. A, Independent
reliability; B, shared reliability.

3. This is now the official explanation of the OPERA anomaly (CERN 2012).
4. Their model is slightly more complex because it adds another propositional variable
to the three that I consider, i.e., the ‘testable consequence’ C (Bovens and Hartmann
2003, 89–90). Since I focus on the issue of independent reliability, not on the issue of
independent testable consequences, this addition is superfluous.

Two joint probability distributions are constructed over the set of variables
$\{H, E_1, E_2, R_1, R_2\}$. The assumed probabilistic independencies among the
variables can be read off the Bayesian networks in figure 1 by using the
d-separation criterion (Pearl 1988, 117–18).5

The probability distribution $P_I(\cdot)$ associated with the network in figure 1A
is meant to capture the idea that two sources are reliability independent
(i.e., $R_1 \perp\!\!\!\perp R_2$). The probability distribution $P_S(\cdot)$ associated with figure
1B captures the other extreme when evidential sources have fully shared
reliabilities (i.e., $R_1 = R_2 = R$). The notion of variety modeled here is thus:

Reliability independence. Two evidential elements are independent if
their reliabilities are independent.

5. The two distributions share the condition $H \perp\!\!\!\perp R_1, R_2, R$; i.e., before learning the
evidential report $E_i$, learning that the associated evidential source is reliable or not has no
bearing on the strength of belief in the hypothesis (and vice versa). For fig. 1A, we also
have $E_i \perp\!\!\!\perp E_j, R_j \mid H$ for $i \neq j$, which means that once the realization of $H$ is known, learning
the realization of $E_j$ or $R_j$ for $j \neq i$ is not relevant to the probability distribution of $E_i$. A
similar condition for fig. 1B is $E_i \perp\!\!\!\perp E_j \mid H, R$ for $i \neq j$, which means that, in this case, one
needs to condition on the reliability variable, too, in order for the two evidential reports
to be irrelevant to each other. These conditions do not universally apply to what can be
considered evidential elements for a hypothesis (e.g., Wheeler and Scheines 2011, fig. 3).
This fact must be kept in mind in interpreting the result of Bovens and Hartmann as well as
my results.


The joint probability distributions are further specified. The root variables $H$
and $R_i$ are given prior probabilities:

$$P(h) = h_0 \quad \text{and} \quad P(r_i) = r_i, \tag{1}$$

where $h_0$ and $r_i$ are parameters strictly between 0 and 1. The prior degree
of belief that the hypothesis is true is thus $h_0$, and $r_i$ is the prior degree of
belief that the evidential source $i$ is reliable. For compactness, I will write
$\bar{h}_0$ for the prior probability that the hypothesis is false ($1 - h_0$) and $\bar{r}_i$ for
the prior probability that source $i$ is unreliable ($1 - r_i$).

What remains to be spelled out is how the evidential variable Ei varies
with its parents. It is assumed that, when a source is reliable, the evidential
report is a perfect truth tracker:

$$P(e_i \mid h, r_i) = 1 \quad \text{and} \quad P(e_i \mid \neg h, r_i) = 0 \quad \text{for } i \in \{1, 2\}. \tag{2}$$

That is, when the hypothesis is true, a reliable evidential source will give a
positive report; when the hypothesis is false, such a source will give a neg-
ative report.

What happens when the source is unreliable? To specify this case, Bovens
and Hartmann rely on the following intuition:

Irrelevance of an unreliable source. If one knows for sure that a given
source is unreliable ($R_i = \neg r_i$), the report coming from this source ($e_i$
or $\neg e_i$) should not have any effect on the degree of belief in the hypothesis
$h$.

This can be written

$$P(h \mid e_i, \neg r_i) = P(h \mid \neg e_i, \neg r_i) = P(h \mid \neg r_i). \tag{3}$$

In other words, an unreliable source gives garbage information regarding
the truth of the hypothesis. Upon learning the information from an unreli-
able source, the agent makes no updating to the subjective probability of the
hypothesis.

Note that this is not the only plausible interpretation of ‘unreliable source’.
The interpretation clashes, in particular, with the idea that unreliability might
be due to calibration issues. Taking again the OPERA experiment as an
example, some early critics claimed that the anomalous result might be due to a
problem with clock synchronization (Contaldi 2011). The estimated time of
travel would be systematically below the actual time because the clock at the
end of the tube clicked slightly later than the clock at the beginning.6 With
this type of unreliability, it is possible to undo the bias: given the estimated
value and given the bias, one can retrieve the actual travel time. Knowing, for
example, that the experimental setup is biased ($\neg r_i$) in such a way that the
travel time is systematically underestimated by a factor $b$, obtaining a
positive report $e_i$ that the time of travel is $t$ should matter for one’s belief in
the hypothesis because an unbiased estimate of the travel time can be
retrieved by computing $t/b$. For such calibration problems, we thus have that
$P(h \mid e_i, \neg r_i) \neq P(h \mid \neg r_i)$, which contradicts condition (3). In this article, I stick
to interpretations of unreliability compatible with condition (3) and keep
the calibration interpretation for future work. I will later give examples of
reasons for unreliability that are compatible with condition (3).

6. The OPERA researchers have indeed identified similar biases by now.

Condition (3) implies what follows (proof in Bovens and Hartmann
2003; app. C.1):

$$P(e_i \mid h, \neg r_i) = P(e_i \mid \neg h, \neg r_i) =: a_i. \tag{4}$$

Another way to express this condition is $E_i \perp\!\!\!\perp H \mid \neg r_i$.

Parameter $a_i$ is the last parameter of the model; it is the probability that
the evidential report is positive, given that the source is unreliable. The
probability that the evidential report is negative, given that the source is
unreliable, is simply $1 - a_i =: \bar{a}_i$. Table 1 sums up how the realizations of
$E_i$ depend on the values taken by $H$ and $R_i$.
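
Table 1 and condition (3) can be checked numerically. The following is a minimal Python sketch, not part of the original article: the function names and the values chosen for `h0` and `a` are illustrative assumptions. It encodes Table 1 as a function and applies Bayes' rule to confirm that, once a source is known to be unreliable, neither a positive nor a negative report moves the belief in the hypothesis away from its prior (recall that H and R_i are independent, so P(h | ¬r_i) = h_0).

```python
def p_positive(h_true: bool, reliable: bool, a: float) -> float:
    """Table 1: probability of a positive report given H and R_i."""
    if reliable:
        return 1.0 if h_true else 0.0  # a reliable source tracks the truth
    return a  # an unreliable source reports positively with probability a


def posterior_h_given_unreliable(positive: bool, h0: float, a: float) -> float:
    """P(h | report, source known to be unreliable), via Bayes' rule."""
    def likelihood(h_true: bool) -> float:
        p = p_positive(h_true, reliable=False, a=a)
        return p if positive else 1.0 - p

    numerator = h0 * likelihood(True)
    denominator = numerator + (1.0 - h0) * likelihood(False)
    return numerator / denominator


# Illustrative values (assumptions); a must lie strictly between 0 and 1.
h0, a = 0.3, 0.2
print(posterior_h_given_unreliable(True, h0, a))   # stays at the prior h0, up to rounding
print(posterior_h_given_unreliable(False, h0, a))  # stays at the prior h0, up to rounding
```

Because the likelihood of either report is the same under h and under ¬h, it cancels in Bayes' rule, which is the content of condition (4).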

Bovens and Hartmann offer a specific interpretation of $a_i$ in terms of a
randomizing evidential source. I will later offer an alternative interpretation
of this parameter, but let me first reproduce their result relative to the
variety-of-evidence thesis. The probability of interest is the posterior belief
in the hypothesis given two positive reports: $P(h \mid e_1, e_2) = P^*(h)$. For the
two joint probability distributions $P_I(\cdot)$ and $P_S(\cdot)$, that is, the distributions
associated with the reliability-independent version (fig. 1A) and the
shared-reliability version (fig. 1B), this posterior can be written using the
likelihood-ratio form

$$P^*(h) = \frac{h_0}{h_0 + \bar{h}_0 L}, \quad \text{where } L = \frac{P(e_1, e_2 \mid \neg h)}{P(e_1, e_2 \mid h)}. \tag{5}$$

TABLE 1. PROBABILITY OF A POSITIVE REPORT GIVEN THE VALUES OF $H$ AND $R_i$

$P(e_i \mid H, R_i)$   $r_i$   $\neg r_i$
$h$                    1       $a_i$
$\neg h$               0       $a_i$

Which posterior is higher, $P^*_I(h)$ or $P^*_S(h)$? To turn this comparison into
an assessment of the variety-of-evidence thesis, we need a plausible
interpretation of the ceteris paribus condition of this thesis. Bovens and
Hartmann impose restrictions that seem sufficient to meet the condition. First,
we want to rule out comparing hypotheses starting with unequal degrees
of confirmation. We thus impose $P_I(h) = P_S(h)$. Second, we want the
evidential sets to potentially differ in confirmatory strengths for no other reason
than their relative variety. A sufficient condition for this goal is to require
that all positive reports $e_i$ in the independent-reliability and the
shared-reliability versions have the same confirmatory strength for $h$, that is,
$P_I(h \mid e_i) = P_S(h \mid e_j)$ for $i, j \in \{1, 2\}$. This condition holds if the different $a_i$
and $r_i$ are reduced to only a single $a$ and a single $r$ across the two models
(Bovens and Hartmann 2003, 104).

Given this interpretation of the ceteris paribus condition, the likelihood
ratios associated with the two posteriors $P^*_I(h)$ and $P^*_S(h)$ are (proof in
app. A)

$$L_I = \frac{(a\bar{r})^2}{(r + a\bar{r})^2}, \tag{6}$$

$$L_S = \frac{a^2\bar{r}}{r + a^2\bar{r}}. \tag{7}$$

Since we assume that the prior probability of the hypothesis is the same in
the two models, the variety-of-evidence thesis implies that $P^*_I(h) > P^*_S(h)$
for all admissible parameter values; that is, we should have a higher
confidence in the hypothesis if our two positive reports come from
reliability-independent sources as compared to reliability-shared sources. This is
equivalent to $L_S > L_I$. It turns out, however, that the inequality is reversed for some
combinations of values of $a$ and $r$ (proof in app. A):

$$P^*_I(h) > P^*_S(h) \iff .5 > \bar{a}\bar{r}. \tag{8}$$

Figure 2 divides the parameter space into two regions: the bigger white region
where independence is more confirmatory and the gray region where shared
reliability is more confirmatory.
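
The two likelihood ratios and the resulting inequality can be reproduced by brute-force enumeration. The sketch below is an illustration in Python and not the article's appendix proof: the function names, the prior `h0 = 0.5`, and the parameter grid are assumptions made for the example. It sums over the reliability states in each network, compares the result with the closed forms for L_I and L_S, and checks that independence is more confirmatory exactly when (1 − a)(1 − r) is below .5.

```python
from itertools import product


def report_prob(positive: bool, h: bool, reliable: bool, a: float) -> float:
    """P(report | H, R) as in Table 1."""
    p_pos = (1.0 if h else 0.0) if reliable else a
    return p_pos if positive else 1.0 - p_pos


def posterior_independent(h0: float, r: float, a: float) -> float:
    """P_I(h | e1, e2): each source has its own reliability variable."""
    joint = {}
    for h in (True, False):
        total = 0.0
        for r1, r2 in product((True, False), repeat=2):
            weight = (r if r1 else 1 - r) * (r if r2 else 1 - r)
            total += weight * report_prob(True, h, r1, a) * report_prob(True, h, r2, a)
        joint[h] = (h0 if h else 1 - h0) * total
    return joint[True] / (joint[True] + joint[False])


def posterior_shared(h0: float, r: float, a: float) -> float:
    """P_S(h | e1, e2): both sources share a single reliability variable."""
    joint = {}
    for h in (True, False):
        total = 0.0
        for rel in (True, False):
            weight = r if rel else 1 - r
            total += weight * report_prob(True, h, rel, a) ** 2
        joint[h] = (h0 if h else 1 - h0) * total
    return joint[True] / (joint[True] + joint[False])


def L_independent(r: float, a: float) -> float:
    return (a * (1 - r)) ** 2 / (r + a * (1 - r)) ** 2


def L_shared(r: float, a: float) -> float:
    return a**2 * (1 - r) / (r + a**2 * (1 - r))


def posterior_from_ratio(h0: float, L: float) -> float:
    """Likelihood-ratio form of the posterior."""
    return h0 / (h0 + (1 - h0) * L)


h0 = 0.5  # illustrative prior (an assumption)
grid = [k / 10 for k in range(1, 10)]  # no grid point lies on the boundary
for r, a in product(grid, grid):
    p_i = posterior_independent(h0, r, a)
    p_s = posterior_shared(h0, r, a)
    assert abs(p_i - posterior_from_ratio(h0, L_independent(r, a))) < 1e-12
    assert abs(p_s - posterior_from_ratio(h0, L_shared(r, a))) < 1e-12
    assert (p_i > p_s) == ((1 - a) * (1 - r) < 0.5)

# A point in the region where shared reliability is more confirmatory.
print(posterior_independent(h0, 0.1, 0.1), posterior_shared(h0, 0.1, 0.1))
```

The grid deliberately avoids points where (1 − a)(1 − r) equals .5 exactly, since the two posteriors coincide there and a floating-point comparison would be uninformative.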

Figure 2. Parameter space showing when shared reliability is more confirmatory
than independent reliability.

What happens? Why is it sometimes better for confirmation to have no
reliability independence rather than full independence? To understand this,
it is crucial to see what shared reliability entails in the second version of the
model, namely, that $E_1$ is a truth teller if and only if $E_2$ is a truth teller. As
truth tellers, $E_1$ and $E_2$ will always give concordant reports. But when neither
of the evidential variables is a truth teller (i.e., when $\neg r$), then each
evidential variable has a probability $a$ of producing a positive report. It is
crucial to recognize that this probability is not affected by the value the
other evidential variable is realizing. It implies that when, and only when,
they are unreliable, $E_1$ and $E_2$ might realize discordant reports. A second
concordant report in the shared-reliability model thus contributes to
confirmation in the following way: “we feel more confident that the instrument is
not a randomizer and this increase in confidence in the reliability of the
instrument benefits the confirmation of the hypothesis” (Bovens and
Hartmann 2003, 98). The region of the parameter space where shared reliability is
better is the one where this channel of ‘higher confirmation of h because
higher confidence in the reliability of the source’ is the most effective. With
low values of $a$, it is unlikely that an unreliable source would output two
positive reports; it is thus likely that the two positive reports come from a
reliable source. With low values of $r$, the agent starts with little confidence
in the source; there is both great room for improving confidence and little to
be gained for the belief in the hypothesis from the direct effect of a positive
report since such a report is not likely to be truth tracking.
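
The reliability channel described above can be made concrete with a short calculation. The following Python sketch is an illustration under assumed values, with a hypothetical function name: in the shared-reliability model, it computes the posterior probability that the source is reliable after n positive reports. Positive reports arise either from a reliable source with a true hypothesis (probability r·h_0) or from an unreliable source that outputs n positives (probability (1 − r)·a^n).

```python
def reliability_after_reports(n: int, h0: float, r: float, a: float) -> float:
    """P(r | n positive reports) when all reports share one reliability state."""
    if n < 1:
        raise ValueError("the formula assumes at least one positive report")
    reliable_path = r * h0            # reliable source and true hypothesis
    unreliable_path = (1 - r) * a**n  # randomizing source: n independent positives
    return reliable_path / (reliable_path + unreliable_path)


# Illustrative low values of a and r (assumptions), the region discussed in the text.
h0, r, a = 0.5, 0.1, 0.1
for n in (1, 2, 3):
    print(n, reliability_after_reports(n, h0, r, a))
```

With these assumed values, confidence in the source rises from the prior of 0.1 to about 0.36 after one positive report and to about 0.85 after two, which is why a second concordant report is so valuable when $a$ and $r$ are both low.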

3. Questioning Bovens and Hartmann’s Result. While the logic of this
result is simple, its implications for the variety-of-evidence thesis are less
clear. As Bovens and Hartmann (2003, 95n) recognize, the result of their
model “does not apply to unreliable instruments that do not randomize.” It
thus seems that the champions of the variety-of-evidence thesis would have
little to worry about if evidential sources were rarely as Bovens and Hartmann
depict them to be. And it indeed seems to be the case that scientists
do not think of their evidential sources in the manner depicted by the
shared-reliability model.

For example, consider the macrodata evidence for the hypothesis that
one cause of the relatively high French unemployment rate is its relatively
long duration of unemployment benefits. To simplify, let us imagine that the
macrodata evidence is Pearson’s correlation coefficient between the legis-
lated duration of unemployment benefits and the unemployment rate for a
sample of industrial countries. There are many potential reasons why this
coefficient would not be truth tracking (i.e., would be unreliable) with respect
to the hypothesis of interest; for example, the correlation might not
be a sign of causation from benefits to the unemployment rate because of
the presence of a common cause, or perhaps the causal structure in France
deviates substantially from the ones in the sampled countries.

Now imagine that I decide to compute the correlation coefficient twice.
That is, I have the data for the two variables in my computer, and I use my
favorite statistical software to compute the correlation twice (e.g., send the
command cor(Bd,U) to R twice). It is reasonable to say that the two results
would share a single reliability state: my second correlation coefficient
would be reliable evidence for the hypothesis if and only if my first coeffi-
cient is also reliable evidence for the hypothesis. Does Bovens and Hart-
mann’s shared-reliability situation come anywhere close to capturing how
we think about these two results? Obviously not.

According to the model, the two results will always be in concordance
if the procedure is reliable. This implication seems fine. But something
strange must happen with my statistical software when the procedure is
unreliable. In this case, the two results might be at odds. Furthermore, if I
were to compute the coefficient again and again, I would almost surely get
discordant results sooner or later (provided $a$ is strictly between 0 and 1). To be sure, it is
possible that my procedure is unreliable and is randomizing in this way.
My correlation command might have been redefined such that it randomly
outputs a number between −1 and 1. However, this is not what would
normally be of concern. The reasons given above for why a correlation might
be unreliable evidence for a specific causal claim will not bring about such
randomness. If, for example, there is a confounding common cause, one
would expect both coefficients to be identically affected.
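
How quickly a randomizing source betrays itself can be quantified. The sketch below is a small Python illustration with a hypothetical function name and assumed values of `a`: if each of n reports from an unreliable source is positive with probability a independently of the others, the chance that all n agree is a^n + (1 − a)^n, which shrinks toward zero as n grows whenever a is strictly between 0 and 1.

```python
def prob_all_concordant(n: int, a: float) -> float:
    """Chance that n independent reports from a randomizing source all agree."""
    return a**n + (1 - a) ** n  # all positive, or all negative


# Illustrative values of a (assumptions, not taken from the article).
for a in (0.5, 0.9):
    for n in (2, 5, 10, 50):
        print(a, n, prob_all_concordant(n, a))
```

Repeating the same correlation command, by contrast, returns the same number every time, which is the mismatch between the randomizing model and this example.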

Another way to see the problem with the model is to imagine that the data
used to compute the correlation coefficients are known to be totally unre-
lated to French unemployment. Let us say that the two variables are the
respective prices of two types of fish at the Grote Markt in Rotterdam.
Since the correlation between these two variables is not tracking the truth of
the hypothesis about French unemployment, the source is unreliable for this
hypothesis ($R = \neg r$). Now the model tells us that, since the source is
unreliable, the two computed correlations must be probabilistically independent.
The agent thus starts with some strength of belief $a$ that the correlation
between the fish prices is positive. She sends the correlation command
and reads a first positive coefficient. Strangely, learning this first coefficient
will not make her revise her belief about the probable value of the second
coefficient; just before pressing Enter again on her computer, she will still
believe to strength a that the computer output will be positive.

The counterintuitiveness of Bovens and Hartmann’s model is not an ar-
tifact of my specific choice of example. Take the complex experimental
setup of the OPERA Collaboration. Imagine that the research team—before
announcing that it had located biases in its experimental procedure—had
rerun the experiment with the exact same setup (imagine this to be the case
even though the exact same setup is a physical impossibility) and that the
results had corroborated the initial measurement. What would have been the
reaction of the scientific community? The model tells us that the new results
should have been taken as evidence that the setup is truth tracking. But it
seems more intuitive that these results would have been met with indifference.
Scientists distrusted the first measurement because they believed that
the experiment suffered from a systematic (yet unknown) bias. Concordant
results from a second identical experiment could thus be explained away by
saying that the systematic bias was again operating (as it should if experimenters
were careful enough in reproducing the setup). To make some
progress in the debate, OPERA researchers needed to find a way to decrease
the strength of the belief in the existence of a systematic bias. Rerunning
the exact same experiment over and over would not have achieved that.

The upshot of this discussion is that the result of Bovens and Hartmann,
as it stands, should not worry scientists and philosophers very much, if at
all. There is still room for an unqualified variety-of-evidence thesis when
the sources of evidence resemble the ones in science, rather than the ones in
the model.7

4. First Modification to the Model. Doubt has crept in: Can it not be
shown that a more appropriate modeling of the evidential sources still results
in a qualified variety-of-evidence thesis? In this section, I offer a negative
answer to this question by making a single modification to the model of
Bovens and Hartmann.
7. Hartmann (2008, 108) later wrote the following statement about his assumption
regarding unreliability: “This way of modeling a partially reliable instrument is clearly a
strong idealization, which will not hold in many cases.”

Bovens and Hartmann’s modeling choices are guided by a specific
interpretation of the parameter $a$: for them it means that an unreliable
evidential source acts like a randomizer. This becomes clear in their discussion
of witnesses as a special case of an evidential source: “So, we assume that if
witnesses are not reliable, then they are like randomizers. It is as if they do
not even look at the state of the world to determine whether the hypothesis
is true, but rather flip a coin or cast a die to determine whether they
will provide a report to the effect that the hypothesis is true” (Bovens and
Hartmann 2003, 57). While they interpret the parameter as capturing a
property of the evidential source, I would rather interpret it from the point
of view of the agent: just knowing that the source $i$ is unreliable, $a_i$ is the
agent’s degree of belief that the report of this source will be positive.

It turns out that Bovens and Hartmann could have modeled this epistemic
fact in a different way. Such an alternative way specifies that an unreliable
evidential source is systematically biased, not randomizing. I want to em-
phasize at the outset that systematically biased sources of the kind I will
model do not cover all the potential kinds of unreliability in science. Obviously,
they do not cover randomizing sources (if such sources exist). More
important, they fail to encompass the miscalibrated sources previously mentioned
in section 2.

I still think that what I model as ‘systematically biased sources’ captures
important reasons why one can judge a source to be unreliable. Here are a
few hints at these reasons without any claim to be comprehensive. One
general class of cases comprises the diverse ways in which an evidential
report can be affected by a preconceived view of what is the ‘good’ answer.
That might come from researchers performing data mining until they get the
answer they want or from them simply falsifying their results because of
their sponsor’s interests. It can also come from institutional pressures in
science: peer review systematically favoring some sort of result or deeply
rooted hypotheses making scientists revise their experimental procedure
until the output fits ‘what is known’.

Another class of cases has to do with the risks of using something as a
stand-in (as a model) in order to learn about something else. If one uses, for
instance, an animal to learn about the potential side effects of a drug on
humans, the extrapolation might go wrong because there is some biological
mechanism in the model not present in the target (or the other way around),
which makes the drug have some effect in one group of subjects but not in
the other. The result from the model subjects will thus be systematically
biased when used as a report for the target subjects.

To model sources that are potentially systematically biased, I redefine the
reliability variable:

• The new reliability variable $R_i = \{r_i, b_i^h, b_i^{\neg h}\}$, where $r_i$ stands, as
before, for the proposition that the source is reliable, $b_i^h$ stands for the
proposition that the source is biased toward a positive report for the
hypothesis regardless of its truth, and $b_i^{\neg h}$ stands for the proposition that
the source is biased toward a negative report.

This ternary variable (all the previous variables were binary) is arrived at
by giving a more finely grained specification of the proposition $\neg r_i$. It is
now decomposed into two disjoint propositions; that is, we now have
$\neg r_i = b_i^h \cup b_i^{\neg h}$.

My first modification inserts this new variable into the previous model.
The probabilistic independencies that can be read off the Bayesian net-
works in figure 1 still hold. Furthermore, the specifications of the prior
probabilities $h_0$ and $r_i$ in condition (1) are retained. We need, however, to
specify more probabilities for $R_i$:

$$P(b_i^h \mid \neg r_i) = a_i \quad \text{and} \quad P(b_i^{\neg h} \mid \neg r_i) = \bar{a}_i. \tag{9}$$

This condition assigns prior probabilities to the propositions about positive
and negative biases given that the source is already known to be unreliable.
Note that what was interpreted as a ‘randomization parameter’ by Bovens
and Hartmann is used explicitly as a strength of belief here. Combining
condition (9) with condition (1), we have what follows: the prior probability
of a positive bias $P(b_i^h)$ is $a_i\bar{r}_i$, and the prior probability of a negative
bias $P(b_i^{\neg h})$ is $\bar{a}_i\bar{r}_i$.

Finally, we need to expand table 1 by stating explicitly how likely $e_i$ is,
conditional on $b^h_i$ and $b^{\neg h}_i$. This expansion gives us table 2. Since the
uncertainty that figured initially in $P(e_i \mid H, R_i)$ has been shifted to $R_i$, the
evidential variable $E_i$ is now a deterministic function of $H$ and $R_i$. This
deterministic relation might come as a surprise to some, but it should not.
If we remain committed to the irrelevance of an unreliable source
(IUS), the columns for $b^h_i$ and for $b^{\neg h}_i$ in table 2 must each contain the same
value twice. If instead of having 1 and 0 for these values, we opt for values
strictly between these two, we reintroduce into the model Bovens and
Hartmann’s unreliability as randomizing. The counterexamples used in section 3
would thus apply. As long as we remain committed to the IUS condition, the
notion of a systematic bias must be captured by a deterministic function. In
future work, I will drop the IUS condition in the context of calibration issues,
but I keep it here since it seems pertinent for some sources of unreliability.

TABLE 2. PROBABILITY OF A POSITIVE REPORT GIVEN THE VALUES OF $H$ AND $R_i$: EXPANDED

$P(e_i \mid H, R_i)$     $r_i$     $b^h_i$     $b^{\neg h}_i$
$h$                      1         1           0
$\neg h$                 0         1           0
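
As a rough illustration of how table 2 and condition (9) combine for a single source, the following Python sketch stores the table as a lookup and sums over the three reliability states. The state labels and function names are illustrative assumptions, not notation from the model; under the ceteris paribus values $r$ and $a$, the result should agree with the single-report likelihood ratio $a\bar{r}/(r + a\bar{r})$ of equation (A5).

```python
# Table 2: P(e_i | H, R_i). Keys are (hypothesis_true, reliability_state).
# The labels "r", "b_h", "b_not_h" are illustrative names for r_i, b^h_i, b^{not h}_i.
TABLE_2 = {
    (True, "r"): 1.0, (True, "b_h"): 1.0, (True, "b_not_h"): 0.0,
    (False, "r"): 0.0, (False, "b_h"): 1.0, (False, "b_not_h"): 0.0,
}


def reliability_prior(r, a):
    """Prior over the three values of R_i, from conditions (1) and (9)."""
    return {"r": r, "b_h": a * (1 - r), "b_not_h": (1 - a) * (1 - r)}


def prob_positive_report(h_true, r, a):
    """P(e_i | H), obtained by summing over the reliability states."""
    prior = reliability_prior(r, a)
    return sum(TABLE_2[(h_true, state)] * p for state, p in prior.items())


def single_report_likelihood_ratio(r, a):
    """P(e_i | not h) / P(e_i | h) for one positive report."""
    return prob_positive_report(False, r, a) / prob_positive_report(True, r, a)


r, a = 0.3, 0.4  # arbitrary illustrative values
print(single_report_likelihood_ratio(r, a))
print(a * (1 - r) / (r + a * (1 - r)))  # closed form, for comparison
```

Because every entry of the table is 0 or 1, all the uncertainty about a report sits in the prior over $R_i$, which is the point made in the text.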

The probabilities in table 2 and in equations (9) give us two new versions
of the model. Define $P_{I'}(\cdot)$ as the joint probability distribution associated
with the independent-reliability situation (i.e., the distribution associated
with fig. 1A) and $P_{S'}(\cdot)$ as the distribution associated with the
shared-reliability situation (i.e., the distribution associated with fig. 1B). We can
now assess the variety-of-evidence thesis: Is it always the case that
$P^*_{I'}(h) > P^*_{S'}(h)$?

In fact, $P^*_{I'}(h)$ is no different from $P^*_I(h)$; that is, for the case of sources
with independent reliabilities, Bovens and Hartmann’s version and my
version give the same result (proof in app. B). This is welcome news, given that
Bovens and Hartmann’s version seems to capture what one means in saying
that two evidential reports are fully independent regarding a hypothesis;
that is, it concurs with what Shogenji (2005, 308) presents as “a general
consensus among probability theorists on how to formalize the condition
that two pieces of evidence $E_1$ and $E_2$ are independent of each other with
respect to proposition $A$.”8

Things are different when we turn to the new version with shared
reliability. The posterior probability of the hypothesis is now (proof in app. B)

$$P^*_{S'}(h) = \frac{h_0}{h_0 + \bar{h}_0 L_{S'}}, \quad\text{where}\quad L_{S'} = \frac{a\bar{r}}{r + a\bar{r}}. \tag{10}$$

The likelihood ratio $L_{S'}$ is identical to the one resulting from an evidential
set with only a single element instead of two (see eq. [A5]). In other words,
adding a second positive report in this new shared-reliability model has no
effect on the degree of confirmation of the hypothesis. The reason for this
result is simple: the second report cannot be anything but consistent with
the first report in this model. The two evidential variables not only share
reliability; they also share the direction of the bias if they are indeed biased.
There is no longer the possibility of detecting that a source is unreliable by
finding discordant reports coming from this source. Since this possibility no
longer obtains, multiplying the reports from the same source becomes use-
less.
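
The redundancy of a second report from the same source can be checked by brute-force enumeration. The sketch below is only an illustration under the assumptions of table 2 and condition (9); the helper names and the parameter values are hypothetical choices. It computes the likelihood ratio for one report, for two reports sharing a single reliability variable, and for two reports with independent reliability variables.

```python
from itertools import product

STATES = ("r", "b_h", "b_not_h")  # illustrative labels for r_i, b^h_i, b^{not h}_i


def p_report(h_true, state):
    """Table 2: probability of a positive report given H and the reliability state."""
    if state == "r":
        return 1.0 if h_true else 0.0
    return 1.0 if state == "b_h" else 0.0


def state_prior(r, a):
    return {"r": r, "b_h": a * (1 - r), "b_not_h": (1 - a) * (1 - r)}


def posterior(h0, likelihood_ratio):
    return h0 / (h0 + (1 - h0) * likelihood_ratio)


def lr_independent(r, a):
    """Two positive reports, each with its own reliability variable (fig. 1A)."""
    prior = state_prior(r, a)

    def p_e1e2(h_true):
        return sum(p_report(h_true, s1) * prior[s1] * p_report(h_true, s2) * prior[s2]
                   for s1, s2 in product(STATES, repeat=2))

    return p_e1e2(False) / p_e1e2(True)


def lr_shared(r, a, n_reports=2):
    """n positive reports that all depend on one shared reliability variable (fig. 1B)."""
    prior = state_prior(r, a)

    def p_e(h_true):
        return sum(p_report(h_true, s) ** n_reports * prior[s] for s in STATES)

    return p_e(False) / p_e(True)


h0, r, a = 0.5, 0.3, 0.4  # arbitrary illustrative values
print("one report:         ", posterior(h0, lr_shared(r, a, n_reports=1)))
print("two shared reports: ", posterior(h0, lr_shared(r, a, n_reports=2)))
print("two independent:    ", posterior(h0, lr_independent(r, a)))
```

With deterministic entries in table 2, raising them to the power of the number of reports changes nothing, which is why the first two posteriors coincide.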

Is it still possible that the reports in the shared-reliability situation are
more confirmatory than the ones in the independent-reliability situation? No.
The posterior probability of $h$ is strictly higher in the independent-reliability
case if the probability that the sources are reliable is higher than 0 (see app. B
for the proof). There is thus no combination of admissible parameter values
for which having shared reliability is, ceteris paribus, better. Using the same
source again is not conducive to confirmation because it no longer holds the
promise of detecting the potential unreliability of the source. A second
independent report is thus necessarily more confirmatory. The variety-of-
evidence thesis holds without qualification in this version of the models.

8. It is a case of Sober’s conjunctive fork (Sober 1989; Fitelson 2001).

5. Degrees of Independence. The result supporting the variety-of-evidence
thesis in the previous section suffers from a major limitation. While the
variety-of-evidence thesis explicitly compares more independent to less in-
dependent evidential elements, the comparison made with our two models is
between fully independent and fully dependent evidential elements. Our
comparison of confirmation was restricted to the two ends of a spectrum,
whereas the variety-of-evidence thesis deals with how confirmation changes
with changes in the degree of independence.

There is a simple way to model degrees of independence by extending
the setup of the previous section. The graphical representation of this ex-
tended model is in figure 3, and its associated probability distribution will be
labeled $P_F(\cdot)$. The modification here adds a probabilistic association between
the two reliability variables $R_1$ and $R_2$.9 The rest of the model remains
intact.

The association between the reliability variables is fully captured by
specifying the probabilities for the nine possible combinations of their val-
ues. Panel A of table 3 offers a general notation for these nine possibilities.
For instance, $q_{rr}$ is the probability that both sources are reliable. The
elements on the main diagonal $(q_{rr}, q_{hh}, q_{\neg h\neg h})$ are the probabilities associated
with the proposition that the two sources are in the same reliability state.
Note that the table already assumes symmetry between the two sources;
that is, $P(r_1, b^h_2) = P(b^h_1, r_2) = q_{rh}$, and so forth. This assumption was also
used in the previous sections as part of the assumptions sufficient to meet
the ceteris paribus condition of the variety-of-evidence thesis.

In this new model, the posterior belief in the hypothesis given two
positive reports is (proof in app. C.1)

$$P^*_F(h) = \frac{h_0}{h_0 + \bar{h}_0 L_F}, \quad\text{where}\quad L_F = \frac{q_{hh}}{q_{rr} + 2q_{rh} + q_{hh}}. \tag{11}$$

Table 3 panels B and C give the specific values taken by the $q$’s for the two
extreme cases on which the previous sections focused. It can be easily
verified using these values that the expression in equation (11) reduces to
equation (10) or (6) for each of these extreme cases; that is, the model of the
previous section is fully embedded into this one (including its result for the
variety-of-evidence thesis).

9. Bovens and Hartmann (2003, 75–77) offer a model with a super-reliability variable
that is specified as a common cause of the $R_i$’s. However, they do not use it to discuss the
variety-of-evidence thesis. Since this super-reliability variable is difficult to interpret and
since only modeling a probabilistic association between $R_1$ and $R_2$ is sufficient for my
goal here, I opt for the second option.

Figure 3. Extended model with degrees of independence.

Is there a ready measure of degrees of independence? My proposal is
based on the following consideration. Compare table 3 panels B and C. The
probability mass is all on the main diagonal in the first case. In other words, it
never happens that the two sources are in different reliability states. In the
case of full independence, the probability mass is more spread out since the
joint probability $P(R_1, R_2)$ is simply the product of the marginal probabilities
(i.e., $P(R_1)P(R_2)$). In fact, each element on the main diagonal in table 3,
panel C, is exactly the square of the same element in table 3, panel B. This
fact suggests a specific metric to characterize degrees of independence.

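TABLE 3. JOINT PROBABILITIES FOR THE RELIABILITY VARIABLES (ASSUMING SYMMETRY)

$P(R_1, R_2)$        $r_2$                    $b^h_2$                      $b^{\neg h}_2$

A. General Case
$r_1$                $q_{rr}$                 $q_{rh}$                     $q_{r\neg h}$
$b^h_1$              $q_{rh}$                 $q_{hh}$                     $q_{h\neg h}$
$b^{\neg h}_1$       $q_{r\neg h}$            $q_{h\neg h}$                $q_{\neg h\neg h}$

B. Fully Shared Reliability
$r_1$                $r$                      0                            0
$b^h_1$              0                        $\bar{r}a$                   0
$b^{\neg h}_1$       0                        0                            $\bar{r}\bar{a}$

C. Fully Independent Reliability
$r_1$                $r^2$                    $r\bar{r}a$                  $r\bar{r}\bar{a}$
$b^h_1$              $r\bar{r}a$              $(\bar{r}a)^2$               $\bar{r}^2a\bar{a}$
$b^{\neg h}_1$       $r\bar{r}\bar{a}$        $\bar{r}^2a\bar{a}$          $(\bar{r}\bar{a})^2$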

Define a variable $d \in [0, 1]$ that is interpreted as measuring the distance of
the evidential set from fully shared reliability; that is, when $d = 0$ we have
no independence, when $d = 1$ we have full independence, and when $d$ is
strictly between 0 and 1, we have only partial independence. Given values
for $r$, $a$, and $d$, the elements on the main diagonal are

$$q_{rr} = r^{1+d}, \quad q_{hh} = (\bar{r}a)^{1+d}, \quad q_{\neg h\neg h} = (\bar{r}\bar{a})^{1+d}. \tag{12}$$

These relations entail that the probability mass is shifted away from the
elements on the main diagonal as the degree of independence increases. In
other words, it becomes less likely that the two sources share the same
reliability state.

With this variable d, the variety-of-evidence thesis can be restated.

Variety-of-evidence thesis. Ceteris paribus, $\partial P^*_F(h)/\partial d > 0$, for all
admissible values of $r$, $a$, and $d$.

The restatement of the thesis is thus that the posterior degree of belief in the
hypothesis invariably increases as we marginally increase the degree of
independence of the evidential sources.

Before we assess this thesis, we need to specify how the off-diagonal
elements in table 3, panel A, change as $d$ is modified. One obvious
restriction is that the sum of all the elements (the nine $q$’s) must be 1. The
interpretation of the ceteris paribus condition previously used also restricts the
values of the off-diagonal elements but not enough to ensure uniqueness. In
addition to these restrictions, I thus also stipulate that the marginal
probabilities of $R_1$ and $R_2$ are not a function of $d$; that is, $P(r_i) = r$,
$P(b^h_i) = \bar{r}a$, and $P(b^{\neg h}_i) = \bar{r}\bar{a}$, for $i \in \{1, 2\}$ and for all $d \in [0, 1]$ (see app. C.2).
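
To make the construction concrete, here is a minimal sketch that fills in table 3, panel A, for a given degree of independence. It assumes the diagonal elements of condition (12) and the off-diagonal solutions stated in appendix C.2, system (C1); the function names, dictionary keys, and parameter values are illustrative assumptions.

```python
def joint_reliability_table(r, a, d):
    """Return the six distinct q's of table 3, panel A, for independence degree d."""
    rb, ab = 1 - r, 1 - a               # r-bar and a-bar
    q_rr = r ** (1 + d)                 # diagonal elements, condition (12)
    q_hh = (rb * a) ** (1 + d)
    q_nn = (rb * ab) ** (1 + d)         # q_{not-h, not-h}
    # Off-diagonal elements from system (C1), which keeps the marginals fixed.
    q_rh = r + rb * a - 0.5 * (1 + q_rr + q_hh - q_nn)
    q_rn = r + rb * ab - 0.5 * (1 + q_rr - q_hh + q_nn)
    q_hn = 0.5 * (1 + q_rr - q_hh - q_nn) - r
    return {"rr": q_rr, "hh": q_hh, "nn": q_nn, "rh": q_rh, "rn": q_rn, "hn": q_hn}


def marginals(q):
    """Marginal probabilities P(r_i), P(b^h_i), P(b^{not h}_i) implied by the table."""
    return (q["rr"] + q["rh"] + q["rn"],
            q["rh"] + q["hh"] + q["hn"],
            q["rn"] + q["hn"] + q["nn"])


r, a = 0.3, 0.4  # arbitrary illustrative values
for d in (0.0, 0.5, 1.0):
    q = joint_reliability_table(r, a, d)
    print(d, {k: round(v, 4) for k, v in q.items()}, [round(m, 4) for m in marginals(q)])
```

At $d = 0$ and $d = 1$ the output should reproduce panels B and C of table 3, and the marginals should stay at $r$, $\bar{r}a$, and $\bar{r}\bar{a}$ for every $d$.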

This model leads to a qualification of the variety-of-evidence thesis
(proof in app. C.3). Increasing the degree of independence leads to more
confirmation if and only if the following condition holds:

$$(1 - 2\bar{r}\bar{a})\ln(\bar{r}a) + (\bar{r}\bar{a})^{1+d}\ln\left(\frac{a}{\bar{a}}\right) < 0. \tag{13}$$

But there are combinations of admissible values for r, a, and d that violate
this condition.
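
A small numerical probe can make the qualification tangible. The sketch below evaluates the left-hand side of condition (13) and compares its sign with a finite-difference slope of the posterior, using the closed form of $L_F$ from appendix C.3. The two parameter triples are arbitrary illustrations chosen for this sketch, not values read off figure 4, and the function names are assumptions.

```python
import math


def condition_13(r, a, d):
    """Left-hand side of condition (13); a negative value means more independence helps."""
    rb, ab = 1 - r, 1 - a
    return (1 - 2 * rb * ab) * math.log(rb * a) + (rb * ab) ** (1 + d) * math.log(a / ab)


def posterior_F(h0, r, a, d):
    """Posterior of h after two positive reports, via the closed form of L_F (app. C.3)."""
    rb, ab = 1 - r, 1 - a
    L_F = (rb * a) ** (1 + d) / (1 - 2 * rb * ab + (rb * ab) ** (1 + d))
    return h0 / (h0 + (1 - h0) * L_F)


h0, step = 0.5, 1e-6
for r, a, d in [(0.5, 0.5, 0.5), (0.05, 0.9, 0.0)]:
    slope = (posterior_F(h0, r, a, d + step) - posterior_F(h0, r, a, d)) / step
    print(f"r={r}, a={a}, d={d}: condition (13) = {condition_13(r, a, d):+.4f}, "
          f"slope of posterior = {slope:+.5f}")
```

In the first case the condition is negative and the posterior rises with $d$; in the second, with very low trust and a strong presumption of positive bias, the sign flips.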

Figure 4 presents graphically the different possibilities. For most
combinations of $r$ and $a$, the relationship between degree of independence and
confirmation is as stated by the variety-of-evidence thesis (fig. 4B presents
such a case).10 Figure 4A shows that there are in fact two distinct regions of
the parameter space where the relationship between independence and
confirmation is nonmonotonic. These two possibilities share (extremely) low
values of $r$. In other words, these are situations in which prior information
leads the agent to believe that it is highly unlikely that a given source is truth
tracking. The relationship is indeed always monotonic when the trust in the
source is above .18 (in Bovens and Hartmann’s model this was .5).

There are two features distinguishing the two nonmonotonic situations
from each other. First, as shown in figure 4A, they differ in their values for
$a$, that is, the probability that the report is positive given that the source is
unreliable. Second, as is evident by comparing figure 4C and 4D, the two
regions are associated with different shapes of nonmonotonicity (concave
vs. convex functions).

Figure 4. Nonmonotonicity is possible. A, Parameter combinations resulting in a
nonmonotonic relationship between degrees of independence and confirmation; B,
monotonic relationship; C, low $a$; D, high $a$.

10. The proportion of the parameter space $a \times r$ where the relationship between
independence and confirmation is not monotonically increasing (i.e., the area of the two
gray regions in fig. 4A) is 10.3%. As a point of comparison, this proportion is 15.3% in
Bovens and Hartmann’s model (as depicted in fig. 2). If one considers instead the
three-dimensional space $a \times r \times d$, only 2% of it gives $\partial P^*_F(h)/\partial d < 0$. These proportions
should not be interpreted as probabilities.

What is going on? As in Bovens and Hartmann’s model, what happens is
that, upon learning $e_1$ and $e_2$, the agent reassesses the probability that the
sources are reliable. The region of the space $a \times r \times d$ where confirmation
decreases with independence is exactly the region where trust in the
reliability of the sources decreases with independence (proof in app. C.4). In
other words, to compare two evidential sets (for $h$), say $E = \{e_1, e_2\}$ and
$E' = \{e'_1, e'_2\}$, which differ only with respect to their degree of reliability
independence ($d$) (the elements of $E$ being more independent than the
elements of $E'$), one simply needs to assess the following ratio for each set:

$$\frac{q_{rr} + 2q_{rh}}{q_{hh}}. \tag{14}$$

The set with the higher ratio is more confirmatory than the other. The
numerator of this ratio captures the probability of realizations of $R_1$ and $R_2$ that
generate two evidential elements ($e_1$ and $e_2$) that are indeed truth revealing for
$h$. The denominator is the probability that the two sources are producing
positive-but-garbage reports for $h$. The denominator of $E$ will always be
smaller than that of $E'$. This fact might capture the intuitive appeal of the
variety-of-evidence thesis: it is less likely to get two garbage reports from
sources that are (more) independent. But the full ratio is what ultimately
decides between $E$ and $E'$.

Let me briefly discuss the only two situations in which the variety-of-
evidence thesis is turned upside down. First, for low values of $a$ combined
with extremely low values of $r$ (as in fig. 4C), getting two positive reports
comes as a surprise: it was judged far more likely to receive at least one
negative report because of the realization of $b^{\neg h}_i$. In this case, moving toward
independence is initially beneficial, but more independence becomes
detrimental to confirmation as one approaches the extreme of full
independence. This result is interesting because it means that slightly departing
from full independence sometimes increases confirmation.

Second, for high values of $a$ combined with extremely low values for $r$
(as in fig. 4D), getting two positive reports is not surprising; however, the
agent judges it highly likely that the information is worthless (because $b^h_1$
and $b^h_2$ are likely to be realized). In this case, a departure from full
dependence adversely affects confirmation. An implication of this
nonmonotonicity is that the second positive report can be disconfirming $h$ (i.e.,
$P_F(h \mid e_1, e_2) < P_F(h \mid e_1)$). Remember from section 4 that the posterior belief
in $h$ after two fully dependent reports (i.e., $d = 0$) is identical to the posterior
belief after a single report. Both are represented by the point at the extreme
left of the curve in figure 4D. All the points lying below this point are thus
cases in which the second report is disconfirming $h$. The agent puts so little
trust in the evidential sources that a second report is interpreted as a sign that
both sources are positively biased, and the initial (slight) increase in the
belief for $h$ is cut back.
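
The disconfirming second report can be reproduced with a few lines of arithmetic. The sketch below compares the posterior after one positive report (eq. [A5]) with the posterior after two positive reports at several degrees of independence, using the closed form of $L_F$ from appendix C.3. The prior $h_0 = .5$ and the values $r = .05$, $a = .9$ are arbitrary illustrations of the high-$a$, low-$r$ situation, not the values behind figure 4D.

```python
def likelihood_ratio_F(r, a, d):
    """Closed form of L_F for two positive reports (app. C.3)."""
    rb, ab = 1 - r, 1 - a
    return (rb * a) ** (1 + d) / (1 - 2 * rb * ab + (rb * ab) ** (1 + d))


def single_report_lr(r, a):
    """Likelihood ratio for one positive report, eq. (A5)."""
    return a * (1 - r) / (r + a * (1 - r))


def posterior(h0, lr):
    return h0 / (h0 + (1 - h0) * lr)


h0, r, a = 0.5, 0.05, 0.9  # illustrative: very low trust, strong presumption of positive bias
one_report = posterior(h0, single_report_lr(r, a))
for d in (0.0, 0.1, 1.0):
    two_reports = posterior(h0, likelihood_ratio_F(r, a, d))
    print(f"d={d}: one report {one_report:.4f}, two reports {two_reports:.4f}")
```

For these values the two-report posterior equals the one-report posterior at $d = 0$, dips below it at $d = .1$, and exceeds it at $d = 1$, in line with the convex shape described for figure 4D.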

6. Conclusion. The variety-of-evidence thesis seems to be a widespread im-
plicit guideline in scientific practice. This thesis says that, ceteris paribus,
the confirmatory power of an evidential set for a given hypothesis increases
with the diversity (i.e., the independence) of the evidential elements in the
set. Thus, one should praise ‘independent evidence’ and be suspicious of
the rest.

Bovens and Hartmann (2002, 2003) cast doubt on the universal applicability
of this thesis by showing with a simple model that, in some peculiar
epistemic situations, it is a disadvantage for confirmation to have
independent evidential elements, ceteris paribus. I have argued that the
relevance of this result is diminished by two characteristics of their model.

First, their idea that unreliable sources are randomizers leads them to
model fully dependent sources in a way that is unlikely to reflect how sci-
entists think about their sources of evidence. The problem is that Bovens
and Hartmann assume that two fully dependent sources still produce two
independent reports when they are unreliable (i.e., $E_1 \perp\!\!\!\perp E_2 \mid \neg r$). Instead, in
actual scientific settings it seems to be the case that two reports coming from
fully dependent sources will always coincide, even when the sources are
unreliable. In section 4, I showed that the variety-of-evidence thesis is re-
habilitated once the independent-randomizer assumption is dropped and
replaced with the assumption that an unreliable source is systematically bi-
ased. This modification is compatible with the key intuition behind the notion
of reliability in Bovens and Hartmann’s model (i.e., IUS).

Second, there is another serious limitation in Bovens and Hartmann’s
model, a limitation that my first modification of their model shares. The
comparison made to assess the variety-of-evidence thesis is between ex-
tremes; it is between fully independent and fully dependent evidential ele-
ments. The most relevant comparison is rather one of degree: less versus
more independence of the sources. In section 5, I showed that when the
model is modified to enable comparisons of degrees of independence, the
variety-of-evidence thesis needs to be qualified. There are special epistemic
situations wherein more independence does not give more confirmation.
This qualification only applies to a subinterval of the spectrum from full
dependence to full independence. Indeed, the two extremes of the spec-
trum always stand in the confirmatory relationship depicted by the variety-
of-evidence thesis.

Where do my modeling efforts leave us? First, the usual caveat about
idealization applies: it might well be that the way in which epistemic
situations have been modeled here does not capture what is pertinent for the
variety-of-evidence thesis. It is certain that my model does not encompass
all the ways in which an evidential source can be unreliable (e.g., the
calibration problems mentioned in sec. 2).

Even if one accepts the idealizations, the conclusion to draw about the
variety-of-evidence thesis is not straightforward. One plausible reaction to
the result of the last section is as follows. The variety-of-evidence thesis can
break down in the extended model only if the agent has enormous doubts
about the reliability of the evidential source; she must judge it to be at least
82% likely that the source is unreliable. One could thus read the result as
highlighting the danger of using extremely weak evidential sources, rather
than as a direct refutation of the variety-of-evidence thesis. This thesis
could be interpreted as implicitly assuming that the evidential sources are
sufficiently trustworthy to begin with. The fate of the variety-of-evidence
thesis is not yet settled.

Appendix A: Bovens and Hartmann’s Version

One gets the two likelihood ratios $L_I$ and $L_S$ by using the probabilistic
information encoded in figure 1 together with equations (1), (2), and (4). I
start with the likelihood ratio for the independent-reliability version:

$$L_I = \frac{P_I(e_1, e_2 \mid \neg h)}{P_I(e_1, e_2 \mid h)} = \frac{\sum_{R_1, R_2} P_I(e_1 \mid \neg h, R_1)P_I(R_1)P_I(e_2 \mid \neg h, R_2)P_I(R_2)}{\sum_{R_1, R_2} P_I(e_1 \mid h, R_1)P_I(R_1)P_I(e_2 \mid h, R_2)P_I(R_2)}.$$

Given that the terms in the multiplications are either solely about source 1 or
source 2, I factorize by source:

\begin{align}
L_I &= \frac{\prod_{i \in \{1,2\}} \sum_{R_i} P_I(e_i \mid \neg h, R_i)P_I(R_i)}{\prod_{i \in \{1,2\}} \sum_{R_i} P_I(e_i \mid h, R_i)P_I(R_i)} \notag \\
&= \frac{\prod_i \left[P_I(e_i \mid \neg h, r_i)P_I(r_i) + P_I(e_i \mid \neg h, \neg r_i)P_I(\neg r_i)\right]}{\prod_i \left[P_I(e_i \mid h, r_i)P_I(r_i) + P_I(e_i \mid h, \neg r_i)P_I(\neg r_i)\right]} \notag \\
&= \frac{\prod_i \left[0 \cdot r_i + a_i\bar{r}_i\right]}{\prod_i \left[1 \cdot r_i + a_i\bar{r}_i\right]} = \frac{a_1a_2\bar{r}_1\bar{r}_2}{(r_1 + a_1\bar{r}_1)(r_2 + a_2\bar{r}_2)}, \tag{A1} \\
L_I &= \frac{(a\bar{r})^2}{(r + a\bar{r})^2}. \tag{A2}
\end{align}

The third line results from plugging in the parameter values and then
simplifying; the fourth line imposes the ceteris paribus condition.

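I do the same for the shared-reliability version:

\begin{align}
L_S &= \frac{P_S(e_1, e_2 \mid \neg h)}{P_S(e_1, e_2 \mid h)} = \frac{\sum_R P_S(e_1 \mid \neg h, R)P_S(e_2 \mid \neg h, R)P_S(R)}{\sum_R P_S(e_1 \mid h, R)P_S(e_2 \mid h, R)P_S(R)} \notag \\
&= \frac{P_S(e_1 \mid \neg h, r)P_S(e_2 \mid \neg h, r)P_S(r) + P_S(e_1 \mid \neg h, \neg r)P_S(e_2 \mid \neg h, \neg r)P_S(\neg r)}{P_S(e_1 \mid h, r)P_S(e_2 \mid h, r)P_S(r) + P_S(e_1 \mid h, \neg r)P_S(e_2 \mid h, \neg r)P_S(\neg r)} \notag \\
&= \frac{0 \cdot 0 \cdot r + a_1a_2\bar{r}}{1 \cdot 1 \cdot r + a_1a_2\bar{r}} = \frac{a_1a_2\bar{r}}{r + a_1a_2\bar{r}}, \tag{A3} \\
L_S &= \frac{a^2\bar{r}}{r + a^2\bar{r}}. \tag{A4}
\end{align}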

Note that from (A1) or (A3), it can easily be seen that the posterior belief in
$h$ when only one positive report is known is

$$P(h \mid e_i) = \frac{h_0}{h_0 + \bar{h}_0 L_i}, \quad\text{where}\quad L_i = \frac{a_i\bar{r}_i}{r_i + a_i\bar{r}_i}. \tag{A5}$$

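The relation between $P^*_I(h)$ and $P^*_S(h)$ holds as stated by the variety-of-
evidence thesis if and only if

\begin{align*}
\frac{h_0}{h_0 + \bar{h}_0L_I} > \frac{h_0}{h_0 + \bar{h}_0L_S}
&\Leftrightarrow L_S > L_I \Leftrightarrow \frac{a^2\bar{r}}{r + a^2\bar{r}} > \frac{a^2\bar{r}^2}{(r + a\bar{r})^2} \\
&\Leftrightarrow (r + a\bar{r})^2 > \bar{r}(r + a^2\bar{r}) \Leftrightarrow r^2 + 2a\bar{r}r + a^2\bar{r}^2 > \bar{r}r + a^2\bar{r}^2 \\
&\Leftrightarrow (1 - \bar{r}) + 2a\bar{r} > \bar{r} \Leftrightarrow 1 > 2\bar{r} - 2a\bar{r} \Leftrightarrow .5 > \bar{a}\bar{r}.
\end{align*}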
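
The threshold just derived can be spot-checked numerically. The following sketch implements equations (A2) and (A4) directly and tests whether $L_S > L_I$ on either side of $\bar{a}\bar{r} = .5$; the function names and the two parameter pairs are illustrative assumptions.

```python
def lr_independent_bh(r, a):
    """Equation (A2): randomizing sources with independent reliabilities."""
    return (a * (1 - r)) ** 2 / (r + a * (1 - r)) ** 2


def lr_shared_bh(r, a):
    """Equation (A4): randomizing sources with one shared reliability variable."""
    return a ** 2 * (1 - r) / (r + a ** 2 * (1 - r))


def thesis_holds_bh(r, a):
    """Independent reports confirm more exactly when L_S > L_I."""
    return lr_shared_bh(r, a) > lr_independent_bh(r, a)


# Arbitrary illustrative pairs on either side of the .5 threshold.
for r, a in [(0.5, 0.5), (0.1, 0.2)]:
    print(f"r={r}, a={a}: a-bar * r-bar = {(1 - a) * (1 - r):.2f}, "
          f"thesis holds: {thesis_holds_bh(r, a)}")
```

The second pair has $\bar{a}\bar{r} = .72 > .5$, so in Bovens and Hartmann’s version the shared-reliability reports come out as more confirmatory there.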

Appendix B: Model with Unreliability as Systematic Bias

I compute the likelihood ratio for the independent-reliability version:

\begin{align*}
L_{I'} &= \frac{P_{I'}(e_1, e_2 \mid \neg h)}{P_{I'}(e_1, e_2 \mid h)} \\
&= \frac{\prod_i \left[P_{I'}(e_i \mid \neg h, r_i)P_{I'}(r_i) + P_{I'}(e_i \mid \neg h, b^h_i)P_{I'}(b^h_i) + P_{I'}(e_i \mid \neg h, b^{\neg h}_i)P_{I'}(b^{\neg h}_i)\right]}{\prod_i \left[P_{I'}(e_i \mid h, r_i)P_{I'}(r_i) + P_{I'}(e_i \mid h, b^h_i)P_{I'}(b^h_i) + P_{I'}(e_i \mid h, b^{\neg h}_i)P_{I'}(b^{\neg h}_i)\right]} \\
&= \frac{\prod_i \left[0 \cdot r_i + 1 \cdot a_i\bar{r}_i + 0 \cdot \bar{a}_i\bar{r}_i\right]}{\prod_i \left[1 \cdot r_i + 1 \cdot a_i\bar{r}_i + 0 \cdot \bar{a}_i\bar{r}_i\right]} = \frac{a_1a_2\bar{r}_1\bar{r}_2}{(r_1 + a_1\bar{r}_1)(r_2 + a_2\bar{r}_2)}.
\end{align*}

The last expression is identical to the right-hand side of equation (A1),
which proves that my model and Bovens and Hartmann’s model agree when
sources are reliability independent.

I do the same for the shared-reliability version:

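\begin{align*}
L_{S'} &= \frac{P_{S'}(e_1, e_2 \mid \neg h)}{P_{S'}(e_1, e_2 \mid h)} = \frac{\sum_R P_{S'}(e_1 \mid \neg h, R)P_{S'}(e_2 \mid \neg h, R)P_{S'}(R)}{\sum_R P_{S'}(e_1 \mid h, R)P_{S'}(e_2 \mid h, R)P_{S'}(R)} \\
&= \frac{0 \cdot 0 \cdot r + 1 \cdot 1 \cdot a\bar{r} + 0 \cdot 0 \cdot \bar{a}\bar{r}}{1 \cdot 1 \cdot r + 1 \cdot 1 \cdot a\bar{r} + 0 \cdot 0 \cdot \bar{a}\bar{r}} = \frac{a\bar{r}}{r + a\bar{r}}.
\end{align*}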

The last expression does not give equation (A3); that is, this version of the
shared-reliability situation does not agree with Bovens and Hartmann’s
version. In fact, it is equal to equation (A5), which expresses the likelihood
ratio for a single positive report.

I now prove that independent reliability is always better. To fulfill the
ceteris paribus clause, I again assume that the prior $h_0$ is the same for both
models, that $a_1 = a_2 = a$, and that $r_1 = r_2 = r$. Then we have

$$P^*_{I'}(h) > P^*_{S'}(h) \Leftrightarrow L_{I'} < L_{S'} \Leftrightarrow \frac{a^2\bar{r}^2}{(r + a\bar{r})^2} < \frac{a\bar{r}}{r + a\bar{r}} \Leftrightarrow a\bar{r} < r + a\bar{r} \Leftrightarrow 0 < r.$$

Thus, as soon as we have a nonnull prior probability that the evidential
sources are reliable, reliability-independent sources are epistemically
preferable.

Appendix C: Extended Model

1. Posterior Belief in the Hypothesis. I focus on the likelihood ratio:

\begin{align*}
L_F &= \frac{P_F(e_1, e_2 \mid \neg h)}{P_F(e_1, e_2 \mid h)} = \frac{\sum_{R_1, R_2} P_F(e_1, e_2, R_1, R_2 \mid \neg h)}{\sum_{R_1, R_2} P_F(e_1, e_2, R_1, R_2 \mid h)} \\
&= \frac{\sum_{R_1, R_2} P_F(e_1 \mid \neg h, R_1)P_F(e_2 \mid \neg h, R_2)P_F(R_1, R_2)}{\sum_{R_1, R_2} P_F(e_1 \mid h, R_1)P_F(e_2 \mid h, R_2)P_F(R_1, R_2)} \\
&= \frac{P_F(b^h_i, b^h_j)}{P_F(r_i, r_j) + 2P_F(r_i, b^h_j) + P_F(b^h_i, b^h_j)} \\
&= \frac{q_{hh}}{q_{rr} + 2q_{rh} + q_{hh}} = \left(1 + \frac{q_{rr} + 2q_{rh}}{q_{hh}}\right)^{-1},
\end{align*}

where the second-to-last line uses the information in table 2 and the last line
uses table 3, panel A.

2. Conditions for the Off-Diagonal Elements. Using the notation from
table 3, panel A, I rewrite my condition that the marginal probabilities of $R_1$
and $R_2$ are not a function of $d$:

\begin{align*}
P(r_i) &= r = q_{rr} + q_{rh} + q_{r\neg h}, \\
P(b^h_i) &= \bar{r}a = q_{rh} + q_{hh} + q_{h\neg h}, \\
P(b^{\neg h}_i) &= \bar{r}\bar{a} = q_{r\neg h} + q_{h\neg h} + q_{\neg h\neg h}.
\end{align*}

I then solve this system of equations for the off-diagonal elements in terms of
the diagonal elements, $r$ and $a$ (I omit the simple algebraic manipulations).

\begin{align}
q_{rh} &= r + \bar{r}a - .5(1 + q_{rr} + q_{hh} - q_{\neg h\neg h}), \notag \\
q_{r\neg h} &= r + \bar{r}\bar{a} - .5(1 + q_{rr} - q_{hh} + q_{\neg h\neg h}), \tag{C1} \\
q_{h\neg h} &= .5(1 + q_{rr} - q_{hh} - q_{\neg h\neg h}) - r. \notag
\end{align}

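3. The Derivative of the Likelihood Ratio. We can rewrite the likelihood
ratio in (11) by using information from the system of equations (C1):

\begin{align*}
L_F &= \frac{q_{hh}}{q_{rr} + 2q_{rh} + q_{hh}} = \frac{q_{hh}}{q_{rr} + q_{hh} + 2(1 - \bar{r}\bar{a}) - 1 - q_{rr} - q_{hh} + q_{\neg h\neg h}} \\
&= \frac{q_{hh}}{1 - 2\bar{r}\bar{a} + q_{\neg h\neg h}} = \frac{(\bar{r}a)^{1+d}}{1 - 2\bar{r}\bar{a} + (\bar{r}\bar{a})^{1+d}},
\end{align*}

where the last equality uses condition (12). I take the derivative with respect
to $d$:

\begin{align*}
\frac{\partial L_F}{\partial d} &= \frac{(\bar{r}a)^{1+d}\ln(\bar{r}a)\left(1 - 2\bar{r}\bar{a} + (\bar{r}\bar{a})^{1+d}\right) - (\bar{r}a)^{1+d}(\bar{r}\bar{a})^{1+d}\ln(\bar{r}\bar{a})}{\left[1 - 2\bar{r}\bar{a} + (\bar{r}\bar{a})^{1+d}\right]^2} \\
&= \frac{(\bar{r}a)^{1+d}\left[(1 - 2\bar{r}\bar{a})\ln(\bar{r}a) + (\bar{r}\bar{a})^{1+d}\left(\ln(\bar{r}a) - \ln(\bar{r}\bar{a})\right)\right]}{\left[1 - 2\bar{r}\bar{a} + (\bar{r}\bar{a})^{1+d}\right]^2} \\
&= \frac{(\bar{r}a)^{1+d}\left[(1 - 2\bar{r}\bar{a})\ln(\bar{r}a) + (\bar{r}\bar{a})^{1+d}\ln(a/\bar{a})\right]}{\left[1 - 2\bar{r}\bar{a} + (\bar{r}\bar{a})^{1+d}\right]^2}.
\end{align*}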
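
As a sanity check on the algebra, the analytical derivative can be compared with a central finite difference of $L_F$. This is only a numerical sketch at one arbitrary interior point; the function names, the step size, and the parameter values are assumptions made for the illustration.

```python
import math


def L_F(r, a, d):
    """Closed form of the likelihood ratio after substituting (C1) and (12)."""
    rb, ab = 1 - r, 1 - a
    return (rb * a) ** (1 + d) / (1 - 2 * rb * ab + (rb * ab) ** (1 + d))


def dL_F_dd(r, a, d):
    """Analytical derivative of L_F with respect to d (last line of the derivation)."""
    rb, ab = 1 - r, 1 - a
    bracket = (1 - 2 * rb * ab) * math.log(rb * a) + (rb * ab) ** (1 + d) * math.log(a / ab)
    return (rb * a) ** (1 + d) * bracket / (1 - 2 * rb * ab + (rb * ab) ** (1 + d)) ** 2


def central_difference(r, a, d, step=1e-6):
    """Numerical approximation of the same derivative."""
    return (L_F(r, a, d + step) - L_F(r, a, d - step)) / (2 * step)


r, a, d = 0.3, 0.4, 0.5  # arbitrary interior point
print(dL_F_dd(r, a, d), central_difference(r, a, d))
```

The two printed numbers should agree to several decimal places if the derivation is transcribed correctly.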

The variety-of-evidence thesis maintains that $\partial L_F/\partial d < 0$ for all admissible
parameter values. Verifying this:

$$\frac{\partial L_F}{\partial d} < 0 \Leftrightarrow (1 - 2\bar{r}\bar{a})\underbrace{\ln(\bar{r}a)}_{<0} + \underbrace{(\bar{r}\bar{a})^{1+d}}_{>0}\ln\left(\frac{a}{\bar{a}}\right) < 0, \tag{C2}$$

which does not hold for some combination of parameter values (see fig. 4).
The fact that two distinct regions of $a \times r$ lead to a reversal of the inequality
can be seen from expression (C2). The first term is positive (i.e.,
contributing to a reversal of the relationship) if and only if $(1 - 2\bar{r}\bar{a}) < 0$, or more
intuitively, $.5 < \bar{r}\bar{a}$. The second term is positive if and only if $a > .5$. Since
$.5 < \bar{r}\bar{a}$ requires $\bar{a} > .5$, that is, $a < .5$, it follows that the two terms cannot be
positive at the same time.

4. Posterior Belief in Reliability. Having a single reliable source is
sufficient for the two positive reports $e_1$ and $e_2$ to be truth revealing. The
posterior belief that at least one source is reliable is

\begin{align*}
P(r_1 \cup r_2 \mid e_1, e_2) &= P(r_1, r_2 \mid e_1, e_2) + 2P(r_i, b^h_j \mid e_1, e_2) + 2\underbrace{P(r_i, b^{\neg h}_j \mid e_1, e_2)}_{=0} \\
&= \frac{P(e_1, e_2 \mid r_1, r_2)P(r_1, r_2) + 2P(e_1, e_2 \mid r_i, b^h_j)P(r_i, b^h_j)}{P(e_1, e_2)}, \\
P^*(r_1 \cup r_2) &= \frac{h_0(q_{rr} + 2q_{rh})}{h_0(q_{rr} + 2q_{rh}) + q_{hh}} = \left(1 + \frac{q_{hh}}{h_0(q_{rr} + 2q_{rh})}\right)^{-1}.
\end{align*}

For ease of manipulation, I use the last equality to define the variable $DT$
(for distrust):

$$DT = P^*(r_1 \cup r_2)^{-1} - 1 = \frac{q_{hh}}{h_0(q_{rr} + 2q_{rh})}.$$

Reusing a result in appendix C.1, I also define a variable $C$ (for strength of
confirmation):

$$C = L_F^{-1} - 1 = \frac{q_{rr} + 2q_{rh}}{q_{hh}}.$$

Distrust and confirmation are related as

$$C = \frac{1}{h_0\,DT},$$

from which it follows that

$$\frac{\partial C}{\partial d} > 0 \Leftrightarrow \frac{\partial DT}{\partial d} < 0 \Leftrightarrow \frac{\partial P^*(r_1 \cup r_2)}{\partial d} > 0.$$

In words, confirmation increases with reliability independence if and only if
posterior trust in the sources increases.
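
The link between distrust and confirmation can be illustrated numerically. The sketch below rebuilds the relevant $q$’s from condition (12) and system (C1), computes the posterior trust and the confirmation measure, and checks that $C = 1/(h_0\,DT)$ at one arbitrary point. Function names and parameter values are illustrative assumptions.

```python
def table_q(r, a, d):
    """Return (q_rr, q_rh, q_hh) from condition (12) and system (C1)."""
    rb, ab = 1 - r, 1 - a
    q_rr, q_hh, q_nn = r ** (1 + d), (rb * a) ** (1 + d), (rb * ab) ** (1 + d)
    q_rh = r + rb * a - 0.5 * (1 + q_rr + q_hh - q_nn)
    return q_rr, q_rh, q_hh


def posterior_trust(h0, r, a, d):
    """P*(r_1 or r_2): posterior probability that at least one source is reliable."""
    q_rr, q_rh, q_hh = table_q(r, a, d)
    return h0 * (q_rr + 2 * q_rh) / (h0 * (q_rr + 2 * q_rh) + q_hh)


def confirmation(r, a, d):
    """C = 1/L_F - 1, the ratio in expression (14)."""
    q_rr, q_rh, q_hh = table_q(r, a, d)
    return (q_rr + 2 * q_rh) / q_hh


h0, r, a, d = 0.5, 0.3, 0.4, 0.5  # arbitrary illustrative values
distrust = 1 / posterior_trust(h0, r, a, d) - 1
print(confirmation(r, a, d), 1 / (h0 * distrust))
```

Since $C$ is a decreasing function of $DT$ for fixed $h_0$, any change in $d$ that lowers posterior trust must also lower confirmation, and vice versa.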
REFERENCES

Bovens, Luc, and Stephan Hartmann. 2002. “Bayesian Networks and the Problem of Unreliable Instruments.” Philosophy of Science 69 (1): 29–72.
Bovens, Luc, and Stephan Hartmann. 2003. Bayesian Epistemology. Oxford: Oxford University Press.
CERN. 2012. “Neutrinos Sent from CERN to Gran Sasso Respect the Cosmic Speed Limit.” ScienceDaily, June 8. http://www.sciencedaily.com/releases/2012/06/120608152339.htm.
Contaldi, Carlo R. 2011. “The OPERA Neutrino Velocity Result and the Synchronisation of Clocks.” ArXiv, Cornell University Library. http://arxiv.org/abs/1109.6160.
Earman, John. 1992. Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory. Cambridge, MA: MIT Press.
Fitelson, Branden. 1996. “Wayne, Horwich, and Evidential Diversity.” Philosophy of Science 63 (4): 652–60.
Fitelson, Branden. 2001. “A Bayesian Account of Independent Evidence with Applications.” Philosophy of Science 68 (3): S123–S140.
Hartmann, Stephan. 2008. “Modeling in Philosophy of Science.” In Representation, Evidence, and Justification: Themes from Suppes, ed. Michael Frauchiger and Wilhelm K. Essler, 95–121. Heusenstamm: Ontos.
Horwich, Paul. 1982. Probability and Evidence. Cambridge: Cambridge University Press.
Horwich, Paul. 1998. “Wittgensteinian Bayesianism.” In Philosophy of Science: The Central Issues, ed. Martin Curd and Jan A. Cover, 607–24. New York: Norton.
Howson, Colin, and Peter Urbach. 1993. Scientific Reasoning: The Bayesian Approach. 2nd ed. Chicago: Open Court.
ICARUS Collaboration. 2012. “Measurement of the Neutrino Velocity with the ICARUS Detector at the CNGS Beam.” Physics Letters B 713 (1): 17–22.
Istituto Nazionale di Fisica Nucleare. 2011. “Particles Appear to Travel Faster than Light: OPERA Experiment Reports Anomaly in Flight Time of Neutrinos.” ScienceDaily, September 23. http://www.sciencedaily.com/releases/2011/09/110923084425.htm.
Lalive, Rafael, Jan van Ours, and Josef Zweimüller. 2006. “How Changes in Financial Incentives Affect the Duration of Unemployment.” Review of Economic Studies 73 (4): 1009–38.
Myrvold, Wayne C. 1996. “Bayesianism and Diverse Evidence: A Reply to Andrew Wayne.” Philosophy of Science 63 (4): 661–65.
OECD (Organization for Economic Cooperation and Development). 2006. OECD Employment Outlook: Boosting Jobs and Incomes. Paris: OECD.
Pearl, Judea. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Francisco: Kaufmann.
Shogenji, Tomoji. 2005. “Justification by Coherence from Scratch.” Philosophical Studies 125: 305–25.
Sober, Elliott. 1989. “Independent Evidence about a Common Cause.” Philosophy of Science 56: 275–87.
Wayne, Andrew. 1995. “Bayesianism and Diverse Evidence.” Philosophy of Science 62 (1): 111–21.
Wheeler, Gregory. 2009. “Focused Correlation and Confirmation.” British Journal for the Philosophy of Science 60 (1): 79–100.
Wheeler, Gregory, and Richard Scheines. 2011. “Causation, Association and Confirmation.” In Explanation, Prediction, and Confirmation, ed. Dennis Dieks, Wenceslao J. Gonzalez, Stephan Hartmann, Thomas Uebel, and Marcel Weber, 37–51. Dordrecht: Springer.