A Pragmatist Theory of Evidence

Julian Reiss*†

Philosophy of Science 82 (3), July 2015, pp. 341–362.

Two approaches to evidential reasoning compete in the biomedical and social sciences: the experimental and the pragmatist. Whereas experimentalism has received considerable philosophical analysis and support since the times of Bacon and Mill (and continues to enjoy attention and support in very recent work on causation and evidence), pragmatism about evidence has been neither articulated nor defended. The overall aim is to fill this gap and develop a theory that articulates the latter. The main ideas of the theory will be illustrated and supported by a case study on the smoking/lung cancer controversy in the 1950s.

1. Introduction. There are two paradigms of reasoning from evidence at work in the biomedical and social sciences (cf. Parascandola 2004). There is, on the one hand, the experimental paradigm, according to which randomized experiments constitute the 'gold standard' of evidence and all other methods are assessed in terms of how closely they resemble the gold standard. The experimental paradigm is currently dominant in all the domains labeled 'evidence-based', which include parts of medicine, dentistry, nursing, psychology, education, social policy, and criminal justice, but also parts of development economics.

There is, on the other hand, the pragmatist paradigm, according to which scientific claims are inferred, using pragmatic criteria, from diverse bodies of evidence that may but need not include experiments. Many scientists across the biomedical and social sciences subscribe to the pragmatist paradigm, albeit usually less candidly than the proponents of experimentalism.

Received December 2013; revised November 2014.

*To contact the author, please write to: Department of Philosophy, Durham University, Durham DH1 3HN, UK; e-mail: julian.reiss@durham.ac.uk.

†A previous draft of this paper was discussed with the Centre for Humanities Engaging Science and Society (CHESS) research group at Durham University and improved considerably.
Thanks also to Bert Leuridan for comments. Financial support from projects FFI2008-01580/Consolider Ingenio CSD2009-0056 and FFI2011-23267 of the Spanish Ministry of Science and Innovation is gratefully acknowledged.

The experimental paradigm has received considerable philosophical analysis and support since the times of Bacon and Mill. Indeed, Mill's methods are best understood as accounts of controlled experimentation, and more recent work on evidence and causality can be used to underwrite randomized controlled trials (Mayo 1996; Woodward 2003; Cartwright 2007). Even the philosophical literature that takes a critical stance toward evidence-based medicine, policy, and practice tends to focus on the virtues and vices of randomized experimentation.

The pragmatist paradigm is much harder to articulate and defend. Among other things, the paradigm seems to raise more questions than it answers: What are the supposedly 'pragmatic criteria'? What is a diverse 'body' of evidence? Just how 'diverse' does it have to be? And how do we know what is to be included (as evidence) if there's no standard against which to judge?

The aim of this article is to answer these questions. More broadly speaking, I aim to develop a theory of evidence that articulates the pragmatist paradigm and serves as an alternative to the experimentalist paradigm that currently dominates the discussion.

As the pragmatist theory of evidence is to serve as an alternative to a paradigm that takes randomized experimentation as the gold standard, I will focus on scientific domains where randomized experiments can be and are frequently employed. This includes the domains mentioned above but excludes all those domains where controlled experiments are effectively epistemic engines, such as large parts of physics and chemistry and basic/in vitro research in the biomedical sciences. I shall also exclude historical sciences such as cosmology, astronomy, astrophysics, geology, palaeontology, and archaeology. I do believe that the proposed account can be extended, but I will leave the extension to future work.

2. Preliminaries. Before developing the theory, I need to prepare the ground by distinguishing between two concepts of evidence, both of which are needed in a satisfactory theory of evidence, laying out a number of desiderata a good theory should satisfy, and describing a number of caveats for this article.

When we say we have evidence e for a scientific hypothesis h, we may have either of two importantly different meanings in mind (Salmon 1975). We might mean that e is a 'mark' or 'sign' or 'symptom' of the hypothesis being true, that e is a piece of evidence for h. A correlation, say, between two variables I and D is evidence in this sense for the hypothesis h: 'I causes D.'1 To learn that I and D are correlated supports (speaks in favor of) the hypothesis without yet constituting a reason to infer the hypothesis, even a weak one. This notion of evidence has therefore also been referred to as "supporting evidence" (Rescher 1958, 83). I will, more concisely, call it 'support'.

1. I use the variables I for 'independent' and D for 'dependent' variable instead of, say, C and E for 'cause' and 'effect' in order to indicate that the causal relation is merely putative.
Alternatively, when we say that we have evidence e for a scientific hypothesis h, we may mean that we have 'proof' or 'warrant' that h, or that e constitutes a '(weak, strong, etc.) reason to infer' h, or that e is a 'body of evidence' for h. It is harder to find an unequivocal and simple example for this type, but suppose that the correlation between I and D was established in a well-designed randomized trial, treatment and control group are known to be balanced, greatest care was taken to avoid coding and measurement error, and so on; then this body of knowledge together constitutes what I will call 'warranting evidence' or 'warrant'.

The distinction I have in mind can be illustrated by a kind of interrogation that is familiar from murder mysteries on TV. When the detective investigating the murder case asks someone who is in one way or another related to the murder (by being involved with the victim, being at the crime scene, or what have you) for an alibi, the putative suspect frequently gets defensive and replies, "Do you believe I have anything to do with the murder? I would never . . . !" Detectives then often counter, "I do not believe anything," and then, "I only collect facts," or "I have to exclude this possibility," or "I have to ask this." If a putative suspect does not have an alibi, this is a piece of information that speaks in favor of (or does not speak against and is relevant to) the hypothesis that the putative suspect was the murderer. As such it has nothing whatsoever to do with belief. It can, in a different process (which often occurs at a later stage but can also be simultaneous), lead to a belief revision and inference to a hypothesis. However, to collect facts and to make up one's mind (i.e., to infer a hypothesis) are two different activities. 'Support' relates to the collection of facts; 'warrant', to making up one's mind. 'Evidence', unfortunately, conflates the two.

A good theory of evidence should explicate both support and warrant. We need, on the one hand, criteria or guidelines that tell us what kinds of facts we have to collect in order to evaluate a hypothesis; we need to know what facts are relevant to the hypothesis. We need, on the other hand, criteria or guidelines that tell us how to assess the hypothesis, given the facts we've collected in its support, or, conversely, criteria or guidelines that tell us how much support of what kind we need in order to achieve a given degree of warrant. We require criteria or guidelines that translate between knowledge of the facts relevant to a hypothesis and judgments about the hypothesis. A theory of evidence that didn't tell us about relevance would be impracticable; a theory that didn't tell us about assessment would not be useful. Here, then, is a first desideratum: the theory should be a theory of both support and warrant.

Further, it is clear that warrant comes in degrees. We can have better and worse reasons to infer a hypothesis; a hypothesis can be more or less warranted. Thus, the second desideratum is that the theory is informative about the degree to which evidence warrants a hypothesis. There is no presumption here that degrees of warrant are probabilities, only that the theory allows hypotheses to be weakly ordered, at least sometimes, with respect to warrant.
Lastly, for a theory to be useful it should tell us about warrant in both ideal and nonideal epistemic circumstances. Consider the following ideal theory of evidence:

(ITE) Hypothesis h is strongly warranted if and only if the results e of a flawless randomized controlled trial fit h (on a suitable notion of 'fit').

This would presumably get the judgment right in cases where it does apply, but it would seldom if ever apply. A good theory continues to provide useful information when randomization fails or cannot be done, when hypotheses are established by means of observational studies, when knowledge of the phenomena of interest is limited, and so on. In sum, a good theory of evidence

1. distinguishes support and warrant;
2. provides an account of evidential support;
3. provides an account of warrant that allows warrant to come in degrees; and
4. applies to nonideal circumstances typical of science in practice.

Like John Norton, I maintain that justification for inductive inferences is local and material (e.g., Norton 2003). One cannot say very much about evidence and how it supports hypotheses at a level of high generality. Here I will focus on a specific type of scientific hypothesis in a relatively small range of domains. My examples concern scientific hypotheses expressing

• causal relations;
• between type-level variables (rather than token-level or relations of actual causation); and
• in those parts of the biomedical and social sciences where randomized experiments can be and are frequently used.

3. Support: The Eliminativist Hypothetico-Contextualist (EHC) Framework. The pragmatist theory of evidence proposed here is remotely related to the hypothetico-deductive theory of confirmation. Hypothetico-deductivism (HD), once defended by prominent philosophers of science (Ayer 1936/1971; Popper 1963; Hempel 1966), has had bad press in philosophy for over 30 years. Clark Glymour once called it "hopeless" (Glymour 1980). I will argue in what follows that the philosophers' obituaries have been premature and that we should not give up on the basic idea behind the theory. My own framework therefore retains the 'hypothetico' of HD. Philosophical critics are mistaken because they focus on the logical properties of the 'deductive' part of the theory. I will argue that the fault lies with this interpretation of the theory, not with the core idea behind the theory itself. While my account differs considerably from standard HD as described by philosophers, it captures well the kind of reasoning we find in scientific practice (in the areas on which I focus here).

HD holds that that which is deductively entailed by a hypothesis provides support for it. More precisely,

(Standard-HD) A statement e provides support for hypothesis h if and only if h (possibly in conjunction with suitable background knowledge) deductively entails e.

To be deductively entailed by a hypothesis is, however, neither necessary nor sufficient for providing support. Hypotheses typically do not entail anything about the specific data sets that are used in their support. For example, statistical hypotheses do not entail any statements describing particular data sets; causal hypotheses do not entail statements about correlations or invariance. The first example is straightforward. Suppose we observe a series of 50 heads in 50 coin tosses. This is certainly support for the hypothesis that the coin is biased.
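The asymmetry can be conveyed with a toy calculation (mine, not drawn from any study; the 0.9 bias figure is purely illustrative): the observed pattern is astronomically unexpected under fairness but unremarkable under a strong bias, which is the sense in which it is a pattern we are entitled to expect under the bias hypothesis.

```python
# Toy illustration: neither hypothesis entails the observed sequence,
# but the run of 50 heads is vastly more expected under a strong bias.
p_fair = 0.5 ** 50      # probability of 50 heads under a fair coin
p_biased = 0.9 ** 50    # under a hypothetical 0.9 bias toward heads
print(f"P(50 heads | fair coin) = {p_fair:.1e}")    # ~8.9e-16
print(f"P(50 heads | 0.9 bias)  = {p_biased:.1e}")  # ~5.2e-03
```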
But the hypothesis "This coin is biased" does not entail a description of this particular series of outcomes or any other.

Similarly, there is no guarantee that causal relations induce correlations in the relevant data sets. If, to rehearse a standard counterexample, I causes D via two different routes, say, directly and via an intermediary R as in figure 1, I can be marginally uncorrelated with D even if the variables are correlated conditional on R.2 R might be a variable we don't know about or one that's not measurable.

Figure 1. I causes D through two different routes.

2. I should mention that this example is ruled out by what is called the "faithfulness condition" (Spirtes, Glymour, and Scheines 2000) or "stability condition" (Pearl 2000). Spirtes et al. argue that an exact cancellation of influences through the two different routes has Lebesgue measure zero. Their argument has been criticized, however, for two principal reasons. First, exact cancellations are often what we try to achieve with policies (Hoover 2001). To the extent that our policies are successful, we should expect cancellations to occur. Second, real-world methods never allow us to determine whether or not an exact cancellation has occurred anyway. Our philosophy should be relevant to science as it is practiced, and from an empirical point of view there is no way of telling whether an exact or rather a near-exact canceling has occurred (Cartwright 1999). Faithfulness and stability are extremely powerful assumptions where they work, but we should not bet on their being universally true axioms.
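The cancellation scenario of figure 1 and footnote 2 is easy to exhibit in a toy linear model (my own construction; the coefficients are chosen so that the direct route, +1, and the route via R, −1, cancel exactly):

```python
# Sketch: I causes D along two routes whose influences cancel exactly,
# so I and D are marginally uncorrelated yet dependent conditional on R.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
I = rng.normal(size=n)
R = I + rng.normal(size=n)        # route via the intermediary R
D = I - R + rng.normal(size=n)    # direct effect (+1) and via R (-1)

print(np.corrcoef(I, D)[0, 1])    # ~0: marginal correlation vanishes
mask = np.abs(R) < 0.1            # crude conditioning: a narrow slice of R
print(np.corrcoef(I[mask], D[mask])[0, 1])  # ~0.58: clearly nonzero
```

The marginal correlation disappears although I is, by construction, a cause of D; holding R (approximately) fixed reveals the dependence.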
At any rate, in our example the support is a marginal correlation between I and D. And a statement describing the marginal correlation is definitely not entailed by the causal hypothesis.

Conversely, any statement entails itself, but no self-respecting biomedical or social scientist would take the truth of a hypothesis as support for itself.

'Set of deductive consequences' is, however, only one way to understand the empirical content of a hypothesis. I propose to regard the relationship between hypothesis and its support as inductive rather than deductive. In particular, to determine the support of a hypothesis, we have to ask what patterns in the data we would expect to hold if the hypothesis were true, given our understanding of how the world works. To use a mundane example, murderers often leave traces on murder weapons. But not to find a suspect's fingerprints on the murder weapon does not demonstrate the falsehood of the hypothesis, because fingerprints on the murder weapon may fail to be detected for any number of reasons: the murderer wore gloves, she wiped them off, she threw the weapon into a river, it rained, a cat licked them off, our fingerprint detection technology failed, a member of our forensic team received a bribe and lied about the result, and so on. Nevertheless, background knowledge concerning how murders happen entitles one to expect a suspect's traces on a murder weapon under the supposition that he or she is the murderer, and the traces, if found, therefore constitute support.3

3. Let me make two remarks at this point. First, there is no sharp distinction between background knowledge, on the one hand, and evidence or support, on the other. In the context of a given case we might distinguish between entrenched beliefs and new information that was produced in order to assess the hypothesis at hand, but there is no presumption to the effect that background knowledge must be true or cannot be challenged. To the contrary, every factual claim that is used in the assessment of a hypothesis can in principle be contested. I discuss the issue of the circumstances under which and the extent to which these claims should be challenged in some detail below. Second, the notion of support is not unlike the Bayesian notion of 'partial entailment', albeit without the probabilities. I cannot make a case against Bayesianism here in any detail. Let me just say that there are no physical probabilities for conditional statements such as "X leaves fingerprints on the murder weapon given X is the murderer" or "I and D are correlated given that I causes D," to assume sharp subjective probabilities is hopelessly unrealistic and misleading, and to assume vague probabilities is to give up most of the advantages of Bayesianism. See Norton (2011) for an elaborate discussion. Possibility and plausibility are the modalities adequate for evidential reasoning, not probability.

Similarly, a correlation is the kind of thing one is entitled to expect to find if a causal hypothesis were true, even though the absence of a correlation does not prove the falsity of the hypothesis. Let us then characterize support as follows:

(S) e provides support for a hypothesis h if and only if e is a pattern in the data we are entitled to expect to obtain under the supposition that h is true (see Hempel 1966, 6).

A second problem of standard-HD has been referred to as the "problem of alternative hypotheses" (see Mayo 1996, chap. 6): if e supports h, it may also support alternatives that are incompatible with h; e on its own does not discriminate between h and the alternatives the supposition of whose truth also entitles us to expect e to obtain. A correlation between two variables I and D may support the causal hypothesis h "I causes D," but it may also support h′ ("D causes I"), h″ ("A common factor C causes both I and D"), h‴ ("The correlation between causally independent variables I and D was induced by conditioning on a common effect E"), and many others.

A straightforward solution to this problem is to postulate that support for a hypothesis is of two kinds: direct support, e_d, which pertains to the hypothesis of interest; and indirect support, e_i, which pertains to the elimination of alternative hypotheses. So far we have only looked at direct support. Indirect support provides another element in 'EHC': eliminativism. Indirect support is given by patterns in the data that are incompatible with the truth of an alternative hypothesis. A suspect's fingerprints on the murder weapon are direct support for the hypothesis that the suspect killed the victim. A second suspect's alibi is indirect support because it helps to eliminate the hypothesis that the second suspect did it. Likewise, if a correlation provides direct support for a causal hypothesis, a study that shows that no common cause could be responsible for the correlation provides indirect support for the hypothesis. Let us then define:

(S-d) e_d provides direct support for a hypothesis h if and only if e_d is a pattern in the data we are entitled to expect to obtain under the supposition that h is true.

(S-i) e_i provides indirect support for a hypothesis h if and only if e_i is a pattern in the data that is incompatible with what we are entitled to expect to obtain under the supposition of the truth of one of h's alternative hypotheses h′, h″, h‴, and so on.
Definition S-i is in fact somewhat ambiguous. In what sense are h′, h″, h‴, and so on, alternatives to h? In the present context they are alternative accounts of the evidence in favor of h. A suspect's accidental arrival at the crime scene and his picking up the murder weapon from the cold body is an alternative account for finding his fingerprints on the weapon (the latter of which speaks in favor of the initial hypothesis). Selection bias is an alternative account for a correlation (the latter of which speaks in favor of the initial hypothesis).

The ambiguity is that there are alternative accounts for all pieces of support, not just for the direct support. A piece of indirect support is a second suspect's alibi. But an alibi is never ascertained with full certainty. Rather, that the second suspect has an alibi is itself inferred from what she says, what others have observed, CCTV recordings, gas station receipts, credit card records, and so on. We thus have different pieces of indirect support that help to ascertain that the second suspect does in fact have an alibi, which rules her out as a perpetrator and thereby supports the initial hypothesis. But, of course, each of these pieces of indirect support also comes with alternative accounts. If she in fact did it, that's a good reason for saying that she was in a restaurant with her girlfriend at the time of the crime. If the girlfriend confirms this, their friendship or other kind of involvement may account for her testimony. The gas station receipt could be someone else's and the credit card record from the restaurant faked. Additional indirect evidence serves to rule out these possibilities.

Thus, each piece of indirect support can itself be accounted for by alternative hypotheses. If, say, a common-cause hypothesis is an alternative account of the correlation and a study that shows that there is no common cause that could account for the correlation is the indirect support, then there are alternative accounts for the results of this study. These too must be eliminated by further indirect support. This leads to the following, amended definition of indirect support:

(S-i*) e_i provides indirect support for a hypothesis h if and only if e_i is a pattern in the data that is incompatible with what we are entitled to expect to obtain under the supposition that (a) an alternative hypothesis able to account for h's direct support is true or (b) an alternative hypothesis able to account for h's prior indirect support is true.

Definition S-i* is not circular despite the occurrence of "indirect support" on the left and on the right of the "if and only if." The lowest-level indirect support is defined in terms of direct support. Higher-level indirect support is defined in terms of lower-level indirect support. However, there can be an infinite regress, namely, when all higher-level pieces of indirect support continue to have alternative hypotheses.
We have now 'solved' two problems of standard-HD by introducing four new problems: (1) How do we know what we are entitled to expect to obtain under the supposition of the truth of a hypothesis? (2) How do we know what the alternatives to h are, and which alternatives of a potentially infinite set of possible alternatives should we consider? (3) How are alternative hypotheses eliminated? (4) How is the infinite regress in definition S-i* stopped? As we will see, the third element of the EHC framework, the context of a causal inquiry, will provide the answers.

4. Causal Inquiries in Context. The notion of expectation employed here is a contextual one. It is the context of a causal inquiry, itself given by background knowledge about how the world works, the nature and purpose of the inquiry, and certain normative commitments, that answers these questions. Let us then turn to an analysis of the contributions of context for each of the four questions raised above.

4.1. The Empirical Content of Causal Hypotheses. In previous work I have argued that a problem with standard theories of causation—probabilistic, regularity, interventionist, and process or mechanism—is that they mistake evidence for whether or not a causal relation is present for the relation itself or for constituting the meaning of causal claims. These are verificationist theories of causation and therefore suffer from the standard objections to verificationism (Reiss 2012a). In the present context, however, the verificationism of these theories is just what we need to determine the empirical content of a hypothesis. When we ask what we'd expect to find if the causal hypothesis "I causes D" were true, the standard theories of causation provide the following answers:4

• a correlation between I and D;
• D's changing after an intervention on I;
• I's being a necessary or sufficient condition or both for D, or I's being an insufficient but nonredundant part of an unnecessary but sufficient (INUS) condition for D;
• a continuous process from I to D;
• a mechanism for the causal relation between I and D.

4. I omitted the counterfactual theory here, which is a sixth 'standard' theory of causation. While I do believe that counterfactual claims about the value of D that would have obtained if the value of I had been different can constitute support for causal hypotheses, their relation to causal claims is rather involved, and they are themselves highly theoretical and require causal background knowledge to be established (Reiss 2009b, 2012b). To tease these complex relationships apart would require more space than I have here and distract from the overall line of the argument.

These expectations stem simply from general background knowledge about how the world works. We know that causal relations typically issue in correlations and regularities and help to bring about change through interventions. In the biomedical and social sciences we also know that causes typically do not produce their outcomes across spatiotemporal gaps and are often 'structured' in the sense of being dependent on underlying systems that are made up of varieties of mechanisms.
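As a minimal illustration of how the first two items in the list translate into checks on a data set, consider the following sketch (a hypothetical linear model of my own; all variable names and numbers are illustrative):

```python
# Sketch: two expected patterns under "I causes D" checked on toy data:
# (1) a correlation between I and D; (2) a shift in D when I is set by
# an intervention that leaves the I -> D mechanism intact.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
I = rng.normal(size=n)
D = 0.5 * I + rng.normal(size=n)            # observational regime

print("corr(I, D):", round(float(np.corrcoef(I, D)[0, 1]), 2))  # ~0.45

D_do = 0.5 * 2.0 + rng.normal(size=n)       # simulated do(I = 2)
print("mean D, observational:", round(float(D.mean()), 2))      # ~0.0
print("mean D under do(I=2) :", round(float(D_do.mean()), 2))   # ~1.0
```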
The last two items require some further comments. Correlations, changes in one variable following an intervention on another, and one variable's being a necessary, sufficient, or INUS condition for another are patterns in the data that, while strictly unobservable, are fairly readily verifiable, given the data (though see the remarks about coding errors below). This is not the case for claims about processes and mechanisms. Claims about processes or mechanisms are themselves established hypothetico-contextually. When we hypothesize that I causes D and that I causes D through a process or that a mechanism is responsible for the relation between I and D, we can make further hypotheses about the mode of action of the process or mechanism. Each hypothesis about one of the parts of the process or mechanism will license certain expectations about patterns in the data that should obtain were the hypothesis true, and finding these patterns (and other patterns that are incompatible with alternative hypotheses about the process or mechanism) will establish the hypothesis. Conjoining a number of such hypotheses will form a complex hypothesis about a process or mechanism, which in turn constitutes support for the original causal hypothesis.

Generally speaking, we have learned over time how causal relations behave, both at a high level of abstraction and concerning more specific causal relations in specific contexts. Whereas just 100 years ago causality was tightly wedded to determinism, we have more recently become accustomed to probabilistic causality. Thus, whereas a century ago we'd have expected that an effect must happen if its cause had (and we could use this expectation to rule out a factor as a cause if its effect does not happen despite its occurrence), for the most part we now expect causal relations to issue, at best, in correlations. Similarly, control over phenomena has been one of the main purposes of learning causal relations since at least Bacon. However, the Lucas critique (Lucas 1976) has taught us that at least in economics we cannot always rely on causal relations for policy (Reiss 2008). It is background knowledge like this that determines the empirical content of causal hypotheses.

Everything said so far pertains to causal inquiries very generally. Focusing on more narrow types of inquiry or specific inquiries provides further information about what and what not to expect. To keep the discussion brief, I will focus my remarks mainly on a single case study, the controversy surrounding the hypothesis that smoking causes lung cancer in the 1950s.

4.2. Considering Alternatives. The direct support of the smoking/lung cancer hypothesis consisted in correlations recorded mainly in retrospective case-control studies, but by the mid-1950s also early results of a prospective study (Doll and Hill 1956). The main alternative account for a correlation is that the correlation is spurious. But 'spurious correlation' is ambiguous, and the way the term is often used is misleading.

Literally speaking, when a correlation is said to be spurious, one means that the correlation is not genuine but merely apparent. This can happen for a variety of reasons. One source is the inadvertent conditioning on the wrong variable. Suppose that I and D are independent, dichotomous variables and one's data collection consists only of individuals where either I = true or D = true. If so, then I and D will be correlated in the data set but not in the general population.5

5. Define B ≡ I ∨ D. I and D are probabilistically independent: Prob(D | I) = Prob(D). Examining a data set that consists only of individuals for which either I or D is true is equivalent to conditioning on B. Conditional on B, I and D are probabilistically dependent: Prob(D | I, B) ≠ Prob(D | B). This is called Berkson's paradox. That the data are selected in this way is not always conspicuous to the researcher.
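The selection structure described in footnote 5 can be simulated directly (a toy sketch of my own; the probabilities are illustrative):

```python
# Sketch of Berkson's paradox: I and D are independent, but restricting
# the sample to cases where B = I or D holds induces dependence.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
I = rng.random(n) < 0.3                    # P(I) = 0.3
D = rng.random(n) < 0.3                    # P(D) = 0.3, independent of I
B = I | D                                  # the selected sample

print("P(D | I)    =", round(D[I].mean(), 2))      # ~0.30 = P(D)
print("P(D | B)    =", round(D[B].mean(), 2))      # ~0.59
print("P(D | I, B) =", round(D[I & B].mean(), 2))  # ~0.30, != P(D | B)
```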
The same happens when one conditions on a joint effect. Other reasons for correlations being spurious include mismeasurement, coding errors, sloppiness in keeping records, deliberate fraud, and so on.

'Spurious correlation' is more frequently but misleadingly used to refer to a confounded causal relation. Here I and D are genuinely correlated, but the correlation is due to a common cause or to causality running in the opposite direction, from D to I.

A third case obtains when statistical properties of time series induce correlations that cannot be causally explained. This is, for instance, the case when two time series monotonically increase (Sober 2001) and, more generally speaking, when the two time series are nonstationary (Hoover 2003). Correlations induced by properties of time series cannot readily be classified as either 'spurious' or 'confounded'.6

6. This is therefore an interesting case for theories of evidence that require that the evidential statement e be true. Whether or not two nonstationary time series (i.e., time series whose moments such as mean and variance change over time) are correlated is controversial. Kevin Hoover argues that they are not; I argue that they are (Hoover 2003; Reiss 2007). The facts about which both parties agree are as follows: if X_t and Y_t are the two nonstationary time series, (1) the Pearson correlation coefficient r(X, Y) ≠ 0, and (2) X and Y are not causally connected. So if our evidential statement is e = "X, Y are correlated," is e true or false?

In general, an empirical reason is required for taking an alternative account of the direct support (or prior indirect support) to be relevant. In the context of a scientific inquiry it would be inappropriate to advance a general skeptical alternative, such as an evil-demon hypothesis (see Goldman 1976, 775). Among the empirical reasons are generic reasons that pertain to all inquiries of a given type and case-specific reasons. When correlations are recorded in observational studies, background knowledge tells us that selection bias is always a relevant alternative. Similarly, Berkson's paradox is a relevant alternative when the studies draw on hospitalized patients.

In the smoking/lung cancer case, both types of alternatives were relevant. One important alternative was Ronald Fisher's 'constitutional hypothesis', according to which a common genetic factor is responsible for the correlation. Joseph Berkson pointed out that there is a danger of bias if the control group is not selected in such a way as to represent (with respect to smoking habits) the general population, which includes the lung cancer patients—which was the case in the retrospective studies that drew on hospitalized patients. Mismeasurement (in this case, 'diagnostic error') too was an alternative that was known to possibly account for the observed correlation.
If many of those who died of other diseases, such as tuberculosis, were classified as lung cancer cases, a spurious association could be generated. At the time it was known, for instance, that an error in tuberculosis diagnosis of only 11% could account for the entire recorded increase in lung cancer (Gilliam 1955). This was, then, certainly a relevant alternative.

A researcher can also show an alternative to be relevant by presenting direct support for it. Fisher supported his views about the smoking/lung cancer link with a study demonstrating that monozygotic twins are more likely to be alike with respect to their smoking behavior than dizygotic twins, even if they were separated at birth (Fisher 1958). This is just what we would expect if genetics played a role in determining smoking behavior. An alternative for which there is direct support I will call a 'salient' alternative. That cancer susceptibility was partly based on genetics was well known at the time. The psychologist Hans Eysenck and his colleagues showed that smoking was related to extroversion, which in turn had a genetic component (Eysenck et al. 1960). A noteworthy feature of that study was that it showed a dose-response effect: the more extroverted a person, the more she smokes.

4.3. Eliminating Alternatives. Alternatives are eliminated by pointing to patterns in the data that are incompatible with what we would expect to be the case were an alternative true. The following are some patterns in the data researchers have used in order to eliminate alternative hypotheses in the smoking/lung cancer case (a toy rendering of the dose-response arithmetic follows the list):

• Confounding. A number of patterns in the data were appealed to in order to eliminate alternative causal accounts. For example, there is a large dose-response effect. Moderate smokers have a ninefold greater risk of developing lung cancer than nonsmokers, while over-two-pack-a-day smokers have at least a 60-fold greater risk. There was no known genetic factor that could produce such a strong effect. A study of lung cancer and blood groups (which were known to have a genetic basis) showed a difference of only 27% (Fisher 1958). Further, there is a strong stopping effect, in that individuals who discontinue smoking have a much lower risk of developing the disease. The genetic factor would therefore have to change over an individual's lifetime, which is highly implausible given what was known about genetics (Cornfield et al. 1959). Another piece of indirect support was that lung cancer prevalence in males increased long before it did in females. If a genetic factor were appealed to in order to explain this observation, there would have to have been a mutation in males first and a few decades later in females, a pattern that had not previously been observed (Cornfield et al. 1959).

• Spurious correlation. In 1951, Doll and Hill sent questionnaires to 40,000 British doctors asking about smoking behavior and subsequently recorded mortality. First results from this study became available in the mid-1950s. These confirmed a dramatic increase in lung cancer risk among smokers but could not be accounted for by Berkson's paradox (Doll and Hill 1956).
As- suming that lung cancer prevalence was stable over time would mean a diagnostic error of only 3% among those 35–44 years of age but 59% among those 75 years or older. Similarly, there would be different rates of diagnostic error for men and women (Gilliam 1955). It is certainly possible that there are different error rates in different patient groups, but that the error in older patients should be an order of magnitude larger than that in younger patients is extremely unlikely. Values do and should play a role in the decision whether or not to reject an alternative in light of incompatible patterns in the data. If little hinges on the decision, we may keep entertaining an alternative even in light of dramatic indirect support. If, by contrast, a decision is likely to have significant wel- fare consequences (as, of course, was the case with respect to alternatives to the causal hypothesis in the smoking/lung cancer case), the standards for rejecting an alternative should be lower. There are no strict rules, however, that map the cost of maintaining a false alternative to a threshold of ‘strength of support’ beyond which it becomes strictly irrational to do so. It is therefore important to note that no amount of incompatible infor- mation can ‘prove’ an alternative wrong. It would not necessarily be irra- tional to continue to maintain that the constitutional hypothesis is correct PRAGMATIST THEORY OF EVIDENCE 353 http://www.jstor.org/page/info/about/policies/terms.jsp in light of a large dose-response effect—perhaps the smoking/lung cancer gene has a very peculiar mode of action. The rejection of an alternative al- ways remains a judgment. Direct support and indirect support suggest a certain decision, but alternative decisions are possible and often defensible. Over half a century on, we may be inclined to think that those on the “right” side of the controversy had objectively better reasons than “those who were wrong.” However, what one finds is “extremely well-written and cogent papers that might have become textbook classics for their impeccable logic and clear exposition of data and argument if only the authors had been on the right side” (Vandenbroucke 1989, 3). Support and logic by themselves do not compel a decision one way or another. 4.4. Ending the Regress. Each piece of indirect support has itself al- ternative hypotheses able to account for it. The study that shows that ge- netic factors can account for only 27% of cancer susceptibility may itself be subject to all sorts of biases, confounding, mismeasurement, error, and fraud. If we tried to rule out every one of these possibilities, we would never reach a stage where we could accept any hypothesis. One suggestion that has been made is that epistemic trust helps to de- termine when to stop (Hardwig 1991). It would be impossible to control for all the potential alternatives; thus, if we didn’t trust others, there would be no scientific knowledge. If a study claims that there is a certain pattern in the data, such as an association between two variables, as a general rule, we take this as a fact.We presume that if we were to replicate the study on the same data set, our investigations would yield the same result. We think that this is so because scientists take a reasonable amount of care when they make public assertions and because peer review constitutes a safeguard against errors. It would be quite naive, however, to hope that epistemic trust can do all the work all the time. 
Values do and should play a role in the decision whether or not to reject an alternative in light of incompatible patterns in the data. If little hinges on the decision, we may keep entertaining an alternative even in light of dramatic indirect support. If, by contrast, a decision is likely to have significant welfare consequences (as, of course, was the case with respect to alternatives to the causal hypothesis in the smoking/lung cancer case), the standards for rejecting an alternative should be lower. There are no strict rules, however, that map the cost of maintaining a false alternative to a threshold of 'strength of support' beyond which it becomes strictly irrational to do so.

It is therefore important to note that no amount of incompatible information can 'prove' an alternative wrong. It would not necessarily be irrational to continue to maintain that the constitutional hypothesis is correct in light of a large dose-response effect—perhaps the smoking/lung cancer gene has a very peculiar mode of action. The rejection of an alternative always remains a judgment. Direct support and indirect support suggest a certain decision, but alternative decisions are possible and often defensible. Over half a century on, we may be inclined to think that those on the "right" side of the controversy had objectively better reasons than "those who were wrong." However, what one finds is "extremely well-written and cogent papers that might have become textbook classics for their impeccable logic and clear exposition of data and argument if only the authors had been on the right side" (Vandenbroucke 1989, 3). Support and logic by themselves do not compel a decision one way or another.

4.4. Ending the Regress. Each piece of indirect support has itself alternative hypotheses able to account for it. The study that shows that genetic factors can account for only 27% of cancer susceptibility may itself be subject to all sorts of biases, confounding, mismeasurement, error, and fraud. If we tried to rule out every one of these possibilities, we would never reach a stage where we could accept any hypothesis.

One suggestion that has been made is that epistemic trust helps to determine when to stop (Hardwig 1991). It would be impossible to control for all the potential alternatives; thus, if we didn't trust others, there would be no scientific knowledge. If a study claims that there is a certain pattern in the data, such as an association between two variables, as a general rule we take this as a fact. We presume that if we were to replicate the study on the same data set, our investigations would yield the same result. We think that this is so because scientists take a reasonable amount of care when they make public assertions and because peer review constitutes a safeguard against errors.

It would be quite naive, however, to hope that epistemic trust can do all the work all the time. We don't have to appeal to scandals such as that about Vioxx in pharmaceutical research (Biddle 2007) or AusterityGate in economics (Reiss 2014) to see that. On the one hand, there are general statistical reasons to believe that "most published research findings are false" (Ioannidis 2005, e124). On the other hand, oftentimes there will be more specific reasons to mistrust particular findings or claims.

In such environments it is hard to take the bulk of published research results at face value. Nevertheless, we sometimes have to do that, or there would be no scientific progress. The following are some pragmatic guidelines that can help end the regress. The first is a general, philosophically motivated rule.

• Default entitlement: as a default rule, scientists are entitled to each other's claims. They should probe claims only when there are domain- or case-specific reasons to do so. Justification in the sciences can be said to have what Robert Brandom calls a "default and challenge structure" (1994, 177). Scientists are entitled to each other's claims in the absence of appropriate reasons to think that they are not so entitled. When entitlements are challenged, the reasons given must be relevant in the context of a given causal inquiry. I understand claims broadly to include the main study results but also claims about raw data, as well as the protocols and methods used.

When there are relevant reasons to think that previous results should be probed, the following supporting guidelines may help. In each case the guideline was at work in the smoking/lung cancer case, but each can independently be motivated and defended. The list is, of course, not meant to be exhaustive of the kinds of rules scientists use to eliminate alternative hypotheses.

• Effect size: the larger the effect size a study reports, the smaller the need for probing the result. Large effects can be a great help to the elimination of alternative explanations because alternatives become intolerably implausible. This criterion has limitations: it works only with some kinds of alternatives (e.g., not if fraud is suspected) and only in some circumstances (namely, when effect sizes are predicted and large), but it can help greatly where it works.

• Manner and timing of the effect: the more specifically the manner and timing of the effect match the expectation, the smaller the need for probing the result. Like effect size, the timing and manner of the effect can also be of great help with the elimination of alternative accounts. We may expect some pattern in the data on the assumption of a given alternative at some relatively abstract level of description, but, with luck, not at a more microscopic level. If smoking causes lung cancer, we expect more frequent smokers to have a higher risk, we expect stopping to have a beneficial effect, we expect the cancer to develop some time after an individual has taken up smoking rather than immediately, and so on.

• Study characteristics: the smaller the number of background assumptions that are needed to derive a study result and the smaller the inferential gap between data and result, the smaller the need for probing the result. There are large differences in the number and kind of inferences made between studies. Many involve highly sophisticated statistical techniques and shaky background assumptions.
Others proceed on the grounds of well-entrenched procedures that have been around for decades or even centuries. Yet others may simply report summaries of the data in the form of histograms and tables, or even the raw data themselves. All study results and all aspects of an individual study can in principle have alternative accounts. However, to the extent that claims involve a minimum of unproblematic inferences, and unless there are overwhelming reasons to believe otherwise, these claims can be (tentatively) accepted without further probing.

• Economic and other normative considerations: take into account economic and other costs and benefits when deciding to stop or continue probing the indirect support for a hypothesis. Causal inquiry does not come for free. There are direct, opportunity, and ethical costs. These costs have to be traded off against the benefits of reducing uncertainty. The benefits of reducing uncertainty consist in the reduced chance of accepting a false or rejecting a true hypothesis. There are no strict rules on how to optimize the trade-off, and people holding different values will differ in their assessments. What is clear, however, is that a reasonable trade-off will seldom entail an indefinite continuation of challenging the indirect support for a hypothesis.

These rules helped to resolve the smoking/lung cancer controversy fairly quickly. Researchers noted the parallel rise in cigarette consumption and lung cancer and began to investigate the relationship only in the 1930s. By the early 1950s, prospective studies were under way, and by the mid-1950s, a large part of the medical community was convinced of the carcinogenicity of cigarette smoke. Here are some of the facts that played a role in forming the consensus:

• The effect is massive. In 1956 Doll and Hill calculated that smokers of 25 or more cigarettes per day increased their odds of dying from lung cancer by a factor of about 24 (Doll and Hill 1956). By the end of the 1950s, data suggested that that factor could be as high as 60 (Cornfield et al. 1959).

• The manner and timing of the effect are hard to account for by other hypotheses. Lung cancer rates in the United States went up before they did in Canada, in parallel with the difference in smoking patterns between the two countries. This is hard to account for by a genetic modification. A genetic factor cannot account for the stopping effect. Other environmental factors cannot account for the sex differences in smoking behavior and cancer epidemiology. There is a large association between pipe and cigar smoking and cancer of the buccal cavity and larynx but not cancer of the lung.

• Many of the studies used came as close to epistemic bedrock as it gets. Gilliam (1955) effectively ruled out the 'diagnostic error' hypothesis by simply arranging mortality statistics according to age and sex. No mathematics or statistical technique was involved here other than taking simple averages.
• There is a widely shared norm that public health should address the fundamental causes of disease and aim to prevent adverse health outcomes. Few researchers working in cancer epidemiology in the 1950s were motivated primarily by a commitment to smokers' enjoyment or the profits of the tobacco industry.7 To form a consensus view concerning the dangers of cigarette smoke has costs in the form of reduced enjoyment (at least some smokers will give up in response), increased worry, the health consequences of increased worry, and the financial losses of all those involved in the production, marketing, and selling of cigarettes. These obtain quite independently of the truth of the hypothesis. If the hypothesis is true, but only if it is true, it has benefits in the form of a reduced health burden due to smoking. If there hadn't been a normative consensus on values—that the uncertain benefits outweigh the costs—it would have been a lot harder to form the epistemic consensus.

7. Naomi Oreskes and Erik Conway's 'merchants of doubt' target beliefs about the adverse health consequences of secondhand smoke, not smoking itself (see Oreskes and Conway 2010).

5. Warrant: Counting Eliminated Alternatives. The account of warrant I propose follows the EHC framework developed for support. Accordingly, a scientific hypothesis is warranted to the extent that (a) it has direct support and (b) alternative accounts of the direct support and indirect support have been eliminated. It is straightforward, then, to define different 'grades' of warrant. I propose to define four grades: proof, strong warrant, moderate warrant, and weak warrant. Table 1 shows how they are defined.

Table 1. Different grades of warrant

Grade  Name              Direct support plus indirect support that . . .
1      Proof             Eliminates all (relevant) alternative accounts
2      Strong warrant    Eliminates all salient alternative accounts and some that are nonsalient
3      Moderate warrant  Eliminates most alternatives, including some that are salient
4      Weak warrant      Eliminates some alternative accounts

Calling warrant of the highest grade 'proof' is consistent with the scientific use of the term. For example, as early as 1953 Richard Doll wrote about the smoking/lung cancer link, "The results [described in this paper] amount, I believe, to proof that smoking is a cause of bronchial carcinoma" (1953, 585). This concept of proof should, of course, not be confused with the mathematicians' and logicians' concept. In particular, to have proof for h does not entail that h must be true, given the support. It can always be the case that an alternative has been overlooked or that an alternative that has been eliminated should not have been. So the concept is one of empirical or inductive, not deductive, proof.8

8. An important issue concerns the (possible) existence of hitherto-unconceived alternatives (see Stanford 2006). Perhaps we shouldn't call a hypothesis proved if we have not ruled out all relevant alternatives, whether already put forward or as yet unconceived. I'm willing to bite the bullet with respect to this issue. Proof is a contextual matter, and if no one has been able to come up with a plausible alternative account, it is not relevant in the given context. This may, of course, lead to cases where a hypothesis comes out as proved, and yet the true hypothesis hasn't even been conceived yet. As long as everyone understands that proof is a contextual and fallible matter, this doesn't seem to be too problematic. What has been proved today can be revised tomorrow on the basis of new findings, new technologies, new ideas. This happens all the time.

The number of alternative accounts that have been eliminated is responsible for the strength of warrant. Salient alternatives, that is, alternatives for which there exists direct support, contribute more to the strength of the warrant than nonsalient alternatives because a true alternative is more likely to leave traces in the data than a false alternative. To have exactly three grades of warrant short of proof is, of course, arbitrary, but it is consistent with scientific practice, for instance, at the International Agency for Research on Cancer (see IARC 2006).
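The grading scheme of table 1 can be rendered schematically as follows (a sketch of my own; in particular, reading 'most' as 'more than half' is one gloss among others, not something the table settles):

```python
# Schematic rendering of table 1: the grade of warrant depends on which
# alternative accounts of the support have been eliminated.
def warrant_grade(relevant: set, salient: set, eliminated: set) -> str:
    """relevant: all relevant alternative accounts; salient: the subset
    with direct support of its own; eliminated: those ruled out by
    incompatible patterns in the data."""
    if relevant <= eliminated:
        return "proof"
    if salient <= eliminated and eliminated - salient:
        return "strong warrant"
    if len(eliminated) > len(relevant) / 2 and eliminated & salient:
        return "moderate warrant"
    if eliminated:
        return "weak warrant"
    return "no warrant"

alternatives = {"common cause", "selection bias", "diagnostic error"}
salient = {"common cause"}  # e.g., Fisher's constitutional hypothesis
print(warrant_grade(alternatives, salient, {"common cause", "selection bias"}))
# -> "strong warrant": all salient plus some nonsalient accounts eliminated
```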
6. Experiments, Instruments, and Pragmatism. I began this article by distinguishing two approaches to reasoning from evidence in the biomedical and social sciences: the experimentalist and the pragmatist. Now that I have articulated the latter, what can we say about the relation between the two?

The main difference, as I see it, is their mode of justification. Experimentalists are methodological foundationalists. They believe that some results are produced by methods that are intrinsically reliable and therefore epistemically basic. The epistemically basic method is a well-designed and well-executed randomized experiment.

Like other foundationalists, experimentalists have to address two fundamental issues (see Williams 2001, 85): First, how does one explain that the chosen kinds of methods are regarded as intrinsically reliable? Second, how can the success of other methods (those that are not intrinsically reliable) be explained with reference to the basic methods? One way to answer the first question is to underwrite the method with a theory of the nature of causality in such a way that the method can be shown to produce reliable results. Mill's view of causes as INUS conditions can be understood this way (as underwriting his methods of agreement and difference; see Mackie 1980, chap. 3 and appendix), and so can Woodward's theory of causation as invariance under intervention (as underwriting experiments, especially randomized experiments; see Woodward 2003). While I don't think that these are successful theories of causality (e.g., Reiss 2009a), let us, for the sake of the argument, suppose that these defenses work. What about the second issue?

The problem is that once experiments are regarded as epistemically basic because intrinsically reliable, it becomes very hard to explain why data produced by other methods should be able to support causal claims. If (ideal) experiments are the 'gold standard' of evidence, why are observational studies that record correlations among variables on which no intervention has taken place evidence at all? Why demographic trends? Why a study that records gender- and age-specific patterns in mortality data? To be sure, some methods resemble experiments. The definition of an instrumental variable in econometrics, for instance, is very similar to the definition of an 'intervention' in Woodward's theory (Reiss 2005). One can also show that randomization is an instrumental variable (Heckman 1996). However, most things resemble most other things in one respect or another, and experimentalism is silent about the kinds of respect in which a method must resemble an experiment in order to be able to support a causal claim.
It is silent, too, about the extent to which dissimilarities should be punished ("At what level of dissimilarity is a method mere silver standard, or bronze, and when is it rubbish?"). Evidence-based medicine and practice provide clumsy 'hierarchies of evidence' but no explanations of why a hierarchy should be thus and not otherwise.

The pragmatist theory of evidence proposed here has no trouble explaining the success of experiments and instrumental-variable studies. Both (well-designed) experiments and (well-designed) instrumental-variable studies are often reliable because they eliminate a host of alternatives—all alternative causal hypotheses—in one fell swoop. It can also explain failure when it occurs. Even a well-designed (natural or field) experiment can deliver botched results when variables are poorly measured, coding errors are made, computer programs are implemented sloppily, or data are inappropriately pooled. Experimentation as such cannot protect from these errors, and the theory proposed here highlights that all relevant alternatives (to the extent that cost-benefit considerations mandate it) should be eliminated.

One might counter at this point that the contrast I draw between experimentalism and pragmatism is exaggerated—that experimentalists are pragmatists at heart, except that they have a narrow(er) understanding of what good evidence is. I disagree, but I cannot give a full-blown empirical analysis of what scientists really believe here. Let me instead point out the following: (1) Even if this critic is right, the pragmatist alternative still needs to be articulated—and this is what this article has aimed to do. (2) The pragmatist theory proposed here can explain why randomized trials are successful where they are and describe the conditions under which they can be expected to be successful. Simply declaring them to be the 'gold standard' does not provide such an explanation. (3) The remaining, important difference is that there is no gold standard of evidence whatsoever in the proposed account. Whoever regards certain kinds of experiment as the standard of evidence takes the method used as the starting point. The account proposed here instead begins with the hypothesis and inquires about what kinds of facts we need to collect or learn in order to be entitled to infer the hypothesis. That these facts can sometimes be learned in an experiment is trivially true but, according to this account, contingent and of no deeper significance for the justification of inferences.

7. Conclusions. Let me end by returning to the questions concerning the pragmatist paradigm with which I began. The EHC framework presented here gives the following answers:

Relevance. How do we know what is to be included as evidence (in the sense of support) in the assessment of a hypothesis? There is no fully general answer. Evidence for a hypothesis is given by what we are entitled to expect under the supposition of the truth of the hypothesis. What we are entitled to expect is given by background knowledge about how the world works. Relevant to the assessment of a hypothesis is not only evidence for or against the hypothesis but also evidence for or against relevant alternative hypotheses.

Body of evidence. The body of evidence is given by the totality of the (direct and indirect) support for a hypothesis.
However, most things resemble most other things in one respect or another, and experimentalism is silent about the kinds of respect in which a method must resemble an experiment in order to be able to support a causal claim. It is silent, too, about the extent to which dissimilarities should be punished (at what level of dissimilarity is a method merely silver standard, or bronze, and when is it rubbish?). Evidence-based medicine and practice provide clumsy ‘hierarchies of evidence’ but no explanation of why a hierarchy should be thus and not otherwise.

The pragmatist theory of evidence proposed here has no trouble explaining the success of experiments and instrumental-variable studies. Both (well-designed) experiments and (well-designed) instrumental-variable studies are often reliable because they eliminate a host of alternatives—all alternative causal hypotheses—in one fell swoop. It can also explain failure when it occurs. Even a well-designed (natural or field) experiment can deliver botched results when variables are poorly measured, coding errors are made, computer programs are implemented sloppily, or data are inappropriately pooled. Experimentation as such cannot protect against these errors, and the theory proposed here highlights that all relevant alternatives should be eliminated, to the extent that cost-benefit considerations mandate it.

One might counter at this point that the contrast I draw between experimentalism and pragmatism is exaggerated—that experimentalists are pragmatists at heart, except that they have a narrower understanding of what good evidence is. I disagree, but I cannot give a full-blown empirical analysis of what scientists really believe here. Let me instead point out the following: (1) Even if this critic is right, the pragmatist alternative still needs to be articulated, and this is what this article has aimed to do. (2) The pragmatist theory proposed here can explain why randomized trials are successful where they are and describe the conditions under which they can be expected to be successful. Simply declaring them to be the ‘gold standard’ does not provide such an explanation. (3) The remaining, important difference is that there is no gold standard of evidence whatsoever in the proposed account. Whoever regards certain kinds of experiment as the standard of evidence takes the method used as the starting point. The account proposed here instead begins with the hypothesis and asks what kinds of facts we need to collect or learn in order to be entitled to infer it. That these facts can sometimes be learned in an experiment is trivially true but, according to this account, contingent and of no deeper significance for the justification of inferences.

7. Conclusions. Let me end by returning to the questions concerning the pragmatist paradigm with which I began. The EHC framework presented here gives the following answers:

Relevance. How do we know what is to be included as evidence (in the sense of support) in the assessment of a hypothesis? There is no fully general answer. Evidence for a hypothesis is given by what we are entitled to expect under the supposition of the truth of the hypothesis. What we are entitled to expect is given by background knowledge about how the world works. Relevant to the assessment of a hypothesis is not only evidence for or against the hypothesis itself but also evidence for or against relevant alternative hypotheses.

Body of evidence. The body of evidence is given by the totality of the (direct and indirect) support for a hypothesis. To what extent a hypothesis is warranted is determined on the basis of the totality of its support.

Diversity of evidence. The body of evidence for a hypothesis is diverse in two senses. First, there is the distinction between direct and indirect support: a hypothesis cannot be warranted unless supported by both. Second, the indirect support, which is used to eliminate alternative hypotheses, will be as diverse as the alternative hypotheses it helps to eliminate. Showing that the existence of a common cause is unlikely is quite a different thing from showing that variables have been measured correctly, or that peer review keeps the chance of coding or programming errors low.

Pragmatic criteria. Pragmatic criteria for addressing these issues have been discussed throughout section 4 (for instance, that knowledge about correlations; changes under intervention; necessary, sufficient, or INUS conditions; and processes/mechanisms constitutes direct support for causal hypotheses; what kinds of alternative hypotheses to consider; that researchers are entitled to rely on other researchers’ results unless there are good reasons to think that there is no such entitlement; and so on).

This article aims to contribute to a growing body of literature on evidence in the social and biomedical sciences. Unlike the earlier literature in the Carnapian and Bayesian traditions, the more recent work takes scientific practice a lot more seriously, both in its greater use of knowledge about the conditions under which science is practiced and in its goal of developing insights that are relevant to practicing scientists. The specific contribution I hope to make is to provide a realistic framework—a framework that applies to epistemic conditions that are nonideal—for thinking about evidence across the biomedical and social sciences, within which more specific questions, such as whether both mechanistic evidence and probabilistic evidence are required to establish a causal hypothesis (e.g., Russo and Williamson 2007), what the role of basic science is in evidence-based medicine (e.g., La Caze 2011), or how to interpret hierarchies of evidence (Borgerson 2009), can be addressed fruitfully. Whether the framework delivers on this promise is, alas, a matter for future research.

REFERENCES

Ayer, Alfred. 1936/1971. Language, Truth and Logic. Repr. London: Penguin.
Biddle, Justin. 2007. “Lessons from the Vioxx Debacle: What the Privatization of Science Can Teach Us about Social Epistemology.” Social Epistemology 21 (1): 21–39.
Borgerson, Kirstin. 2009. “Valuing Evidence: Bias and the Evidence Hierarchy of Evidence-Based Medicine.” Perspectives in Biology and Medicine 52 (2): 218–33.
Brandom, Robert. 1994. Making It Explicit: Reasoning, Representing, and Discursive Commitment. Cambridge, MA: Harvard University Press.
Cartwright, Nancy. 1999. The Dappled World. Cambridge: Cambridge University Press.
———. 2007. “Are RCTs the Gold Standard?” BioSocieties 2 (2): 11–20.
Cornfield, Jerome, William Haenszel, Cuyler Hammond, Abraham Lilienfeld, Michael Shimkin, and Ernst Wynder. 1959. “Smoking and Lung Cancer: Recent Evidence and a Discussion of Some Questions.” Journal of the National Cancer Institute 22:173–203.
Doll, Richard. 1953. “Bronchial Carcinoma: Incidence and Aetiology.” British Medical Journal 2 (4836): 585–90.
Doll, Richard, and Austin Bradford Hill. 1956.
“Lung Cancer and Other Causes of Death in Relation to Smoking: A Second Report on the Mortality of British Doctors.” British Medical Journal 2 (5001): 1071–81.
Eysenck, Hans, Mollie Tarrant, Myra Woolf, and L. England. 1960. “Smoking and Personality.” British Medical Journal 1 (5184): 1456–60.
Fisher, Ronald A. 1958. “Cancer and Smoking.” Nature 182:596.
Gilliam, Alexander. 1955. “Trends of Mortality Attributed to Carcinoma of the Lung: Possible Effects of Faulty Certification of Deaths due to Other Respiratory Diseases.” Cancer 8:1130–36.
Glymour, Clark. 1980. “Discussion: Hypothetico-Deductivism Is Hopeless.” Philosophy of Science 47:322–25.
Goldman, Alvin. 1976. “Discrimination and Perceptual Knowledge.” Journal of Philosophy 73 (20): 771–91.
Hardwig, John. 1991. “The Role of Trust in Knowledge.” Journal of Philosophy 88 (12): 693–708.
Heckman, James. 1996. “Randomization as an Instrumental Variable.” Review of Economics and Statistics 78 (2): 336–41.
Hempel, Carl. 1966. The Philosophy of Natural Science. Upper Saddle River, NJ: Prentice Hall.
Hoover, Kevin. 2001. Causality in Macroeconomics. Cambridge: Cambridge University Press.
———. 2003. “Nonstationary Time-Series, Cointegration, and the Principle of the Common Cause.” British Journal for the Philosophy of Science 54:527–51.
IARC. 2006. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Preamble. Lyon: International Agency for Research on Cancer.
Ioannidis, John. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124.
La Caze, Adam. 2011. “The Role of Basic Science in Evidence-Based Medicine.” Biology and Philosophy 26 (1): 81–98.
Lucas, Robert. 1976. “Econometric Policy Evaluation: A Critique.” Carnegie-Rochester Conference Series on Public Policy 1:19–46.
Mackie, John. 1980. The Cement of the Universe: A Study of Causation. Oxford: Oxford University Press.
Mayo, Deborah. 1996. Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.
Norton, John. 2003. “A Material Theory of Induction.” Philosophy of Science 70 (4): 647–70.
———. 2011. “Challenges to Bayesian Confirmation Theory.” In Philosophy of Statistics, ed. P. Bandyopadhyay and M. Forster. Dordrecht: Elsevier.
Oreskes, Naomi, and Erik Conway. 2010. Merchants of Doubt. New York: Bloomsbury.
Parascandola, Mark. 2004. “Two Approaches to Etiology: The Debate over Smoking and Lung Cancer in the 1950s.” Endeavour 28 (2): 81–86.
Pearl, Judea. 2000. Causality: Models, Reasoning, and Inference. Cambridge: Cambridge University Press.
Popper, Karl. 1963. Conjectures and Refutations. London: Routledge.
Reiss, Julian. 2005. “Causal Instrumental Variables and Interventions.” Philosophy of Science 72 (Proceedings): 964–76.
———. 2007. “Time Series, Nonsense Correlations and the Principle of the Common Cause.” In Causality and Probability in the Sciences, ed. F. Russo and J. Williamson, 179–96. London: College Publications.
———. 2008. Error in Economics: Towards a More Evidence-Based Methodology. London: Routledge.
———. 2009a. “Causation in the Social Sciences: Evidence, Inference, Purpose.” Philosophy of the Social Sciences 39 (1): 20–40.
———. 2009b. “Counterfactuals, Thought Experiments and Singular Causal Analysis in History.” Philosophy of Science 76:712–23.
———. 2012a. “Causation in the Sciences: An Inferentialist Account.” Studies in History and Philosophy of Biology and Biomedical Science 43 (4): 769–77.
———. 2012b.
“Counterfactuals.” In Oxford Handbook of the Philosophy of Social Science, ed. H. Kincaid, 154–83. Oxford: Oxford University Press.
———. 2014. “Struggling over the Soul of Economics: Objectivity versus Expertise.” In Experts and Consensus in Social Science, ed. C. Martini and M. Boumans. Cham: Springer.
Rescher, Nicholas. 1958. “A Theory of Evidence.” Philosophy of Science 25 (1): 83–94.
Russo, Federica, and Jon Williamson. 2007. “Interpreting Causality in the Health Sciences.” International Studies in the Philosophy of Science 21 (2): 157–70.
Salmon, Wesley. 1975. “Confirmation and Relevance.” In Induction, Probability, and Confirmation, ed. G. Maxwell and R. Anderson, 3–36. Minneapolis: University of Minnesota Press.
Sober, Elliott. 2001. “Venetian Sea Levels, British Bread Prices, and the Principle of the Common Cause.” British Journal for the Philosophy of Science 52:331–46.
Spirtes, Peter, Clark Glymour, and Richard Scheines. 2000. Causation, Prediction, and Search. Cambridge, MA: MIT Press.
Stanford, P. Kyle. 2006. Exceeding Our Grasp: Science, History, and the Problem of Unconceived Alternatives. Oxford: Oxford University Press.
Vandenbroucke, Jan P. 1989. “Those Who Were Wrong.” American Journal of Epidemiology 130 (1): 3–5.
Williams, Michael. 2001. Problems of Knowledge. Oxford: Oxford University Press.
Woodward, James. 2003. Making Things Happen. Oxford: Oxford University Press.