

Laplace’s Demon and the
Adventures of His Apprentices

Roman Frigg, Seamus Bradley,

Hailiang Du, and Leonard A. Smith*†

The sensitive dependence on initial conditions (SDIC) associated with nonlinear models
imposes limitations on the models’ predictive power. We draw attention to an additional
limitation that has been underappreciated, namely, structural model error (SME). A model
has SME if the model dynamics differ from the dynamics in the target system. If a
nonlinear model has only the slightest SME, then its ability to generate decision-relevant
predictions is compromised. Given a perfect model, we can take the effects of SDIC into
account by substituting probabilistic predictions for point predictions. This route is
foreclosed in the case of SME, which puts us in a worse epistemic situation than SDIC.

1. Introduction. The sensitive dependence on initial conditions (SDIC)
associated with nonlinear models imposes limitations on the models’ predictive
power. These limitations have been widely recognized and extensively
discussed.1 In this article we draw attention to an additional problem
that has been underappreciated, namely, structural model error (SME). A
model has SME if the model dynamics differ from the dynamics in the
target system. The central claim of this article is that if a nonlinear model
has only the slightest SME, then its ability to generate decision-relevant
probabilistic predictions is compromised. We will also show that SME in
fact puts us in a worse epistemic situation than SDIC. Given a perfect
model, we can take the effects of SDIC into account by substituting
probabilistic predictions for point predictions. This route is foreclosed in
the case of SME, which relegates both point predictions and accurate
probabilistic predictions to the sweet land of idle dreams.

Received December 2012; revised June 2013.

*To contact the authors, please write to: Roman Frigg, Department of Philosophy, Logic, and
Scientific Method, London School of Economics and Political Science; e-mail:
r.p.frigg@lse.ac.uk. Seamus Bradley, Munich Centre for Mathematical Philosophy,
Ludwig-Maximilians-Universität München; e-mail: seamus.bradley@lrz.uni-muenchen.de.
Hailiang Du, Centre for the Analysis of Time Series, London School of Economics and
Political Science; e-mail: h.l.du@lse.ac.uk. Leonard A. Smith, Centre for the Analysis of
Time Series, London School of Economics and Political Science; e-mail: lenny@maths.ox.ac.uk.

†Work for this article has been supported by the London School of Economics’s Grantham
Research Institute on Climate Change and the Environment and the Centre for Climate Change
Economics and Policy funded by the Economics and Social Science Research Council and
Munich Re. Frigg acknowledges financial support from the Arts and Humanities Research
Council–funded Managing Severe Uncertainty project and grant FFI2012-37354 of the Spanish
Ministry of Science and Innovation (MICINN). Bradley’s research was supported by the
Alexander von Humboldt Foundation. Smith would also like to acknowledge continuing support
from Pembroke College, Oxford. We would like to thank Wendy Parker, David Stainforth,
Erica Thompson, and Charlotte Werndl for comments on earlier drafts and helpful discussions.

Philosophy of Science, 81 (January 2014) pp. 31–59. 0031-8248/2014/8101-0009$10.00
Copyright 2014 by the Philosophy of Science Association. All rights reserved.

To reach our conclusion, we retell the tale of Laplace’s demon, but with a
twist. In our rendering of the tale, the Demon has two apprentices, a Senior
Apprentice and a Freshman Apprentice. The abilities of the apprentices fall
short of the Demon’s in ways that turn them into explorers of SDIC and
SME. By assumption, the Demon can compute the unabridged truth about
everything; comparing his predictions with those of the apprentices will
reveal the ways in which SDIC and SME curtail our predictive abilities.2

In section 2 we introduce our three protagonists as well as basic elements
of dynamical systems theory, which provides the theoretical backdrop
against which our story is told. In section 3 we follow the apprentices on
various adventures that show how predictions break down in the presence
of SME. In section 4 we provide a general mathematical argument for our
conclusion, thereby defusing worries that the results in section 3 are idio-
syncrasies of our example and that they therefore fail to carry over to other
nonlinear models. In section 5 we briefly discuss a number of scientific
modeling endeavors whose success is threatened by problems with SME,
which counters the charge that our analysis of SME is philosophical hair-
splitting without scientific relevance. In section 6 we suggest a way of em-
bracing the problem, and in section 7 we draw some general conclusions.

2. The Demon and His Apprentices. Laplace (1814) invites us to consider
a supreme intelligence who is able both to identify all basic components of

1. For a discussion of the unpredictability associated with nonlinear systems, see Werndl
(2009) and references therein. For discussions of chaos more generally, see, e.g., Smith
(1992, 1998, 2007), Batterman (1993), and Kellert (1993).
2. In other tellings of the tale, we have referred to this triad as the Demon, his
Apprentice, and the Novice; the impact of chaos on the Demon is discussed in Smith (1992),
and his Apprentice was introduced in Smith (2007). Of course, if the universe is in fact
stochastic, then the Demon will make perfect probability forecasts and appears rather
similar to I. J. Good’s Infinite Rational Org. In a deterministic universe, it is the (senior)
Apprentice who shares the similarity of perfect probabilistic forecasts.



nature and the forces acting between them and to observe these compo-
nents’ initial conditions. On the basis of this information, the Demon knows
the deterministic equations of motion of the world and uses his unlimited
computational power to solve them. The solutions of the equations of mo-
tion together with the initial conditions tell him everything he wants to
know so that “nothing would be uncertain and the future, as the past, would
be present to [his] eyes” (4). This operationally omniscient creature is now
known as Laplace’s Demon.

Let us introduce some formal apparatus in order to give a precise state-
ment of the Demon’s capabilities. In order to predict the future, the Demon
possesses a mathematical model of the world. It is part of Laplace’s original
scenario that the model is a model of the entire world. However, nothing in
what follows depends on the model being global in this sense, and so we
consider a scenario in which the Demon predicts the behavior of a partic-
ular part or aspect of the world. In line with much of the literature on
modeling, we refer to this part or aspect of the world as the target system.
Mathematically modeling a target system amounts to introducing a dynamical
system, $(X, f_t, m)$, which represents that target system. As indicated
by the notation, a dynamical system consists of three elements. The first
element, the set $X$, is the system’s state space, whose elements represent states
of the target system. The second element, $f_t$, is a family of functions mapping $X$
onto itself, which is known as the time evolution: if the system is in state
$x_0 \in X$ at time $t = 0$, then it is in $y = f_t(x_0)$ at some later time $t$. The state
$x_0$ is called the system’s initial condition. In what follows we assume that $f_t$ is
deterministic.3 For this reason, calculating $y = f_t(x_0)$ for some future time $t$
and a given initial condition is making a point prediction. In the dynamical
systems we are concerned with in this article, the time evolution of a system
is generated by the repeated application of a map $U$ at discrete time steps:
$f_t = U^t$, for $t = 0, 1, 2, \ldots$,4 where $U^t$ is the result of applying $U$ $t$ times.
The third element, $m$, is the system’s measure, allowing us to say that parts of
$X$ have certain sizes.
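The relation between the map and the time evolution can be made concrete with a minimal Python sketch. The function name `iterate` and the placeholder map `halve` are illustrative assumptions introduced here, not part of the original text; any deterministic map on the state space could be substituted.

```python
from typing import Callable


def iterate(U: Callable[[float], float], x0: float, t: int) -> float:
    """Point prediction y = f_t(x0), where f_t = U^t (U applied t times)."""
    x = x0
    for _ in range(t):
        x = U(x)
    return x


def halve(x: float) -> float:
    """Hypothetical placeholder map, chosen only because it is easy to check."""
    return 0.5 * x


if __name__ == "__main__":
    # Three applications of the placeholder map to the initial condition 1.0.
    print(iterate(halve, 1.0, 3))
```

Note that `iterate(U, x0, 0)` returns `x0` unchanged, matching the convention that the time evolution at $t = 0$ is the identity.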

With this in place, we can describe Laplace’s Demon as a creature with
the following capabilities:

1. Computational Omniscience: he is able to calculate $y = f_t(x)$ for all $t$
and for any $x$ arbitrarily fast.

2. Dynamical Omniscience: he is able to formulate the true time evolution
$f_t$ of the target system.

3. In fact, it suffices for $f_t$ to be forward deterministic; see Earman (1986, chap. 2).
4. This is a common assumption. For an introduction to dynamical systems, see Arnold
and Avez (1968).



3. Observational Omniscience: he is able to determine the true initial
condition $x_0$ of the target system.

If these conditions were met, the Demon could compute the future with
certainty. Laplace is quick to point out that the human mind “will always
remain infinitely removed” from the Demon’s intelligence, of which it
offers only a “feeble idea” (1814, 4). The question then is what these shortcomings
are and how they affect our predictive abilities. It is a curious fact
that while the failure of computational and observational omniscience has
been discussed extensively, relatively little has been said about how not
being dynamically omniscient affects our predictive abilities.5 The aim of
this article is to fill this gap.

To aid our explorations, we provide the Demon with two apprentices—
the Senior Apprentice and the Freshman Apprentice. Like the master, both
apprentices are computationally omniscient. The Demon has shared the gift
of dynamical omniscience with the Senior Apprentice: they both have the
perfect model. But the Demon has not granted the Senior observational om-
niscience: she has only noisy observations and can specify the system’s ini-
tial condition only within a certain margin of error. The Freshman has not
yet been granted either observational or dynamical omniscience: he has nei-
ther a perfect model nor precise observations.

Both apprentices are aware of their limitations and come up with cop-
ing strategies. They have read Poincaré and Lorenz, and they know that a
chaotic system’s time evolution exhibits SDIC: even arbitrarily close initial
conditions will follow very different trajectories. This effect, also known as
the butterfly effect, makes it misinformative to calculate $y = f_t(z_0)$ for an
approximate initial condition $z_0$ because even if $z_0$ is arbitrarily close to the
true initial condition $x_0$, $f_t(z_0)$ and $f_t(x_0)$ will eventually differ significantly.
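A small numerical sketch can illustrate this divergence. It assumes, purely for illustration, that the chaotic map is the logistic map used later in the article; the initial condition 0.3 and the offset of 1e-10 are hypothetical values chosen for the example.

```python
def logistic(r: float) -> float:
    """Logistic map, assumed here only as an example of a chaotic map."""
    return 4.0 * r * (1.0 - r)


def separations(x0: float, z0: float, steps: int) -> list:
    """Return |f_t(x0) - f_t(z0)| for t = 0, 1, ..., steps."""
    x, z = x0, z0
    gaps = [abs(x - z)]
    for _ in range(steps):
        x, z = logistic(x), logistic(z)
        gaps.append(abs(x - z))
    return gaps


if __name__ == "__main__":
    # Hypothetical true and approximate initial conditions, 1e-10 apart.
    gaps = separations(0.3, 0.3 + 1e-10, 60)
    for t in (0, 10, 20, 30, 40, 50, 60):
        print(t, gaps[t])
```

The printed separations start at the size of the initial offset and grow until they are of the same order as the state space itself, at which point the approximate trajectory carries no information about the true one.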

To account for their limited knowledge about initial conditions, each apprentice
comes up with a probability distribution over relevant initial states,
which accounts for their observational uncertainty about the system’s initial
condition. Call such a distribution $p_0(x)$; the subscript indicates that
the distribution describes uncertainty in $x$ at $t = 0$.6 The relevant question
then is how initial probabilities change over the course of time. To answer
this question, they use $f_t$ to evolve $p_0(x)$ forward in time (i.e., to calculate
$p_t(x)$). We use square brackets to indicate that $f_t[p_0(x)]$ is the forward time
image of $p_0(x)$. The time evolution of the distribution is given by the
Frobenius-Perron operator (Berger 2001, 126–27). If the time evolution is
one-to-one, this operator reduces to $p_t(x) = p_0(f_{-t}(x))$.

5. See, however, Smith (2002) and McWilliams (2007).
6. Our argument does not trade on the specific form of $p_0(x)$; we assume $p_0(x)$ is ideal
given the information available.



The idea is simple and striking: if $p_0(x)$ provides them with the probability
of finding the system’s state at a particular place in $X$ at $t = 0$, then
$p_t(x)$ is the probability of finding the system’s state at a particular place at
any later time $t$. And the apprentices do not only make the (trivial) statement
that $p_t(x)$ is a probability distribution in a purely formal sense of being
an object that satisfies the mathematical axioms of probability; they are
committed to the (nontrivial) claim that the probabilities are decision relevant.
In other words, the apprentices take $p_t(x)$ to provide us with predictions
about the future of sufficient quality that we ought to place bets,
set insurance premiums, or make public policy decisions according to the
probabilities given to us by $p_t(x)$.

This solves the Senior Apprentice’s problem, but the Freshman has a
further obstacle to overcome: the fact that his model has a structural model
error (SME). We face an SME when the model’s functional form is relevantly
different from that of the true system. In technical terms, by SME
we mean the condition in which the dynamical equations of the model differ
from the true equations describing the system under study: in some cases
we can write $f^M_t = f^T_t + d_t$, where $f^M_t$ is the dynamics of the model,
$f^T_t$ is the true dynamics of the system, and $d_t$ is the difference between the two.7

The Freshman’s solution to this problem is to adopt what he calls the
closeness-to-goodness link. The leading idea behind this link is the maxim
that a model that is close enough to the truth will produce predictions that
are close enough to what actually happens to be good enough for a certain
predictive task. Given that we consider time evolutions that are generated
by the iterative application of a map, this idea can be made precise as
follows. Let $U_T$ be the Demon’s map (where the subscript $T$ stands for
‘True’, as the Demon has the true model), and let $U_F$ be the Freshman’s
approximate time evolution. Then $D_U := U_T - U_F$ is the difference between
the two maps, assuming they share the same state space. Furthermore, let
$p^T_t(x)$ be probabilities obtained under the true time evolution (where
$f^T_t = U^t_T$), and $p^F_t(x)$ the probabilities that result from the approximate time
evolution (where $f^F_t = U^t_F$); $D_p(x, t)$ is the difference between the two. The
closeness-to-goodness link says that if $D_U$ is small, then $D_p(x, t)$ is small too
for all times $t$, presupposing an appropriate notion of being small. The
notion of being small can be explained in different ways without altering the
7. Note that this equation assumes that the model and the system share the same state
space, that is, that they are subtractable (see Smith 2006). They need not be. Also note
that SME contrasts with parameter uncertainty, where the model shares the true system’s
mathematical structure, yet the true values of certain parameters are uncertain in the
model. Parameters may be uncertain when the mathematical structure is perfect, but they
are indeterminate given SME: no set of parameter values will suffice to perfect the
model.



conclusion. Below we quantify $D_U$ in terms of the maximal one-step error
and $D_p(x, t)$ in terms of the relative entropy of the two distributions.

3. The Apprentices’ Adventures. The Demon schedules a tutorial. The Se-
nior Apprentice claims that while her inability to identify the true initial
condition prevents her from making valid point predictions, her probability
forecasts are good in the sense that, conditioned on the information the Demon
allows her (specifically her initial probability distribution $p_0(x)$), she
is able to produce a decision-relevant distribution $p_t(x)$ for all later times $t$.
The Freshman does not want to play second fiddle and ventures the bold
claim that dynamical omniscience is as unnecessary as observational om-
niscience and that he can achieve the decision relevance using an imperfect
model and the closeness-to-goodness link.

The all-knowing Demon requires them to put their skills to test in a
concrete situation in ecology: the evolution over time of a population of
rapidly reproducing fish in a pond. To this end, they agree to introduce the
population density ratio $r_t$: the number of fish per cubic meter at time $t$
divided by the maximum number of fish the pond could accommodate per
cubic meter. Hence $r_t$ lies in the unit interval $[0, 1]$. Then they go away and
study the situation.

After a while they reconvene and compare notes. The Freshman suggests
that the dynamics of the system can be modeled successfully with the well-
known logistic map:

$$r_{t+1} = 4 r_t (1 - r_t), \tag{1}$$

where the difference between times $t$ and $t + 1$ is a generation (which, for
ease of presentation, we assume to be 1 week). Recall from section 2 that a
dynamical system is a tripartite entity consisting of a state space $X$, a
time evolution operator $f_t$ (where $f_t = U^t$ if the time evolution is generated
by the repeated application of a map $U$ at discrete time steps), and a
measure $m$. The Freshman’s model is a dynamical system that consists of
the state space $X = [0, 1]$; his time evolution $f^F_t$ is generated by iteratively
applying $4 r_t (1 - r_t)$, which is $U_F$; $m$ is the standard Lebesgue measure on
$[0, 1]$.
The Demon and the Senior Apprentice know the true dynamical law for $r_t$:

$$\tilde{r}_{t+1} = (1 - \varepsilon)\, 4 \tilde{r}_t (1 - \tilde{r}_t) + \varepsilon\, \frac{16}{5} \left[ \tilde{r}_t \left( 1 - 2 \tilde{r}_t^{2} + \tilde{r}_t^{3} \right) \right], \tag{2}$$

where $\varepsilon$ is a small parameter. The tilde notation is introduced and justified
in Smith (2002). The right-hand side of equation (2), which we call the
quartic map, is $U_T$; applying $U_T$ iteratively yields $f^T_t$.



It is immediately clear that the Freshman’s model lacks a small structural
perturbation: as $\varepsilon \to 0$ the Demon’s map converges toward the Freshman’s.
Figure 1 shows both $U_T$ and $U_F$ for $\varepsilon = 0.1$, illustrating how small the
difference between the two is.

We now associate $D_U$ with $f^F_t$’s one-step error: the maximum difference
between $f^F_t(x)$ and $f^T_t(x)$ after a single time step, for $x$ ranging over the
entire $X$. The maximum one-step error of the model is $5 \times 10^{-3}$ at
$x = 0.85344$, where $r_{t+1} = 0.50031$ and $\tilde{r}_{t+1} = 0.49531$, and hence it is
reasonable to say that $D_U$ is small. Applying the closeness-to-goodness link, the
Freshman now expects $D_p(x, t)$ to be small too. That is, starting with the same
initial probability distribution $p_0(x)$, he would expect $p^T_t(x)$ and $p^F_t(x)$
to be at least broadly similar. We will now see that the Freshman is mistaken.
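The one-step error can be checked numerically. The following is a minimal sketch, assuming equations (1) and (2) with $\varepsilon = 0.1$ and a simple grid search over the unit interval; the function names and the grid resolution are choices made for this illustration. Because both maps are symmetric about 1/2, the search may report a maximizer near 0.146 rather than its mirror image near 0.853.

```python
import numpy as np

EPS = 0.1  # value of epsilon used for figure 1 in the text


def U_F(r):
    """Freshman's map: the logistic map of equation (1)."""
    return 4.0 * r * (1.0 - r)


def U_T(r, eps=EPS):
    """Demon's map: the quartic map of equation (2)."""
    return (1.0 - eps) * 4.0 * r * (1.0 - r) + eps * (16.0 / 5.0) * r * (
        1.0 - 2.0 * r**2 + r**3
    )


def max_one_step_error(n_grid=100_001):
    """Largest |U_F(x) - U_T(x)| on an evenly spaced grid over [0, 1]."""
    x = np.linspace(0.0, 1.0, n_grid)
    err = np.abs(U_F(x) - U_T(x))
    i = int(np.argmax(err))
    return float(err[i]), float(x[i])


if __name__ == "__main__":
    err, where = max_one_step_error()
    print("maximum one-step error:", err, "at x =", where)
```

The error curve is very flat near its maximum, so the reported location is sensitive to the grid while the size of the error is not.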

Since it is impossible to calculate $p^T_t(x)$ and $p^F_t(x)$ with pencil and paper,
we resort to computer simulation. To this end, we partition X into 32 cells,
which, in this context, are referred to as bins. These bins are now the atoms
of our space for evaluating predictions: in what follows we calculate the

Figure 1. Equation (1) in dotted line and equation (2) in shaded line, with $r_t$ and
$\tilde{r}_t$ on the X-axis and $r_{t+1}$ and $\tilde{r}_{t+1}$ on the Y-axis. Color version
available as an online enhancement.



probabilities of the system’s state $x$ being in a certain bin. This is of course
not the same as calculating a continuous probability distribution, but since
nothing in what follows hangs on the difference between a continuous
distribution and one over bins, and for the sake of notational ease, we refrain
from introducing a new variable and take ‘$p^T_t(x)$’ and ‘$p^F_t(x)$’ to refer to
the probabilities of bins. Similarly, a computer cannot handle analytical
functions (or real numbers), and so we represent $p_0(x)$ by an ensemble of
1,024 points. We first draw a random initial condition (according to the
invariant measure of the logistic map). By assumption this is the true initial
condition of the system at $t = 0$, and it is designated by the cross in
figure 2a. We then choose 1,024 points consistent with the true initial
condition. These points form our ensemble, shown as a
distribution in figure 2a. Dividing the numbers on the Y-axis by 1,024 yields
an estimate of the probability for the system’s state to be in a particular bin.

Figure 2. Evolution of the initial probability distribution under the Freshman’s
approximate dynamics (black) and the Senior’s true dynamics (gray). The gray
cross marks the Demon’s evolution of the true initial condition; the black cross is
the Freshman’s evolution of the true initial condition. Y-axis in d is rescaled to make
the details more visible. Color version available as an online enhancement.



We now evolve all these points forward both under the Senior’s dynamics
(gray lines) and the Freshman’s dynamics (black lines). Figures 2b–2d
show how many points there are in each bin at $t = 2$, $t = 4$, and $t = 8$.
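The ensemble experiment can be sketched in a few lines of Python. The text does not say how the ensemble “consistent with the true initial condition” is generated, so this sketch assumes Gaussian noise with a hypothetical spread `SIGMA`; the random seed, the function names, and the clipping to the unit interval are likewise assumptions of this illustration and will not reproduce figure 2 exactly.

```python
import numpy as np

EPS = 0.1         # epsilon in the quartic map, as in figure 1
N_BINS = 32       # number of bins partitioning [0, 1]
N_MEMBERS = 1024  # ensemble size
SIGMA = 1e-3      # ASSUMPTION: spread of the initial ensemble (not given in the text)


def U_F(r):
    """Freshman's map, equation (1)."""
    return 4.0 * r * (1.0 - r)


def U_T(r, eps=EPS):
    """Demon's map, equation (2)."""
    return (1.0 - eps) * 4.0 * r * (1.0 - r) + eps * (16.0 / 5.0) * r * (
        1.0 - 2.0 * r**2 + r**3
    )


def evolve(U, points, t):
    """Apply the map U to every ensemble member t times."""
    points = np.asarray(points, dtype=float)
    for _ in range(t):
        # Clipping guards against floating-point overshoot just outside [0, 1].
        points = np.clip(U(points), 0.0, 1.0)
    return points


def bin_counts(points):
    """Number of ensemble members in each of the N_BINS equal-width bins."""
    counts, _ = np.histogram(points, bins=N_BINS, range=(0.0, 1.0))
    return counts


def make_ensemble(rng):
    # True initial condition drawn from the logistic map's invariant measure,
    # whose density is 1 / (pi * sqrt(x * (1 - x))).
    x_true = float(np.sin(0.5 * np.pi * rng.uniform()) ** 2)
    # ASSUMPTION: Gaussian observational noise around the true initial condition.
    members = np.clip(x_true + SIGMA * rng.standard_normal(N_MEMBERS), 0.0, 1.0)
    return x_true, members


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # hypothetical seed, for repeatability only
    x_true, ensemble = make_ensemble(rng)
    for t in (0, 2, 4, 8):
        p_F = bin_counts(evolve(U_F, ensemble, t)) / N_MEMBERS
        p_T = bin_counts(evolve(U_T, ensemble, t)) / N_MEMBERS
        print(t, "bins occupied (F, T):", int((p_F > 0).sum()), int((p_T > 0).sum()))
```

Dividing the bin counts by the ensemble size gives the bin probabilities that the two apprentices would report at each lead time.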

While the two distributions overlap relatively well after 2 and 4 weeks,
they are almost completely disjoint after 8 weeks. Hence, for this $x_0$ these
calculations show the failure of the closeness-to-goodness link: $D_U$ being
small does not imply that $D_p(x, t)$ is also small for all $t$. In fact, for $t = 8$,
$D_p(x, t)$ is as large as can be because there is no overlap at all between the
two distributions.8

Two important points emerge from this example. The first point is that
even though chaos undercuts point predictions, one can still make informa-
tive probabilistic predictions. The position of the gray cross is appropri-
ately reflected by the gray distribution at all times: the gray probability dis-
tribution remains maximally informative about the system’s state given the
information available.

The second and more unsettling point is that the ability to reliably make
decision-relevant probabilistic forecasts is lost if nonlinearity is combined
with SME. Even though the Freshman’s dynamics are very close to the De-
mon’s, his probabilities are off track: he regards events that do not happen
as very likely, while he regards what actually happens as very unlikely. So
his predictions here are worse than useless: they are fundamentally mis-
leading. Hence, simply moving an initial distribution forward in time under
the dynamics of a model (even a good one) need not yield decision-relevant
evidence. Even models that yield deep physical insight can produce disas-
trous probability forecasts. The fact that a small SME can destroy the utility
of a model’s predictions is called the hawkmoth effect.9 The effect illus-
trates that the closeness-to-goodness link fails.

This example shows that what truly limits our predictive ability is not
SDIC but SME. In other words, it is the hawkmoth effect rather than the
butterfly effect that decimates our capability to make decision-relevant
forecasts. We can mitigate the butterfly effect by replacing point
forecasts with probabilistic forecasts, but we have no comparable move
with force against the hawkmoth effect. And the situation does not change
in the long run. It is true that distributions will spread with time as
$t \to \infty$. As the distribution approaches the system’s natural measure it
becomes uninformative. But becoming uninformative and being misleading
are very different vices.

8. This notion is made precise in terms of relative entropy below.

9. Thompson (2013) introduced this term in analogy to the butterfly effect. The term
also emphasizes that SME yields a worse epistemic position than SDIC: hawkmoths are
better camouflaged and less photogenic than butterflies.



One could object that the presentation of our case is biased in various
ways. The first alleged bias is the choice of the particular initial distribu-
tion shown in figure 2a. This distribution, so the argument goes, has been
carefully chosen to drive our point home, but most other distributions would
not be misleading in such a way, and our result only shows that unexpected
results can occur every now and then but does not amount to a wholesale
rejection of the closeness-to-goodness link.

There is of course no denying that the above calculations rely on a
particular initial distribution, but that realization does not rehabilitate the
closeness-to-goodness link. We have repeated the same calculations with
2,048 different initial distributions (chosen randomly according to the natural
measure of the logistic map), and so we obtain 2,048 pairs of $p^T_t(x)$
and $p^F_t(x)$ for $t = 2$, $t = 4$, and $t = 8$.

So far we operated with an intuitive notion of the difference between
two distributions. But in order to analyze the 2,048 pairs of distributions, we
need a formal measure of the difference between two distributions. We
choose the so-called relative entropy:

$$S(p^F_t \mid p^T_t) := \int_0^1 p^F_t \ln\!\left( \frac{p^F_t}{p^T_t} \right) dx,$$

where ‘ln’ is the natural logarithm.10 The relative entropy provides a measure
for the overlap of two distributions. If the distributions overlap perfectly
($p^F_t$ equals $p^T_t$), their ratio is one, so the logarithm and hence the
entropy are zero; the more dissimilar the distributions, the higher the value of
$S(p^F_t \mid p^T_t)$. Hence, it is reasonable to consider $D_p(x, t) := S(p^F_t \mid p^T_t)$. Figure 3
shows a histogram of the relative entropy of our 2,048 distributions at $t = 8$.
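For binned distributions the integral becomes a sum over bins, which can be sketched as follows. The sketch takes the floor probability for empty bins from the article’s note on finite ensembles; applying the floor only to the true distribution, and treating empty bins of the Freshman’s distribution as contributing zero, are assumptions of this illustration rather than a statement of the authors’ exact procedure.

```python
import numpy as np


def relative_entropy(counts_F, counts_T, n_members=1024, n_bins=32):
    """Discrete relative entropy S(p_F | p_T) in nats, from bin counts."""
    floor = 1.0 / (n_members * n_bins)  # probability assigned to empty bins
    p_F = np.asarray(counts_F, dtype=float) / n_members
    p_T = np.asarray(counts_T, dtype=float) / n_members
    # ASSUMPTION: only empty bins of the true distribution are floored.
    p_T = np.where(p_T > 0.0, p_T, floor)
    # Bins with p_F = 0 contribute nothing (convention 0 * ln 0 = 0).
    occupied = p_F > 0.0
    return float(np.sum(p_F[occupied] * np.log(p_F[occupied] / p_T[occupied])))


if __name__ == "__main__":
    same = np.zeros(32, dtype=int)
    same[5] = 1024
    other = np.zeros(32, dtype=int)
    other[20] = 1024
    print(relative_entropy(same, same))   # identical distributions
    print(relative_entropy(same, other))  # completely disjoint distributions
```

With all ensemble members of each distribution in a single, different bin, the function returns $\ln(1{,}024 \times 32) \approx 10.4$ nats, which is the ceiling the article reports for these experiments.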

The histogram shows that the Freshman’s probabilities are in line with
the Senior’s only in about a quarter of the cases. Almost half of the dis-
tribution pairs have relative entropy 7 or more. The two distributions shown
in figure 2d have a relative entropy of 8.23.11 So our histogram shows that
at t 5 8 almost half of all distribution pairs are as disconnected as those
in figure 2d and, hence, are seriously misleading.

There is a temptation to respond that this does not show that probabilities
are useless; it only shows that we should not use these probabilities when
they are misleading. The problem with this suggestion is that outside our

10. In our case the integral becomes a sum over the bins of the partition. For a discussion
of relative entropy and information theory, see Curd and Thomas (1991).
11. Given that our ensemble is only finite, we assign the probability $1/(1{,}024 \times 32)$ to
any bin with no ensemble member at all. If that bin occurs, then the entropy would be
~10.4 nats. Hence, ~10.4 reflects the maximum value of the entropy that can be observed
in these experiments.



thought experiment we have no means to tell when that happens. The only
thing we have is the model, which we know to be imperfect in various ways.
Our tale shows that model probabilities and probabilities in the world can
separate dramatically, but we do not know where and when. In cases in
which we have no means of separating the good from bad cases,12 we had
better be on guard.

The second alleged bias is the use of an 8-week forecast: had we used a
different lead time, say 2 or 4 weeks, the Freshman’s endeavors would have
been successful because at $t = 4$ his distribution is close to the Senior’s.
Unfortunately this is insufficient: regularly getting the probability distribution
only slightly wrong is enough to face catastrophic consequences.

To see this, let us observe the Freshman’s next endeavor. Still not ac-
cepting the Demon’s evaluation, he opens the Pond Casino. The Pond Ca-
sino functions like a normal casino in that it offers bets at certain odds on

12. In the case of recurrent dynamics, we may have such means; see Smith (1992).

Figure 3. Histogram of the relative entropy of 2,048 pairs of distributions at $t = 8$.
Color version available as an online enhancement.



certain events, the difference being that the events on which punters can
place bets are not outcomes of the spinning of a roulette wheel but future
values of rt. The Freshman takes the above division of the unit interval into
32 bins, which are his basic events (similar to the slots of a roulette wheel),
and offers to take bets based on a four-step forecast. More specifically,
playing a ‘round’ in the Pond Casino at time $t$ amounts to placing a bet at $t$
on bin $B_i$, where the outcome is whether the system is in $B_i$ at $t + 4$. So if
you bet, say, on $B_{31}$ at $t = 3$, you win if $r_t$ is in $B_{31}$ at $t = 7$.

Had the Freshman offered bets on an eight-step forecast, one would
expect him to fail given that his probabilities at $t = 8$ are fundamentally
misleading. Given that his probabilities look close to the Senior’s at $t = 4$,
however, he holds the hope that he will do well.

What is the payout for a winning bet? Let $A$ be an event that can obtain
in whatever game is played in a casino. The odds $o(A)$ the casino offers on
$A$ are the ratio of payout to stake. If, for instance, the casino offers $o(A) = 2$
(‘two for one’), a punter who bets £1 on $A$ gets £2 back when $A$ obtains.
Within the context of standard probability theory, odds are usually taken
to be the reciprocals of probabilities: $o(A) = 1/p(A)$. When flipping an
unbiased coin, the probability for heads is 0.5, and if you bet £1 on heads
and win, you get £2 back.13 The Freshman follows this convention and takes
the reciprocals of $p^F_t(x)$ in a four-step forecast as his odds.
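The convention linking odds and probabilities can be written out as a short sketch. The function names are illustrative; `odds_to` follows the article’s note that odds-to give the ratio of net gain to stake.

```python
def odds_for(p: float) -> float:
    """Odds-for: total payout per unit stake, taken as the reciprocal 1 / p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 1.0 / p


def odds_to(p: float) -> float:
    """Odds-to: net gain per unit stake, (1 - p) / p."""
    return odds_for(p) - 1.0


def payout(stake: float, p: float) -> float:
    """Amount returned to a winning punter who staked `stake` at odds 1 / p."""
    return stake * odds_for(p)


if __name__ == "__main__":
    # The unbiased coin from the text: a winning 1 pound bet on heads.
    print(payout(1.0, 0.5))
```

An event the casino believes to be unlikely therefore carries long odds, which is exactly where a misjudged probability becomes expensive for the house.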

Now a group of nine punters enters the casino. Each has £1,000, and
they adopt a simple strategy. In every round, the first punter bets 10% of
his total wealth on events with probability in the interval $(1/2, 1]$. We call
this strategy fractional betting (with $f = 1/10$) for the probability interval
$(1/2, 1]$.14 The second punter does the same with events with probability
in $(1/4, 1/2]$, the third with events with probability in $(1/8, 1/4]$, and so on,
with $(1/16, 1/8]$, $(1/32, 1/16]$, $(1/64, 1/32]$, $(1/128, 1/64]$, $(1/256, 1/128]$, and
$[0, 1/256]$. The minimum bet the casino accepts is £1, so if a punter’s
wealth falls below £1 he is effectively broke and has to leave the game.
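The punters’ rules can be sketched as follows. The text does not say how a punter divides his stake when several bins fall into his probability interval in the same round, so this sketch assumes a single bet per round; the function names are likewise hypothetical.

```python
FRACTION = 0.1  # f = 1/10 of current wealth staked per round
MIN_BET = 1.0   # minimum bet accepted by the casino, in pounds


def punter_index(p: float) -> int:
    """Index of the punter whose probability interval contains p.

    0 -> (1/2, 1], 1 -> (1/4, 1/2], ..., 7 -> (1/256, 1/128], 8 -> [0, 1/256].
    """
    upper = 1.0
    for k in range(8):
        lower = upper / 2.0
        if lower < p <= upper:
            return k
        upper = lower
    return 8


def settle_bet(wealth: float, p_offered: float, won: bool, f: float = FRACTION) -> float:
    """Wealth after staking f * wealth at odds-for 1 / p_offered.

    ASSUMPTION: one bet per round; the text leaves the multi-bin case open.
    """
    stake = f * wealth
    wealth -= stake
    if won:
        wealth += stake / p_offered
    return wealth


def is_broke(wealth: float) -> bool:
    """A punter must leave once his wealth falls below the minimum bet."""
    return wealth < MIN_BET


if __name__ == "__main__":
    print(punter_index(0.1))               # which punter bets on a 10% event
    print(settle_bet(1000.0, 0.5, True))   # wealth after a winning bet
    print(settle_bet(1000.0, 0.5, False))  # wealth after a losing bet
```

Because the stake is a fixed fraction of current wealth, a punter’s wealth grows or shrinks multiplicatively, which is why small systematic errors in the offered odds compound over many rounds.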

Using the same initial distribution as above (shown in fig. 2a), the Pond
Casino now offers odds reflecting the Freshman’s probabilities. The out-
comes of bets are of course determined by the true dynamics. We now gen-
erate a string of outcomes based on the true dynamics and trace the punters’

13. We use so-called odds-for throughout this article. They give the ratio of total payout
to stake. Odds-to give the ratio of net gain to stake (net gain is the payout minus the
stake paid for the bet). Odds-for and odds-to are interdefinable: if the odds-for for an
event are $a/b$, then the odds-to are $(a - b)/b$. Since in this case odds-for are equal to
$1/p(A)$, the odds-to are $(1 - p(A))/p(A)$, which is equal to $p(\neg A)/p(A)$, where $\neg A$ is
‘not A’.

14. The argument does not depend on fractional betting, which we chose for its sim-
plicity. Our conclusions are robust in that they hold for other betting strategies.



wealth, which we display in figure 4 as a function of the number of rounds
played.

We see that the punters have the time of their lives. Three of them make
huge gains very soon, and a further four follow suit a bit later. After 2,500
rounds, seven out of nine punters have increased their wealth at least ten-
fold, while only two of them have gone bust. So the punters take a huge
amount of money off the casino.

There is a temptation to make the same move as above and argue that
this is a ‘bad luck event’ due to the particular initial distribution, which
should not be taken as indicative of the casino’s performance in general. We
counter in the same vein and consider again 2,048 randomly chosen ini-
tial probability distributions. For each of these we let the game take place
as before. If the above was a rare special event, then one would expect
to see different results in the other 2,047 runs. Since producing another
2,047 plots like the one seen in figure 4 is not a viable way to present the
outcomes, we assume that the casino starts with a capital of £1,000,000 and
calculate the time to bust. Figure 5 is a histogram of how the casino per-

Figure 4. Wealth of nine punters as a function of the number of rounds played.
Color version available as an online enhancement.



forms with our 2,048 different initial distributions. Once more the picture
is sobering. Most casinos go bust after just a few rounds, and the last one
goes out of business after 40 rounds. Offering odds based on $p^F_t(x)$ is
disastrous.

Recall that the punters betting against the apprentice are not using any
sophisticated strategy and have no extra knowledge to gain an advantage
over the house. They are not, for instance, keeping track of the past as clever
punters would (and indeed do in card-counting systems for games like
blackjack, whereby the bettor exploits the information contained in the past
sequence of cards). In such a scenario the bettor is using more informed
probabilities than the implied probabilities of the casino’s odds, and it is
indeed no surprise if the casino loses money against such bettors.

Our punters are not of this kind. They simply bet on the basis of the
values of the odds offered. One punter just bets on all events with implied
probabilities in the range $(1/16, 1/8]$. The information is entirely
symmetrical: the punters know nothing that the house does not know. Hence,
our worry is not just that the apprentice loses money: a punter with access
to the system probabilities could obviously do well against the house. Our
worry is that the house does disastrously even against punters who know
no more than the house.

Figure 5. Histogram of time to bust for 2,048 distributions. Color version available
as an online enhancement.

Frustrated with his failures, the Freshman cannot help himself and starts
peeping over the Demon’s shoulder to get the exact initial condition. He
convinces the Demon to repeat the entire casino adventure, but rather than
moving probability distributions forward in time, he now calculates the
trajectory of the true initial condition (which he gleans from the Demon)
under his dynamical law. This, he thinks, will guarantee him success. For
want of space we do not follow his further adventures in detail, and in fact
there is no need to. A look at figure 2 suffices to realize that he has set
himself up for yet another fiasco. The gray crosses in figure 2 are the true
time evolution of the true initial condition; the black crosses are the
Freshman’s time evolution of the true initial condition. We see that the
trajectories of the true initial condition under the two dynamical laws soon
become completely different, and any prediction generated with the model is,
once again, seriously misleading. So even if the Freshman were
observationally omniscient, he would not be able to generate decision-relevant
predictions. SME is a serious issue independently of SDIC. The moral is now
unavoidable: offering odds according to the probabilities of an imperfect
model can be disastrous even when information is entirely symmetrical between
all parties.
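
The Freshman's predicament is easy to reproduce numerically. The following Python sketch is an illustration under stated assumptions, not the system-model pair behind figure 2: it takes the logistic map with parameter 4 as the 'true' dynamics and a hypothetical model that mixes a small sine term into it. The names `true_map` and `model_map`, the form of the perturbation, its weight `eps`, and the initial condition are all chosen here for illustration. Both laws start from exactly the same initial condition, so any gap between the trajectories is due to structural model error alone.

```python
import math


def true_map(x):
    """Logistic map with parameter 4, assumed here as the 'true' dynamics."""
    return 4.0 * x * (1.0 - x)


def model_map(x, eps=0.01):
    """Hypothetical imperfect model: a small sine term is mixed into the true map."""
    return (1.0 - eps) * true_map(x) + eps * math.sin(math.pi * x)


def trajectory(step, x0, n_steps):
    """Iterate the one-step map `step` n_steps times, starting from x0."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(xs[-1]))
    return xs


x0 = 0.3  # the same (illustrative) initial condition for both dynamical laws
true_traj = trajectory(true_map, x0, 50)
model_traj = trajectory(model_map, x0, 50)
gaps = [abs(a - b) for a, b in zip(true_traj, model_traj)]

for t in (0, 1, 5, 10, 20):
    print(f"t={t:2d}  true={true_traj[t]:.4f}  model={model_traj[t]:.4f}  gap={gaps[t]:.4f}")
```

Printing the gap at a few lead times shows how a small structural difference is amplified; in this toy setting, shrinking `eps` should delay the divergence rather than prevent it.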

4. From Example to Generalization. An obvious line of criticism would
be to argue that the problems we describe are specific to the logistic map
and do not occur in other systems. So the question is, how general are the
effects we have discussed in the last section? To answer this question we
review a number of mathematical results about the structural stability of
dynamical systems. Our conclusion will be sobering. There are special cases in
which the above effects do not occur,15 but in general there are no such
assurances. Not only are there no general stability results; there are in fact
mathematical considerations suggesting that the effects we describe are
generic. So we urge a shift of the onus of proof: rather than assuming that
nonlinear models are structurally stable and asking the skeptic to make his
case, the default assumption ought to be that models are not structurally
stable and hence exhibit the effects we describe. Using a particular model
for predictive purposes therefore requires an argument to the effect that the
model is structurally stable.

Roughly speaking, a dynamical system is structurally stable if its
trajectories change only a little if the equation is changed only a little.
Andronov and Pontrjagin (1937) presented the first systematic study of
structural stability, providing both a definition of structural stability and a
theorem. They consider a two-dimensional system that is defined on a disk
$D^2$ in the plane with the equations $dx/dt = P(x, y)$ and
$dy/dt = Q(x, y)$. We obtain the perturbed system by adding a differentiable
function to each equation: $dx/dt = P(x, y) + p(x, y)$ and
$dy/dt = Q(x, y) + q(x, y)$. The original system is structurally stable if and
only if for any real number $\varepsilon > 0$ there is a real number
$\delta > 0$ such that there exists a smooth $\varepsilon$-homeomorphism
$h_\varepsilon : D^2 \to D^2$ that transforms the trajectories of the original
system into trajectories of the perturbed system. Being an
$\varepsilon$-homeomorphism means that whenever the absolute values of both
$p(x, y)$ and $q(x, y)$ as well as their first derivatives are $< \delta$, then
the homeomorphism moves each point in $D^2$ by less than $\varepsilon$.

15. Integrable Hamiltonian systems, which respect the Kolmogorov-Arnold-Moser
theorem, being one example with structural stability.

Given this definition of structural stability, Andronov and Pontrjagin
formulate a theorem saying that for a system of the above kind to be
structurally stable, it is necessary and sufficient that the following two
conditions be satisfied: (i) singularities and closed orbits are hyperbolic,
and (ii) there is no trajectory connecting saddle points. However, it turned
out that there were problems with their proof. A different proof was given by
Peixoto and Peixoto (1959).16 Peixoto (1962) went on to generalize the result
to flows on a compact two-dimensional manifold M. He showed that in the
space of all differentiable flows on orientable manifolds, structurally stable
systems are open and dense in that space relative to the $C^r$ topology. This
is often summarized in the slogan that structural stability is generic.

Two-dimensional flows, however, are rather special, which raises the
question of what the situation in higher dimensions is.17 While the definition
of structural stability carries over swiftly to higher dimensions, generalizing
Andronov and Pontrjagin’s theorem to higher dimensional spaces was a
formidable problem that turned into a research program spanning almost
half a century. The mathematical details cannot be reviewed here; we sketch
the main line of argument, which is sufficient for our purposes.

Smale (1967) formulated the so-called Axiom A, which essentially says
that the system is uniformly hyperbolic.18 The strong transversality
condition says that stable and unstable manifolds must intersect transversely
at every point. Palis and Smale (1970) conjectured that a system is
structurally stable if and only if it satisfies Axiom A and the strong
transversality condition. Proving this result turned out to require a concerted
effort and was brought to a conclusion by Mañé (1988) for diffeomorphisms
(‘maps’) and Hayashi (1997) for flows.

16. Their proof was based on a slightly different definition of structural stability than
the one given in the last paragraph, but it can be shown that the two definitions are
equivalent.

17. They are special not least because they cannot exhibit chaos (Barreira and Valls
2012, chap. 7).

18. For details, see Robinson (1976).

The relation between structural stability and the Demon scenario is
obvious: if the original system is the true dynamics, then the true dynamics
has to be structurally stable for the Freshman’s close-by model to yield
close-by results. This raises the question whether the systems we are
interested in satisfy the above conditions (and hence are structurally stable).
This question does not seem to be much discussed, but available results
suggest a negative conclusion. Smale (1966) showed that structural
stability is not generic in the class of diffeomorphisms on a manifold: the set
of structurally stable systems is open but not dense. So there are systems
that cannot be approximated by a structurally stable system. More recently,
Smith (2002) and Judd and Smith (2004) presented an argument for the
conclusion that if the model’s and the system’s dynamics are not identical,
then “no state of the model has a trajectory consistent with observations of
the system” (Judd and Smith 2004, 228). Consistency here is defined by
the observational noise in the measurements: it quickly becomes clear that
there is no model trajectory that could have produced the actual
observations; no model trajectory can shadow the measurements (Smith 2007).
This result holds under very general assumptions.

This has a direct consequence for situations like those considered in
sections 2 and 3. If the true dynamics is structurally unstable, then the
dynamics of a model with model error (no matter how small) will eventually
differ from the true dynamics, resulting in the same initial conditions
evolving differently under the two dynamical laws. Given this, we would expect
probability distributions like $p_0(x)$ to evolve differently under the two
dynamical laws, and we would expect $p_t^T(x)$ and $p_t^A(x)$ to have growing
relative entropy. We emphasize that these are plausibility arguments; to the
best of our knowledge there are no rigorous proofs of these propositions.
Plausibility arguments, however, are better than no arguments at all. And
there is certainly no hint of an argument to the effect that high-dimensional
systems are structurally stable. So the challenge stands: those using
nonlinear models for predictive purposes have to argue that the model they use
is one that is structurally stable, and this is not an easy task.
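
The expectation about relative entropy can be explored numerically, though a simulation is of course no substitute for the missing proofs. The sketch below is a toy illustration under assumptions of our own choosing: the logistic map with parameter 4 stands in for the true dynamics, a hypothetical sine-perturbed map (`model_map`, weight `eps`) stands in for the model, the initial distribution is a narrow uniform ensemble, and the densities are estimated with smoothed histograms. It reports the relative entropy of the true distribution with respect to the model distribution at each lead time; the direction of the comparison, the bin count, and the smoothing constant are all illustrative choices.

```python
import math


def true_map(x):
    """Logistic map with parameter 4, assumed here as the 'true' dynamics."""
    return 4.0 * x * (1.0 - x)


def model_map(x, eps=0.01):
    """Hypothetical imperfect model: a small sine term is mixed into the true map."""
    return (1.0 - eps) * true_map(x) + eps * math.sin(math.pi * x)


def histogram(points, n_bins):
    """Relative frequencies in n_bins equal bins on [0, 1], lightly smoothed."""
    counts = [0] * n_bins
    for x in points:
        counts[min(max(int(x * n_bins), 0), n_bins - 1)] += 1
    alpha = 0.5  # additive smoothing keeps every bin positive so the logarithm is defined
    total = len(points) + alpha * n_bins
    return [(c + alpha) / total for c in counts]


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p with respect to q, in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


n_members, n_bins, max_lead_time = 2000, 32, 8
start = [0.3 + 0.01 * k / (n_members - 1) for k in range(n_members)]
true_ens, model_ens = list(start), list(start)

entropy_by_lead_time = []
for t in range(max_lead_time + 1):
    p_true = histogram(true_ens, n_bins)
    p_model = histogram(model_ens, n_bins)
    entropy_by_lead_time.append(relative_entropy(p_true, p_model))
    # Move both ensembles one step forward under their respective dynamical laws.
    true_ens = [true_map(x) for x in true_ens]
    model_ens = [model_map(x) for x in model_ens]

for t, d in enumerate(entropy_by_lead_time):
    print(f"t={t}  relative entropy = {d:.4f} bits")
```

At lead time zero the two ensembles coincide, so the relative entropy is zero by construction; what happens afterward depends on the model-system pair and on the estimator, which is why the output is best read as suggestive only.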

5. Imperfect Models in Action. Our thought experiment has close real-world
cousins. In most scientific scenarios the truth is beyond our reach (if
such a thing even exists), and we have to rest content with imperfect
models; it is a well-rehearsed truism that all models are wrong. Scientists,
like the Freshman, are in the situation that they have to produce predictions
with a less than perfect model. Some of these predictions are then used to
assess the risk of future outcomes. In particular, insurers and policy makers
are like the owner of the Pond Casino: they have to set premiums or make
policies on the basis of imperfect model outcomes. Examples can be drawn
from domains as different as load forecasting in power systems (Fan and
Hyndman 2012), inventory demand management (Snyder, Ord, and Beaumont
2012), weather forecasting (Hagedorn and Smith 2009), and climate
modeling (McGuffie and Henderson-Sellers 2005).

But how can nonlinear models be so widely used if their predictive
power is as limited as we say it is? Are we overstating the case, or is science
embroiled in confusion? The truth, we think, lies somewhere in the mid-
dle. The limitations on prediction we draw attention to are debilitating for
mathematical precision but not for valuable insight. Hence, at least some
scientific projects would need to rethink their methodology in the light of
our discussion. A model can be an informative aid to understanding phe-
nomena and processes while at the same time being maladaptive if used
for quantitative prediction. As far as we can see, the question of whether the
hawkmoth effect threatens certain modeling projects has not yet attracted
much attention, and we would encourage those engaged in quantitative
prediction in the short run, and even qualitative prediction in the long run, to
lend more thought to the matter.19

Another challenge along the same lines argues for the opposite conclu-
sion: if we are interested in long-term behavior, we do not need detailed
predictions at all and can just study the natural measure of the dynamics.
The natural measure reflects a system’s long-term behavior after the initial
distribution ‘washes out’; it is therefore immaterial where we started. It then
does not matter that on a medium timescale the distributions look different
because we are simply not interested in them.

This view gains support from the fact that we seem to have revealed only
half of the truth in section 3. If we continue evolving the distribution for-
ward to higher lead times, we find that for this particular model-system pair
the two distributions start looking more similar again and, moreover, that
they start looking rather like the natural measure of the logistic map. This is
shown in figure 6 for $t = 16$ and $t = 32$. Perhaps if all we need is to make
reliable predictions in the long run, then the ‘medium term aberrations’ seen
in figure 2 need not concern us at all.

Again, while there is similarity in this case, it cannot be expected to
happen universally. Implicit in this proposal is the assumption that natural
measures of similar dynamical laws are similar, because unless the model
and the system have the same natural measures there is no reason to assume
that adjusting beliefs according to the natural measure of the model could be
informative. While figure 6 is suggestive for this model-system pair (and
even this would remain to be shown rigorously), there is every reason to
believe that in general natural measures do not have this property.
Furthermore, unlike the Demon’s pond, models of many real target systems,
such as the world’s climate system, are not stationary and do not have
invariant measures at all. This forecloses a response along the above lines.
Long-term quantitative prediction is difficult.

19. For model error in weather forecasting, see Orrell et al. (2001), while for climate
forecasting, see Smith (2002) and McWilliams (2007) and consider criticisms of
UKCP09 (Frigg, Smith, and Stainforth 2013). UKCP09 offers detailed high-resolution
probability forecasts across the United Kingdom out to the 2090s; the hawkmoth effect
poses a serious challenge for any rational applications of this particular predictive
endeavor. This fact casts no doubt on the reality or risks of anthropogenic climate change,
for which there is evidence both from basic physical science and observations.

Figure 6. Same scenario as in figure 2 but for lead times (a) $t = 16$ and (b) $t = 32$.
Color version available as an online enhancement.

How severe the problem is depends on how detailed the predictions one
wishes to make are. In general there is a trade-off between precision and
feasibility. In the above example it is trivially true that $r_t$ lies between
zero and one; we can reliably predict that $r_t$ will not fluctuate outside
those bounds (in the model). And there are certainly other general features of
the system’s behavior one can gain confidence in with experience. What we
cannot predict is that $r_t$ will assume a particular value $x \in X$ or will
lie in a relatively small area around $x$ at a particular point in time, nor
can we give probabilities for this to happen. Whether a project runs up against
problems with the hawkmoth effect depends on whether it tries to make
predictions of the latter kind.20

6. A Tentative Suggestion: Sustainable Odds. So far we have discussed
problems with imperfect models and pointed out that there is no easy fix.
One natural reaction would be to throw in the towel and conclude that the
best option would be not to use such models at all. This would be throw-
ing out the baby with the bathwater. Models often show us how things
work, and, as we have seen above, in some cases at least a model provides
some quantitative insight. So the question is, how can we use the infor-
mation in a model without being too dramatically misled?

This question has no easy answer because in real science we cannot
just peep over the Demon’s shoulder and compare our models with the
true dynamics: real scientists are like the Freshman without the Demon
(or infinite computer power). So what could the Freshman do to improve
his interpretation of model simulations without trying to turn into a
Demon (which he cannot)? To fail to grasp this nettle is to pretend he is the
Demon. In this section we make a tentative proposal to leave probabilism
behind and use nonprobability odds.

20. Space constraints again prevent us from engaging in detailed case studies. We note,
nevertheless, that UKCP09 aims to make exactly such predictions by forecasting, for
instance, the temperature on the hottest day in central London in 2080, and the project
is advertised as providing “daily time series of a number of climate variables from a
weather generator, for the future 30-yr time period, under three emission scenarios.
These are given at 5km resolutions across the UK, the Isle of Man and the Channel
Islands” (Jenkins et al. 2009, 8). Worries about the implications of the hawkmoth effect
are not just a hobbyhorse for academic philosophers.

As we noted above, the odds $o(E)$ on $E$ are the ratio of total payout
to stake. If there is a probability $p(E)$ for $E$, then fair odds on $E$ are
traditionally taken to be the reciprocals of the probabilities:
$o(E) = 1/p(E)$. This need not be so: we can just as well take odds as our
starting point and say that the longer the odds for an event $E$, the more
surprising it is if the event occurs. Odds thus understood do not necessarily
have any connection to probabilities. Let $a := \{E_1, \ldots, E_n\}$ be a
complete set of events,21 let $o(E_i)$, $i = 1, \ldots, n$, be the odds on all
the events in $a$, and define $s = \sum_{i=1}^{n} [1/o(E_i)]$. The odds are
probability odds only if $s = 1$; they are nonprobability odds otherwise.22
Furthermore, let us call $p(E_i) := 1/o(E_i)$ the betting quotients on $E_i$.
The $p$ are ‘probability-like’ in that they are numbers between zero and one,
with one indicating that the obtaining of an event is no surprise at all and
zero representing a complete surprise.
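
The bookkeeping behind these definitions is simple enough to check by hand, and a short Python sketch makes it explicit. The function names and the two sets of odds below are hypothetical and serve only to illustrate the quantity $s$; exact fractions are used so that the test $s = 1$ is not blurred by rounding.

```python
from fractions import Fraction


def betting_quotients(odds):
    """Betting quotient 1/o(E_i) for each event in a complete set of events."""
    return [1 / Fraction(o) for o in odds]


def quotient_sum(odds):
    """The quantity s: the sum of the betting quotients over the complete set."""
    return sum(betting_quotients(odds))


def sums_to_one(odds):
    """Check the condition s = 1 that probability odds have to satisfy."""
    return quotient_sum(odds) == 1


fair = [2, 4, 4]        # hypothetical odds: s = 1/2 + 1/4 + 1/4 = 1
shortened = [2, 3, 4]   # hypothetical odds: s = 1/2 + 1/3 + 1/4 = 13/12

print(quotient_sum(fair), sums_to_one(fair))            # 1 True
print(quotient_sum(shortened), sums_to_one(shortened))  # 13/12 False
```

In the second example one of the odds has been shortened from 4 to 3, which pushes $s$ above one and so yields nonprobability odds.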

With this in place, let us continue our thought experiment. The Freshman
wants to try to run a casino without going bust. From his last experience
he knows that using probability odds set according to $p_t^A(x)$ appears to be
a recipe for disaster. So he decides to shorten his odds to guard against
loss. Of course you can always guard against loss by not paying out any
net gain at all and merely returning the stake to punters when they win (i.e.,
by setting all $o(E_i) = 1$). This, however, is not interesting to punters, and
they would not play in his new casino. So the Freshman aims to offer a
game that is as attractive as possible, by offering odds that are as long as
possible, but only so long that he is unlikely to go bust unexpectedly.

There are different ways of shortening odds. Perhaps the simplest way is
to impose a threshold $v$ on the $p_t(E_i)$: $p_t(E_i) = p_t^F(E_i)$ if
$p_t^F(E_i) > v$, and $p_t(E_i) = v$ if $p_t^F(E_i) \le v$, where $v$ can be
any real number so that $0 \le v \le 1$. We call odds thus calculated threshold
odds. For the limiting case of $v = 0$ the $p_t(E_i)$ correspond to
probabilities, and the respective odds correspond to probabilistic odds. It is
important to emphasize that the threshold rule applies to all possible events
and not only the atoms of the partition, the idea being that one simply does
not offer $p$’s smaller than $v$ no matter what the event under consideration
is. In particular, the rule applies simultaneously to events and their
negation. If, for instance, we set $v = 0.2$ and have $p_t^F(E_i) = 0.95$ (and
hence, by the axioms of probability, $p_t^F(\neg E_i) = 0.05$), then
$p_t(E_i) = 0.95$ and $p_t(\neg E_i) = 0.2$, where $\neg E_i$ is the negation
of $E_i$ (i.e., the nonoccurrence of $E_i$).
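
The threshold rule can be written down in a few lines. The sketch below reproduces the numerical example just given ($v = 0.2$ and a model probability of 0.95); the function name `threshold_quotient` is a label introduced here for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def threshold_quotient(p_model, v):
    """Betting quotient under the threshold rule: never offer less than v."""
    if not 0 <= v <= 1:
        raise ValueError("the threshold v must lie between 0 and 1")
    return p_model if p_model > v else v


v = Fraction(1, 5)              # threshold 0.2, as in the example in the text
p_event = Fraction(19, 20)      # model probability 0.95 for E_i
p_negation = 1 - p_event        # 0.05 for the negation, by the axioms of probability

q_event = threshold_quotient(p_event, v)        # stays at 19/20
q_negation = threshold_quotient(p_negation, v)  # raised from 1/20 to 1/5

print(q_event, q_negation, q_event + q_negation)  # 19/20 1/5 23/20
print(1 / q_negation)                             # 5: the longest odds on offer
```

The quotients on the event and its negation now sum to 23/20 rather than one, and no odds longer than $1/v = 5$ are ever offered.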

This move is motivated by the following observation. In figure 2 we see
that, based on $p_t^F$, we sometimes offer very long odds on events that are in
reality (i.e., according to $p_t^T$) very likely to happen. It is with these
events that we run up huge losses. Putting a lower bound on the $p_t(E_i)$
amounts to limiting large odds and thus the amount one pays out for an actual
event that one’s model wrongly regarded as unlikely.

21. We only consider discrete and countable event spaces.

22. Nonprobability odds have been introduced in Judd (2007) and Smith (2007).

We now repeat the scenario of figure 4 with one exception: the Freshman
Apprentice now offers nonprobability odds with thresholds of $v = 0.05$,
$v = 0.1$, and $v = 0.2$. The results of these calculations are shown in
figures 7a, 7b, and 7c, respectively.

We see that this strategy brings some success. Already a very low
threshold of $v = 0.05$ undercuts the success of five out of seven punters,
and only two still manage to take money off the casino. A slightly higher
threshold of $v = 0.1$ brings the number of successful punters down to one.
And for $v = 0.2$ the Freshman Apprentice achieves his goal of running a
sustainable casino.

The second way of shortening odds is damping. On this method the
betting quotients are given by $p_t(E_i) = 1 - b[1 - p_t^F(E_i)]$, where the
damping parameter $b$ is a real number $0 \le b \le 1$. We see that for
$b = 1$ the $p_t$ correspond to probabilities. We call odds thus calculated
damping odds. We now repeat the same calculations as above, and the results are
very similar (which is why we are not reproducing the graphs here). For
$b = 0.95$ only two punters succeed (indeed the same two as above). With a
slightly stronger damping of $b = 0.9$ only one is still winning (again the
same as above), and for $b = 0.8$ all punters are either losing or not playing
at all (because no bets in their range are on offer).
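
Damping can be sketched in the same way as the threshold rule. The function name `damped_quotient` and the model probability of 0.95 are illustrative assumptions; the values of $b$ are the ones mentioned above, together with the probability limit $b = 1$.

```python
from fractions import Fraction


def damped_quotient(p_model, b):
    """Betting quotient under damping: 1 - b * (1 - p_model)."""
    if not 0 <= b <= 1:
        raise ValueError("the damping parameter b must lie between 0 and 1")
    return 1 - b * (1 - p_model)


p_event = Fraction(19, 20)   # hypothetical model probability 0.95
p_negation = 1 - p_event     # 0.05 for the negation

for b in (Fraction(1), Fraction(19, 20), Fraction(9, 10), Fraction(4, 5)):
    q_event = damped_quotient(p_event, b)
    q_negation = damped_quotient(p_negation, b)
    # Columns: b, quotient on the event, quotient on its negation, their sum.
    print(b, q_event, q_negation, q_event + q_negation)
```

Adding the two quotients gives $[1 - b(1 - p)] + [1 - bp] = 2 - b$, so the sum over an event and its negation exceeds one as soon as $b < 1$, whatever the model probability.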

The moral of this last part of our tale is that shortening odds, either by
introducing a threshold or by damping, can provide some protection against
losses. In doing so the Freshman has attempted to introduce what we call
sustainable odds. There are no doubt better ways to construct sustainable
odds and better meet the challenges to their use in decision support. How
to construct more useful varieties of sustainable odds is the question for a
future project. For now we just note that while probability odds are easier
to use, using them leads to disaster. Furthermore, we can regard the amount
of deviation of the shortening parameters from their ‘probability limits’
(i.e., the deviation of $v$ from zero and of $b$ from one) as a measure of the
model inadequacy: the greater this deviation, the less adequate the model.

We would like to point out that this last part, too, is closer to reality than
it seems. The sustainable yet interesting casino is modeled on a cooperative
insurance company. Rather than playing for gain, the ‘bets’ placed are
insurance policies bought to compensate for losses suffered should certain
events happen. What makes our insurance a cooperative insurance is
its attempt to offer a full payout (to fully compensate its clients) at the
lowest rates that allow it to operate in a sustainable way (an insurance
company that goes bust is of little use). So our nonprobability odds casino
has a close real-world cousin, and the morals drawn above are relevant
beyond the tale of Laplace’s Demon.

So far we have shown that one is all but certain to go bust when allowing
bets on model probabilities. The conclusion of our argument might be seen
as a decision-theoretic one: that it is pragmatically advantageous to adopt
nonprobabilistic odds. This is not the interpretation we favor. We prefer to
see it as an epistemological argument, albeit one that involves talk of bet-
ting. We are not making any decision-theoretic assumptions in coming to
our conclusions. We mean for our agent to be shortening his odds due to
epistemological flaws, not just so as to avoid bad outcomes. Talk of casi-
nos, betting, and going bust helps to put an epistemic problem into focus—
the main point is that the pragmatic flaw (systematic and statistically
premature ruin) points to an epistemological flaw in the agent’s representation
of belief.

Needless to say, the use of nonprobability odds raises a host of issues.
How exactly should nonprobability odds inform decision making? Pre-
sented with nonprobability odds, what decision rules should we apply?
These are important questions for decision theory and rational choice, but
we cannot discuss these here.

One might try to dismiss these issues quickly by bringing them back into
well-charted territory, denying that nonprobability odds are really sui generis
items. Regarding them as such, so the argument goes,
is a red herring because, even if we have odds whose inverses do not add
up to one, it is trivial to renormalize them, and we then retrieve the homely
probabilities for which there are well-worked-out decision theories.

Unfortunately things are not so simple. The problem is that the $p$ do not
satisfy the axioms of probability even if they are renormalized to add up to
one. The source of the problem is that nonprobability odds do not respect
the symmetry between betting for and betting against that is enshrined in
probabilities. For probabilities, we have $p(E) + p(\neg E) = 1$ for any event
$E$.23 The betting quotients of nonprobability odds need not add up to one:
$p(E) + p(\neg E)$ can take any value greater than one (which is easy to see in
the case of threshold odds). For this reason the $p$ are not probabilities,
and renormalizing is not an easy route back into the well-charted territory of
probabilism. And, of course, the renormalized odds need not prove sustainable.
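
One way to see why renormalizing does not help is to look at additivity. The following sketch is our own illustration rather than a calculation from the casino experiment: it assumes a hypothetical partition into three atoms, applies the threshold rule with $v = 0.2$ to the atoms and to one compound event, and rescales all quotients by the same factor $s$ (itself an assumption about how 'renormalizing' would be carried out).

```python
from fractions import Fraction


def threshold_quotient(p_model, v):
    """Threshold rule: the betting quotient is never smaller than v."""
    return p_model if p_model > v else v


v = Fraction(1, 5)
# Hypothetical model probabilities for a partition into three atoms.
model = {"E1": Fraction(9, 10), "E2": Fraction(1, 20), "E3": Fraction(1, 20)}

quotients = {name: threshold_quotient(p, v) for name, p in model.items()}
s = sum(quotients.values())                           # 9/10 + 1/5 + 1/5 = 13/10
rescaled = {name: q / s for name, q in quotients.items()}

# The rule also applies to the compound event "E2 or E3" (model probability 1/10).
union_rescaled = threshold_quotient(model["E2"] + model["E3"], v) / s

print(sum(rescaled.values()))            # 1: the atoms now add up to one
print(union_rescaled)                    # 2/13
print(rescaled["E2"] + rescaled["E3"])   # 4/13: additivity fails
```

Because every quotient is divided by the same constant, the mismatch between the quotient on 'E2 or E3' and the sum of the quotients on E2 and E3 survives the rescaling, so the rescaled numbers still violate the additivity axiom.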

Furthermore, one might worry that these nonprobabilistic odds do not
have the requisite connection to degrees of belief in order for them to play
the role of fixing degrees of belief. That is, one might worry that such odds
allow one to avoid the pragmatically bad consequences of model error, but
they do not line up with degrees of belief. For example, Williamson (2010)
argues that symmetry (the claim that your limiting price to sell a bet should
be equal to your limiting price to buy that bet) is an intuitive part of what
he calls the ‘betting interpretation’ of degrees of belief: “While we do, in
practice, buy and sell bets at different rates, the rate at which we would
be prepared to both buy and sell, if we had to, remains a plausible
interpretation of strength of belief” (37). Others disagree and do suggest that
nonsymmetrical odds can serve as a (perhaps partial) characterization of
strength of belief (see, e.g., Dempster 1961; Good 1962; Levi 1974; Suppes
1974; Kyburg 1978; Walley 1991; Bradley 2012). If one knows one’s model
is imperfect, it is hard to see a successful case in favor of symmetrical odds
from model-based probabilities as relevant to rational belief or action.

23. Odds-for for the negation are derived from probabilities by taking
$p(\neg E) = 1 - p(E)$ and then applying the shortening rule.

Figure 7. Wealth of punters as a function of the number of rounds played with the
casino offering threshold odds, with thresholds of (a) 0.05, (b) 0.1, and (c) 0.2.
Color version available as an online enhancement.

We would not like to leave the issue without a brief remark about Dutch
books. One might worry that our Freshman is subject to a Dutch book when
he offers nonprobabilistic odds. That is, one might worry that a smarter
bettor might be able to make money out of the apprentice by
buying a set of bets that guarantee the bettor a sure gain, whatever happens.
This is not the case, for the same reason that casinos cannot be Dutch
booked: in a casino, you cannot bet on ‘not red’ with symmetrical probability
to ‘red’.
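
The arithmetic behind this remark can be made explicit for the simplest case, in which the bettor can only buy the bets on offer and backs every event in a complete set so as to receive the same payout whatever happens. The sketch below is a minimal illustration under that assumption; the function name `sure_gain` and the quotients are hypothetical, with the second list taken from the threshold example with $v = 0.2$.

```python
from fractions import Fraction


def sure_gain(quotients, payout=1):
    """Bettor's guaranteed net gain from backing every event in a complete set.

    A stake of payout * quotient on each event returns `payout` whichever
    event occurs, so the net gain is the payout minus the total stake.
    """
    total_stake = sum(payout * q for q in quotients)
    return payout - total_stake


probability_quotients = [Fraction(19, 20), Fraction(1, 20)]  # sum to one
threshold_quotients = [Fraction(19, 20), Fraction(1, 5)]     # threshold v = 0.2

print(sure_gain(probability_quotients))  # 0
print(sure_gain(threshold_quotients))    # -3/20
```

With quotients that sum to one the bettor breaks even, and with shortened odds covering all outcomes costs more than it returns; a sure gain would require quotients summing to less than one, which shortening never produces.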

In connection with this point, it is worth pointing out an analogy between
the current project and the standard Dutch book argument. The latter argues
from a pragmatic flaw (being subject to a Dutch book) to an epistemic
conclusion (your degrees of belief ought to satisfy the probability
calculus). We take ourselves to be doing the same sort of thing: we argue from
a pragmatic flaw (houses go bust faster than expected, statistically) to an
epistemic conclusion (nonprobability odds). That is, we do not take
ourselves to be merely making the point that one can avoid going bankrupt by
shortening one’s odds. We are making the stronger claim that in the presence
of model error, model probabilities sanction only nonprobability degrees of
belief.

We conclude this section with an explanation of why one final response
to our argument will not succeed. One might respond that we get wrong
probabilities because we use probabilities in a bad way. From a Bayesian
perspective one could point out that by using one particular model to gen-
erate predictions we have implicitly assigned a prior probability of 1 to that
model. Given that we have no reason to assume that this model is true—
indeed, there are good reasons to assume that it is not—this confidence is
misplaced, and one really ought to take uncertainty about the model into
account. This can be done by using probabilities: put a probability measure
on the space of all models that expresses our uncertainty about the true
model, generate predictions with all those models, and take some kind of
weighted aggregate of the result. This, so the argument goes, would avoid
the above problem, which is rooted in completely ignoring second-order
uncertainty about models.

Setting aside the fact that it is unfeasible in practice to generate
predictions with an entire class of models, there are theoretical limitations
that ground the project. The first problem is that it is not clear how to
circumscribe the relevant model class. This class would contain all possible
models of a target system. But the phrase ‘all models’ masks the fact that
mathematically this class is not defined, and indeed it is not clear whether it
is definable at all. The second problem is that even if one could construct
such a class in one way or another, there are both technical and conceptual
problems with putting an uncertainty measure over this class. The technical
problem is that the relevant class of models would be a class of functions,
and function spaces do not come equipped with measures. In fact, it is not
clear how to put a measure on function spaces.24 The conceptual issue is
that even if the technical problem could be circumvented somehow, what
measure would we choose? The model class will contain an infinity of
models, and it is at best unclear whether there is a nonarbitrary measure on
such a set that reflects our uncertainty about model choice. And even if
one can form a revised probability distribution in light of higher-order
doubt about the model, it will still be inaccurate relative to the
distribution given by the true model.25 Finally, we, like the Freshman, are
restricted to sampling from the set of all conceivable models, which need not
contain a perfect model even if such a thing exists. For these reasons this
response does not seem to be workable.

24. This is a well-known problem in the foundations of statistical mechanics; see Frigg
and Werndl (2012).

7. Conclusion. We have argued that model imperfection in the presence of
nonlinear dynamics is a poison pill: treating model outputs as probabil-
ity predictions can be seriously misleading. Many operational probability
forecasts are therefore unreliable as a guide to rational action if interpreted
as providing the probability of various outcomes. Yet not all the models
underlying these forecasts are useless.

This raises the question of what conclusion we are to draw from the insight
into the unreliability of models. An extreme reaction would be to simply
get rid of them. But this would probably amount to throwing out the baby
with the bathwater because imperfect models can be qualitatively infor-
mative. Restricting models to tasks of purely qualitative understanding is
also going too far. The question is how we can use the model where it
provides insight while guarding against damage where it does not. Finding
a way of doing this is a challenge for future research. We have indicated
that one possible route could be to use nonprobability odds, but more needs
to be said about how these can be used to provide decision support, and
there may be altogether different ways of avoiding the difficulties we
sketch. Our hope is merely that this article leads to a wider acknowledgment that
these challenges are important and their solution nontrivial.

REFERENCES

Andronov, Aleksandr A., and Lev Pontrjagin. 1937. “Systèmes Grossiers.” Doklady of the Academy
of Sciences of the USSR 14:247–51.

Arnold, Vladimir I., and André Avez. 1968. Ergodic Problems of Classical Mechanics. New York:
Benjamin.

Barreira, Luis, and Claudia Valls. 2012. Ordinary Differential Equations: Qualitative Theory. Wash-
ington, DC: American Mathematical Society.

Batterman, Robert W. 1993. “Defining Chaos.” Philosophy of Science 60:43–66.

Berger, Arno. 2001. Chaos and Chance: An Introduction to Stochastic Aspects of Dynamics.
Hamburg: de Gruyter.

25. We thank an anonymous referee for drawing our attention to this point.

Bradley, Seamus. 2012. “Dutch Book Arguments and Imprecise Probabilities.” In Probabilities,
Laws, and Structures, ed. Dennis Dieks, Wenceslao Gonzalez, Stephan Hartmann, Michael
Stoeltzner, and Marcel Weber, 3–17. Berlin: Springer.

Cover, Thomas M., and Joy A. Thomas. 1991. Elements of Information Theory. New York: Wiley.

Dempster, Arthur. 1961. “Upper and Lower Probabilities Induced by a Multivalued Mapping.”
Annals of Mathematical Statistics 38:325–39.

Earman, John. 1986. A Primer on Determinism. Dordrecht: Reidel.

Fan, Shu, and Rob J. Hyndman. 2012. “Short-Term Load Forecasting Based on a Semi-parametric
Additive Model.” IEEE Transactions on Power Systems 27:134–41.

Frigg, Roman, Leonard A. Smith, and Dave A. Stainforth. 2013. “The Myopia of Imperfect Climate
Models: The Case of UKCP09.” Philosophy of Science 80, no. 5, forthcoming.

Frigg, Roman, and Charlotte Werndl. 2012. “Demystifying Typicality.” Philosophy of Science
79:917–29.

Good, Irving J. 1962. “Subjective Probability as the Measure of a Non-measurable Set.” In Logic,
Methodology and Philosophy of Science, ed. Ernest Nagel, Patrick Suppes, and Alfred Tarski,
319–29. Stanford, CA: Stanford University Press.

Hagedorn, R., and Leonard A. Smith. 2009. “Communicating the Value of Probabilistic Forecasts
with Weather Roulette.” Meteorological Applications 16:143–55.

Hayashi, Shuhei. 1997. “Invariant Manifolds and the Solution of the C¹-Stability and Ω-Stability
Conjectures for Flows.” Annals of Mathematics 145:81–137.

Jenkins, Geoff, James Murphy, David Sexton, Jason Lowe, and Phil Jones. 2009. “UK Climate
Projections.” Briefing report, Department for Environment, Food and Rural Affairs, London.

Judd, Kevin. 2007. “Nonprobabilistic Odds.” Working paper, University of Western Australia.

Judd, Kevin, and Leonard A. Smith. 2004. “Indistinguishable States.” Pt. 2, “The Imperfect Model
Scenario.” Physica D 196:224–42.

Kellert, Stephen. 1993. In the Wake of Chaos. Chicago: University of Chicago Press.

Kyburg, Henry. 1978. “Subjective Probability: Criticisms, Reflections, and Problems.” Journal of
Philosophical Logic 7:157–80.

Laplace, Marquis de. 1814. A Philosophical Essay on Probabilities. New York: Dover.

Levi, Isaac. 1974. “On Indeterminate Probabilities.” Journal of Philosophy 71:391–418.

Mañé, Ricardo. 1988. “A Proof of the C¹ Stability Conjecture.” Publications Mathématiques de
l’Institut des Hautes Études Scientifiques 66:161–210.

McGuffie, Kendal, and Ann Henderson-Sellers. 2005. A Climate Modelling Primer. Chichester,
NY: Wiley.

McWilliams, James C. 2007. “Irreducible Imprecision in Atmospheric and Oceanic Simulations.”
Proceedings of the National Academy of Sciences 104:8709–13.

Orrell, David, Leonard A. Smith, Tim Palmer, and Jan Barkmeijer. 2001. “Model Error in Weather
Forecasting.” Nonlinear Processes in Geophysics 8:357–71.

Palis, Jacob, and Stephen Smale. 1970. “Structural Stability Theorems.” In Global Analysis, ed.
Shiing-Shen Chern and Stephen Smale, 223–31. Proceedings of Symposia in Pure
Mathematics 14. Providence, RI: American Mathematical Society.

Peixoto, Marilia C., and Maurício M. Peixoto. 1959. “Structural Stability in the Plane with Enlarged
Boundary Conditions.” Anais da Academia Brasileira de Ciências 31:135–60.

Peixoto, Maurício. 1962. “Structural Stability on Two Dimensional Manifolds.” Topology 2:
101–21.

Robinson, Clark. 1976. “Structural Stability of C¹ Diffeomorphisms.” Journal of Differential
Equations 22:28–73.

Smale, Stephen. 1966. “Structurally Stable Systems Are Not Dense.” American Journal of Math-
ematics 88:491–96.

———. 1967. “Differentiable Dynamical Systems.” Bulletin of the American Mathematical Soci-
ety 73:747–817.

Smith, Leonard A. 1992. “Identification and Prediction of Low-Dimensional Dynamics.” Physica
D 58:50–76.

———. 2002. “What Might We Learn from Climate Forecasts?” Proceedings of the National
Academy of Sciences of the USA 4:2487–92.

———. 2006. “Predictability Past Predictability Present.” In Predictability of Weather and Cli-
mate, ed. Tim Palmer and Renate Hagedorn, 217–50. Cambridge: Cambridge University
Press.

———. 2007. Chaos: A Very Short Introduction. Oxford: Oxford University Press.

Smith, Peter. 1998. Explaining Chaos. Cambridge: Cambridge University Press.

Snyder, Ralph D., J. Keith Ord, and Adrian Beaumont. 2012. “Forecasting the Intermittent
Demand for Slow-Moving Inventories: A Modelling Approach.” International Journal of
Forecasting 28:485–96.

Suppes, Patrick. 1974. “The Measurement of Belief.” Journal of the Royal Statistical Society B
36:160–91.

Thompson, Erica L. 2013. “Modelling North Atlantic Storms in a Changing Climate.” PhD diss.,
Imperial College, London.

Walley, Peter. 1991. Statistical Reasoning with Imprecise Probabilities. London: Chapman & Hall.

Werndl, Charlotte. 2009. “What Are the New Implications of Chaos for Unpredictability?” British
Journal for the Philosophy of Science 60:195–220.

Williamson, Jon. 2010. In Defense of Objective Bayesianism. Oxford: Oxford University Press.
