The No Alternatives Argument

Richard Dawid, Stephan Hartmann, Jan Sprenger

February 9, 2013

Abstract

Scientific theories are hard to find, and once scientists have found a theory H, they often believe that there are not many distinct alternatives to H. But is this belief justified? What should scientists believe about the number of alternatives to H, and how should they change these beliefs in the light of new evidence? These are some of the questions that we will address in this paper. We also ask under which conditions failure to find an alternative to H confirms the theory in question. This kind of reasoning (which we call the No Alternatives Argument) is frequently used in science and therefore deserves a careful philosophical analysis.

Contents

1 Introduction
2 The conceptual framework
3 The No Alternatives Argument
4 Discussion I: A quantitative analysis of NAA
5 Discussion II: The number of alternatives and the problem of underdetermination
6 Conclusions
A Proof of the No Alternative Theorems
B Proof of the results in Section 5

1 Introduction

We typically confirm or disconfirm a scientific hypothesis with a piece of empirical evidence. For example, the observation of a black raven confirms the hypothesis that all ravens are black, and certain clicks in a particle detector confirm the existence of the top quark. However, there are situations where empirical evidence is unattainable over long periods of time. Such situations arise with particular force in contemporary high energy physics, where the characteristic empirical signatures of theories like Grand Unified Theories or string theory must be expected to lie many orders of magnitude beyond the reach of present-day experimental technology. They are entirely common also in scientific fields such as palaeontology or anthropology, where scientists must rely on the scarce and haphazard empirical evidence they happen to find in the ground.
Interestingly, scientists are at times quite confident regarding the adequacy of their theories even when empirical evidence is largely or entirely absent. Trust in a theory H in such cases must be based on what we call non-empirical evidence. Such evidence may be gathered by observation, but it does not fall into the intended domain of H, and it cannot be related to H by another scientific theory. In particular, we cannot directly calculate the probability of the evidence given theory H. Rather than pertaining to any phenomenon described by H, non-empirical evidence for H can, for example, consist in observations about the research process leading up to the construction of H.

From an empiricist point of view, arguments relying on non-empirical evidence may be regarded as mere speculation: they neither contribute to actual theory confirmation nor do they have objective scientific weight. We challenge this claim by exploring the following case: scientists develop a considerable degree of trust in a theory H because, despite considerable efforts, no alternatives to H have been found that meet a number of theoretical and empirical constraints. We call this argument the No Alternatives Argument (NAA).[1] If valid, it would demonstrate the possibility of non-empirical theory confirmation.

[1] The name of the argument stems from its crucial premise (scientists have not yet found a suitable alternative to H); it does not draw the conclusion that there are no alternatives to H.

In order to formalise this argument, we introduce the concept of the number of alternative theories to H. Then, we relate it to the empirical adequacy of H and the scientists’ success at finding an alternative (Sect. 2). On that basis, we construct a probabilistic model of NAA and prove the possibility of non-empirical theory confirmation (Sect. 3). Next, we show that the significance of NAA in scientific reasoning depends on the scientists’ subjective judgements (Sect. 4).
We provide a couple of results that show how these judgements are affected by evidence, and tentatively explore a Meta-Inductive Argument (MIA) for intersubjective agreement on the number of alternatives to H (Sect. 5). Finally, we put our findings into a broader context and briefly look at applications in epistemology and philosophy of science (Sect. 6). Throughout the paper, we operate in the framework of Bayesian epistemology.[2]

[2] Recent surveys of Bayesian epistemology are Hájek and Hartmann ([2010]) and Hartmann and Sprenger ([2010]). Applications of Bayesian epistemology to scientific reasoning are given in Bovens and Hartmann ([2003]) and Howson and Urbach ([2006]). Throughout this paper, we follow the convention that propositional variables are printed in italic script, and that the instantiations of these variables are printed in roman script.

2 The conceptual framework

In order to understand the problem of non-empirical theory confirmation, we contrast it with its empirical counterpart. We call some evidence E empirical evidence for H if and only if
(i) E falls into the (broadly construed) intended domain of H;
(ii) E is logically or probabilistically related to H;
(iii) E is observed.

Consider now the proposition T that hypothesis H is empirically adequate, which we understand as the property of being consistent with past and future observations. If one measures confirmation in terms of increase in degrees of belief, as Bayesians typically do, T is confirmed by E whenever P(T|E) > P(T). Throughout the paper, we use this inequality as a criterion for when a piece of evidence confirms a hypothesis. The evidence can be observed perceptually or by means of measurement instruments, as is common in modern science.

We would like to investigate whether the observation F (the lack of alternatives to H) constitutes evidence for T in this Bayesian sense. But how is it possible that F raises the subjective degree of belief in T?
After all, F is neither deductively nor probabilistically implied by H. It does not even fall into the intended domain of H. Does F then qualify as (non-empirical) evidence in an argument from ignorance (Walton [1995]; Hahn and Oaksford [2007]; Sober [2009]), such as: if H were not empirically adequate, then we would have disproved it before?

The most plausible way to solve this problem is to deploy a two-step process. First, we find a statement that does predict evidence of the type F. Then, we show that this statement is probabilistically relevant to the empirical adequacy of H.

In the case of NAA, our non-empirical evidence F_A consists in the fact that scientists have not found any alternatives to a specific solution of a research problem, despite looking for them with considerable energy and for a long time. Then it is straightforward to identify a natural candidate for a statement that predicts F_A, namely that there are few or no alternative theories to H. Apparently, this would render F_A more likely than a scenario where a large (not necessarily infinite) number of possible alternative theories can be constructed: in the latter case, one might expect that scientists would have already found an alternative.

The number k of possible scientific theories which can account for a certain set of data is in turn relevant for the degree of belief in the empirical adequacy of H. We assume that scientists who develop a theory in accordance with available data do not have a perfectly reliable method to select the true theory if false theories can be constructed that are also consistent with the available data. Under this condition, a lower number of possible scientific theories that can account for a certain set of empirical data increases the degree of belief that the theory developed by scientists is adequate.
The stated assumption seems to be fairly plausible in science: scientists often come up with an incorrect, but fruitful theory when they begin to investigate a new field. Bohr’s model of the atom is perhaps a good example of this claim.

Based on this reasoning, we introduce a random variable Y measuring the number of alternatives to H and taking values in the natural numbers. Y_k := {Y = k} expresses the proposition that there are k adequate and distinct alternatives which satisfy a set of theoretical constraints C, are consistent with the existing data D, and give distinguishable predictions for the outcome of some set E of future experiments. We will later show that, via its effect on the Y_k, the non-empirical evidence F_A confirms H under plausible conditions.

Inferences about the number of alternatives to a theory H naturally depend on what counts as a genuine alternative. This is, in turn, sensitive to the specific scientific context. Therefore we leave the individuation problem to the scientific community, which typically has the best grip on what should count as a distinct theory. Moreover, for the No Alternatives Argument, we only require the premise that the number of alternatives to H be possibly finite, that is,

P(Y = ∞) < 1.    (1)

In order to motivate this assumption, we put two constraints on the individuation of scientific theories which, in our opinion, duly reflect accepted scientific practice.

First, different theories make different predictions. If two theories make exactly the same predictions, then we consider them to be identical. For example, we consider the De Broglie-Bohm version and the Copenhagen version of quantum mechanics as representing the same theory as long as they do not give different empirical predictions (Cushing [1994]). As a consequence, we are only interested in arriving at empirically adequate theories, and not in the more ambitious goal of finding theories that are true under a given interpretation (cf.
van Fraassen [1980], and footnote 5 later on).

Second, different theories provide different solutions to a given scientific problem. That is, theories which only differ in a detail, such as the precise value of a parameter or the existence of a physically meaningless dummy variable, do not count as different theories. For example, the simple Higgs model in particle physics is treated as one theory, although the hypothesised and perhaps finally discovered Higgs particle could have different mass values. Generally, if it were enough to slightly modify the value of a certain parameter in order to arrive at a new theory, then coming up with new theories would be an easy and not very creative task. Inventing a novel mechanism or telling a new story of why a certain phenomenon came about is much harder.

Beyond these general guidelines, the specifics of theory individuation must depend on the context in which a no alternatives argument is formulated by scientists. Scientists often formulate no alternatives arguments at the level of general conceptual principles while allowing for a large spectrum of specific realizations of those principles.[3] In such cases, they have a fairly inclusive conception of what counts as one theory. But apart from satisfying (1), our argument does not depend on how these distinctions are made.

[3] For example, since the 1980s particle physicists strongly supported a no alternatives argument with respect to the Higgs mechanism. That is, they believed that no alternatives to a gauge theory that was spontaneously broken by a Higgs sector of scalar fields could account for the available empirical data. They did not claim, however, that only one way of realizing the Higgs sector was possible. They knew that the Higgs sector could consist of one or several complex scalar fields, of elementary or constituent scalars, etc. Therefore, physicists strongly believed, based on NAA, that the Higgs sector would be observed at the LHC experiment but did not have particular trust in any of the specific models of the Higgs sector. Their NAA clearly was placed at the level of physical principles rather than specific models.

We are now ready to proceed to a formal analysis of the No Alternatives Argument.

3 The No Alternatives Argument

In the No Alternatives Argument or NAA, the non-empirical evidence consists in the observation that scientists have not yet found an alternative to H. This observation is taken to indicate that there are actually not too many alternatives to H, and thus indirectly as an argument for H. Focusing on the case of string theory, Dawid ([2006], [2009]) calls this the argument of no choice.

Following this line of reasoning, we reconstruct NAA based on the notion that there exists a specific but unknown number k of possible scientific theories. As stated above, these theories have to satisfy constraints C, explain data D and predict the outcomes of the experiments E. We will then show that failure to find an alternative to H raises the probability of H being empirically adequate and thus confirms H.

To do so, we introduce the binary propositional variables T and F_A, already briefly encountered in Sect. 2. T takes the values

T: The hypothesis H is empirically adequate.
¬T: The hypothesis H is not empirically adequate.

and F_A takes the values

F_A: The scientific community has not yet found an alternative to H that fulfills C, explains D and predicts the outcomes of E.
¬F_A: The scientific community has found an alternative to H that fulfills C, explains D and predicts the outcomes of E.

We would now like to explore under which conditions F_A confirms H, that is, when

P(T|F_A) > P(T).    (2)

This inequality suggests a direct influence of T on F_A. See Figure 1 for a Bayesian Network representation of this scenario.

[Figure 1: The Bayesian Network representation of the two-propositions scenario, with nodes T and F_A.]
But since a direct influence is blocked by the non-empirical nature of F_A, we introduce a third variable Y which mediates the connection between T and F_A. As in the previous section, Y has values in the natural numbers, and Y_k corresponds to the proposition that there are exactly k hypotheses that fulfil C, explain D and predict the outcomes of E.

We should also note that the value of F_A (whether or not scientists find an alternative to H) depends not only on the number of available alternatives, but also on the difficulty of the problem, the cleverness of the scientists, or the available computational, experimental, and mathematical resources. Call the variable that captures these complementary factors D, and let it take values in the natural numbers, with D_j := {D = j} and d_j := P(D_j). The higher the values of D, the more difficult the problem.[4] It is clear that D has no direct influence on Y and T (or vice versa), but that it matters for F_A and that this influence has to be represented in our Bayesian Network.

[4] For the purpose of our argument, it is not necessary to assign a precise operational meaning to the various levels of D. It is sufficient that they satisfy a natural monotonicity assumption with regard to their impact on F_A; see condition A3 in the main text.

We now list five plausible assumptions that we need for showing the validity of the No Alternatives Argument.

A1. The variable T is conditionally independent of F_A given Y:

T ⊥⊥ F_A | Y    (3)

Hence, learning that the scientific community has not yet found an alternative to H does not alter our belief in the empirical adequacy of H if we already know that there are exactly k viable alternatives to H.

[Figure 2: The Bayesian Network representation of the four-propositions scenario, with nodes Y, T, F_A and D.]

A2.
The variable D is (unconditionally) independent of Y:

D ⊥⊥ Y    (4)

Recall that D represents the aggregate of those context-sensitive factors that affect whether scientists find an alternative to H, but that are not related to the number of suitable alternatives. In other words, D and Y are orthogonal to each other by construction.

These are our most important assumptions, and we consider them to be eminently sensible. Figure 2 shows the corresponding Bayesian Network. To complete it, we have to specify the prior distribution over D and Y and the conditional distributions over F_A and T, given the values of their parents. This is done in the following three assumptions.

A3. The conditional probabilities

f_kj := P(F_A | Y_k, D_j)    (5)

are non-increasing in k for all j ∈ N and non-decreasing in j for all k ∈ N.

The (weak) monotonicity in the first argument reflects the intuition that for fixed difficulty of a problem, a higher number of alternatives does not decrease the likelihood of finding an alternative to H. The (weak) monotonicity in the second argument reflects the intuition that increasing difficulty of a problem does not increase the likelihood of finding an alternative to H, provided that the number of alternatives to H is fixed.

A4. The conditional probabilities

t_k := P(T | Y_k)    (6)

are non-increasing in k.

This assumption reflects the intuition that an increase in the number of alternative theories does not make it more likely that scientists have already identified an empirically adequate theory.

A5. There is at least one pair (i, k) with i < k for which (i) y_i y_k > 0, where y_k := P(Y_k), (ii) f_ij > f_kj for some j ∈ N, and (iii) t_i > t_k.

In particular, this assumption implies that y_k < 1 for all k ∈ N because otherwise, a pair satisfying (i) could not be found.

With these five assumptions, we can show the following (proof in appendix A):

Theorem 1. If Y takes values in the natural numbers N and assumptions A1 to A5 hold, then F_A confirms T, that is, P(T|F_A) > P(T).
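As a rough numerical illustration of Theorem 1 (not a substitute for the proof in appendix A), the following sketch builds one small instance of the network in Figure 2 and compares P(T|F_A) with P(T). All parameter values (`y`, `d`, `t`, `f`) are hypothetical and were chosen only so that A1 to A5 hold; Y and D are truncated to a handful of values for convenience.

```python
# Hypothetical parameters, chosen only to satisfy assumptions A1-A5.
y = [0.2, 0.3, 0.3, 0.2]                    # y[k] = P(Y = k), k = 0..3 alternatives
d = [0.5, 0.3, 0.2]                         # d[j] = P(D = j), j = 0..2 difficulty levels
t = [1 / (k + 1) for k in range(len(y))]    # t[k] = P(T | Y_k), non-increasing in k (A4)


def f(k, j):
    """f_kj = P(F_A | Y_k, D_j): non-increasing in k, non-decreasing in j (A3)."""
    return (j + 1) / (j + 1 + k)


def naa_probabilities(y, d, t, f):
    """Return (P(T), P(F_A), P(T | F_A)) for the network Y -> T, (Y, D) -> F_A."""
    ks, js = range(len(y)), range(len(d))
    # P(F_A | Y_k): marginalise out D, using the independence of D and Y (A2).
    fa_given_k = [sum(d[j] * f(k, j) for j in js) for k in ks]
    p_t = sum(y[k] * t[k] for k in ks)
    p_fa = sum(y[k] * fa_given_k[k] for k in ks)
    # A1: T and F_A are independent given Y, so P(T, F_A | Y_k) = t_k * P(F_A | Y_k).
    p_t_and_fa = sum(y[k] * t[k] * fa_given_k[k] for k in ks)
    return p_t, p_fa, p_t_and_fa / p_fa


p_t, p_fa, p_t_given_fa = naa_probabilities(y, d, t, f)
print(f"P(T) = {p_t:.4f}, P(F_A) = {p_fa:.4f}, P(T|F_A) = {p_t_given_fa:.4f}")
```

With these illustrative numbers the posterior exceeds the prior, as the theorem requires; how large the gap is depends entirely on the chosen parameters.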
We have therefore shown that F_A confirms the empirical adequacy of H under rather weak and plausible assumptions.

In line with the introduction of Y in section 2, we have assumed that Y only takes values in the natural numbers. This might be seen as evading the skeptical argument that there may be infinitely many (theoretically adequate, empirically successful, ...) alternatives to H. Therefore we now extend the theorem by explicitly allowing for the possibility Y_∞ := {Y = ∞}, and we modify our assumptions accordingly. In particular, we observe that A5 entails P(Y_∞) < 1, define f_∞j := P(F_A | Y_∞, D_j) and t_∞ := P(T | Y_∞), and demand that

f_ij ≥ f_∞j for all i, j ∈ N,    f_∞i ≤ f_∞j for all i, j ∈ N with i < j,    (7)

t_i ≥ t_∞ for all i ∈ N.    (8)

These requirements naturally extend assumptions A3 and A4 to the case of infinitely many alternatives. Then, we obtain the following generalization of the NAA:

Theorem 2. If Y takes values in N ∪ {∞} and assumptions A1 to A5 hold together with their extensions (7) and (8), then F_A confirms T, that is, P(T|F_A) > P(T).

That is, even if we concede to the skeptic that there may be infinitely many alternatives to H, she must still acknowledge the validity of NAA as long as her degrees of belief satisfy P(Y_∞) < 1.[5] This is, to our mind, a quite substantial and surprising result. Philosophy of science has focused on logical and probabilistic relations between theory and evidence, but has neglected this particular evidential support and failed to acknowledge its validity.

[5] At this point, the reader is able to acknowledge why we make inferences about the empirical adequacy of H, instead of its truth. If we made inferences about truth, we could always construct infinitely many empirically indistinguishable alternatives to H (e.g., by introducing meaningless dummy variables), jeopardizing the validity of NAA.

Note that only a dogmatic skeptic who insists on P(Y_∞) = 1 can deny the validity of NAA. But Theorem 2 convinces anyone whose attitude is genuinely skeptical, that is, someone who does not want to commit herself with respect to the probability of (in)finitely many alternatives to H. Convincing such a fair and non-committal skeptic is, to our mind, much more important than convincing dogmatists who just deny our premises.

The following two sections discuss the amount of confirmation that NAA confers on H, and provide more mathematical results regarding the problem of assessing the number of alternatives to H.

4 Discussion I: A quantitative analysis of NAA

We have seen that NAA can be used in support of a proposed theory. The question remains, however, whether the resulting support is of significant strength and whether using NAA in a specific situation is justified. To facilitate the reading, we conduct this analysis for the finite case (Theorem 1) only; the infinite case is analogous.

The Bayesian Network representation of NAA in Figure 2 suggests that such significance is difficult to attain by NAA on its own without further supportive reasoning. According to Figure 2, F_A may confirm an instance of D (limitations to the scientists’ abilities to solve difficult problems) as well as an instance of Y (limitations to the number of possible theories). It is then easy to see that for all l ∈ N,

$$P(D_l \mid F_A) = \frac{P(D_l, F_A)}{P(F_A)} = \frac{d_l \sum_k y_k f_{kl}}{\sum_{j,k} d_j \, y_k \, f_{kj}}. \tag{9}$$

Hence the ratio measure of confirmation[6] can be computed as

$$r(D_l, F_A) := \frac{P(D_l \mid F_A)}{P(D_l)} = \frac{\sum_k y_k f_{kl}}{\sum_{j,k} d_j \, y_k \, f_{kj}}. \tag{10}$$

We cannot provide fully general conditions for when this expression is greater than 1. However, we observe that the expression on the right hand side of equation (10) is non-decreasing in l since the f_kl are non-decreasing in l for fixed k (see assumption A3). That is, the degree of confirmation that F_A lends to D_l, as expressed by the ratio measure, typically increases with l.
Thus, F_A confirms the claim that the problem at hand is rather complicated (i.e., that it has a high rank l) and disconfirms the claim that it is not particularly complicated (i.e., that it has a low rank l). The turning point l∗ depends on the precise values of the parameters in question.

To accentuate the resulting problem, note that the situation could be such that D∗ := {D ≥ l∗} (the proposition that the problem has difficulty rank l∗ or higher) receives more confirmation than T. While failure to find an alternative confirms the empirical adequacy of H, this failure would also confirm, and to a larger degree, the hypothesis that the problem is too complicated for our current science. This alternative explanation of F_A weakens the significance of NAA. To successfully apply NAA, one has to supplement the qualitative claim shown in section 3 with a comparative claim, namely that F_A confirms T more than D∗. But such a statement is sensitive to the specific parameter assignments as well as to the chosen confirmation measure, and is therefore hard to prove in general.

[6] The choice of this particular confirmation measure is guided by simplicity and transparency; nothing significant changes if we move to a different confirmation measure.

So far we have left the parameters d_j, f_kj, t_k and y_k largely unrestricted and assumed that they reflect the subjective degrees of belief of a scientist. Hence, different scientists may assign different values to these parameters, which implies that the assessment of the significance of NAA will differ from scientist to scientist in the absence of further rational constraints. Given that science strives for objectivity or at least for intersubjective agreement, this is an unfortunate situation. So we have to examine what empirical evidence can tell us about the (probable) number of alternatives to a given theory H. This is the subject of the next section.
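To make the comparison between difficulty and empirical adequacy concrete, the ratio measure of equation (10) can be evaluated for a small example. The sketch below is only an illustration: the values of `y`, `d`, `t` and the functional form of `f` are hypothetical assumptions, not parameters proposed in this paper. It computes r(D_l, F_A) for each difficulty level l and, for comparison, r(T, F_A) = P(T|F_A) / P(T).

```python
# Hypothetical parameters in the spirit of A2-A4; not taken from the paper.
y = [0.2, 0.3, 0.3, 0.2]                    # y[k] = P(Y = k)
d = [0.5, 0.3, 0.2]                         # d[l] = P(D = l)
t = [1 / (k + 1) for k in range(len(y))]    # t[k] = P(T | Y_k)


def f(k, l):
    """f_kl = P(F_A | Y_k, D_l); non-increasing in k, non-decreasing in l."""
    return (l + 1) / (l + 1 + k)


K, L = range(len(y)), range(len(d))
# P(F_A): the common denominator of equations (9) and (10).
p_fa = sum(d[j] * y[k] * f(k, j) for j in L for k in K)


def ratio_d(l):
    """Ratio measure r(D_l, F_A) = P(D_l | F_A) / P(D_l), as in equation (10)."""
    return sum(y[k] * f(k, l) for k in K) / p_fa


def ratio_t():
    """Ratio measure r(T, F_A) = P(T | F_A) / P(T), for comparison."""
    p_t = sum(y[k] * t[k] for k in K)
    p_t_and_fa = sum(d[j] * y[k] * t[k] * f(k, j) for j in L for k in K)
    return p_t_and_fa / (p_fa * p_t)


r_d = [ratio_d(l) for l in L]
r_t = ratio_t()
print("r(D_l, F_A):", [round(r, 4) for r in r_d])
print("r(T, F_A):  ", round(r_t, 4))
```

In this instance the ratio for the lowest difficulty level falls below 1 and the ratios for the higher levels exceed 1, in line with the monotonicity noted above. Whether any difficulty level is confirmed more strongly than T depends on the parameter choice, which is the sensitivity the text points to.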
5 Discussion II: The number of alternatives and the problem of underdetermination

In this section, we present some results on how evidence changes our beliefs about the number of alternatives to a theory H. In other words, we try to develop a dynamic perspective on the underdetermination problem, as opposed to the more static NAA.

Some of our results here only apply to an arbitrarily large, but finite number of alternatives. Notably, this does not trivialize the problem. As the following result shows, the fundamental problem of theoretical underdetermination persists even in that case (proof of all results in this section in appendix B):

Proposition 1. For any N ∈ N and any 1 ≥ ε > 0, an agent’s belief function P may jointly satisfy (i) P(Y = ∞) = 0, (ii) P(Y ≤ N) ≥ 1 − ε, and (iii) ⟨Y⟩ := Σ_{k=0}^{∞} k P(Y_k) = ∞.

In this notation, ⟨Y⟩ denotes the expectation value of Y. In other words, an agent might rule out an infinite number of alternatives to H, be strongly convinced that there are few alternatives to H, and yet retain the belief that our best guess regarding the number of alternatives to H is “indefinitely large” or “greater than any number that we can imagine”.[7]

Proposition 1 thus points out the possibility of a strong epistemic tension within a single agent regarding the number of alternatives to a theory H. This tension gives an interesting twist to the problem of theoretical underdetermination: the agent might believe that H is fundamentally underdetermined by evidence (because our best guess for the number of alternatives is indefinitely large), but still rule out that there are infinitely many alternatives.

Let us now study whether such a belief structure is responsive to evidence E, be it empirical or non-empirical.
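One way to see that the three conditions of Proposition 1 are jointly satisfiable is to write down an explicit belief function. The sketch below uses a hypothetical prior, chosen purely for illustration and not taken from appendix B: mass 1 − ε on Y = 0 and the remaining ε spread proportionally to 1/k² over k ≥ 1. A finite computation cannot exhibit an infinite expectation, so the code only shows that the truncated expectation keeps growing (roughly logarithmically) with the cut-off; the divergence itself follows from the divergence of the harmonic series.

```python
import math


def prior(k, eps):
    """Hypothetical prior: mass 1 - eps on Y = 0, the rest spread as c / k**2 over k >= 1."""
    if k == 0:
        return 1.0 - eps
    # The sum over k >= 1 of 6 / (pi^2 k^2) equals 1, so the tail carries total mass eps.
    return eps * 6.0 / (math.pi ** 2 * k ** 2)


def partial_mean(n, eps):
    """Truncated expectation: sum of k * P(Y = k) for k = 0..n."""
    return sum(k * prior(k, eps) for k in range(n + 1))


eps = 0.01
# P(Y <= N) >= P(Y = 0) = 1 - eps for every N, so condition (ii) holds for any N.
for n in (10, 10**3, 10**5):
    print(n, round(partial_mean(n, eps), 4))
```

The printed values increase without levelling off, even though 99% of the probability mass sits on Y = 0.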
First, we ask the following question: Can an agent who believes that ⟨Y⟩ = ∞, and perhaps even that P(Y = ∞) > 0, rationally revise her belief in the light of evidence E such that ⟨Y⟩_E, the expectation value of Y under the posterior distribution P(·|E), is finite? In other words, is it possible that the (strong) underdetermination problem vanishes? The answer is yes. The following theorem lists four different sufficient conditions for such a belief change.

[7] This phenomenon is well-known from paradoxes of decision theory, such as the valuation of the St. Petersburg Game, but to our knowledge, this epistemic counterpart has not been explored before.

Theorem 3. Assume that ⟨Y⟩ = ∞ and that P(Y = ∞) < 1. Then any of the following conditions on evidence E with P(E) ≠ 0 is sufficient for ⟨Y⟩_E < ∞.

1. The sequence (k · P(E|Y_k)), k ∈ N ∪ {∞}, is bounded.

2. There are α, β > 0 such that α + β > 2 and such that the sequences (k^α P(E|Y_k)), k ∈ N ∪ {∞}, and (k^β P(Y_k)), k ∈ N, are bounded.

3. Σ_{k=0}^{∞} P(E|Y_k) < ∞, P(E|Y_∞) = 0, and there is an N_0 ∈ N such that (P(Y_k)), k ∈ N, is monotonically decreasing for all k ≥ N_0.

4. P(E|Y_k) → 0 and there is an α > 0 such that

$$\limsup_{k \to \infty} \; k^{2+\alpha} \, \bigl| P(E \mid Y_k) - P(E \mid Y_{k-1}) \bigr| < \infty. \tag{11}$$

These four conditions have different rationales, but all of them constrain the rate of decline of P(E|Y_k) as k increases. That is, the more alternatives there are, the less likely is E. The first and second condition basically amount to P(E|Y_k) ∈ O(1/k^α) for a suitable exponent α > 0. The third condition makes a similar constraint by demanding that Σ_{k=0}^{∞} P(E|Y_k) converges, and the fourth condition controls the differences between the values of P(E|Y_k) for neighboring values of k. In particular, the intuitive condition P(E|Y_k) = 1/k is sufficient for the theorem to hold.

Note that only the second condition makes an assumption about the rate of decline of P(Y_k).
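The first condition of Theorem 3 can be illustrated with a simple pair of distributions. In the sketch below, both the prior and the likelihood are hypothetical choices made for illustration: the prior P(Y = k) is proportional to 1/(k+1)², which has a divergent mean, and the likelihood is P(E|Y_k) = 1/(k+1), a variant of the 1/k condition that is also defined at k = 0 and keeps k · P(E|Y_k) bounded. All sums are truncated at a cut-off n, so the code can only suggest the limiting behaviour.

```python
import math


def prior(k):
    """Hypothetical prior P(Y = k) = 6 / (pi^2 (k+1)^2) for k >= 0; its mean diverges."""
    return 6.0 / (math.pi ** 2 * (k + 1) ** 2)


def likelihood(k):
    """Hypothetical likelihood P(E | Y_k) = 1 / (k + 1), so k * P(E | Y_k) stays below 1."""
    return 1.0 / (k + 1)


def truncated_means(n):
    """Prior and posterior expectation of Y, with all sums truncated at k = n."""
    ks = range(n + 1)
    prior_mean = sum(k * prior(k) for k in ks)
    p_e = sum(prior(k) * likelihood(k) for k in ks)
    post_mean = sum(k * prior(k) * likelihood(k) for k in ks) / p_e
    return prior_mean, post_mean


for n in (10, 10**3, 10**5):
    prior_mean, post_mean = truncated_means(n)
    print(n, round(prior_mean, 3), round(post_mean, 4))
```

As the cut-off grows, the truncated prior mean keeps increasing while the truncated posterior mean settles near a fixed value. For this particular choice the posterior is proportional to 1/(k+1)³, whose mean is finite.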
That only the second condition constrains P(Y_k) is in line with the idea that we have little grip on the rational beliefs about the number of empirically adequate alternatives, whereas we are in a better position to assess how our evidence E is affected by the number of alternatives.

As already stated, the punch line of all four conditions is that P(E|Y_k) converges fast enough to zero. For evidence E that is related to an empirical test of H, this assumption is reasonable: if there are more and more alternatives, why should H, instead of an unconceived alternative (Stanford [2006]), survive empirical tests? Thus, if large values of Y make little difference regarding our trust in the predictions of H, then we will abandon the belief that the expected number of alternatives is infinite. This is exactly what we would expect intuitively.

Second, we ask under which circumstances evidence E lowers the expected number of alternatives if that value is already finite. In answer to this question, we can demonstrate the following theorem:

Theorem 4. Assume P(Y = ∞) = 0. Let Y_k^+ denote the proposition that there are at least k alternatives to theory H, and let Y_k^− denote the proposition that there are at most k − 1 alternatives to H. Then, if P(E|Y_k^+) ≤ P(E|Y_k^−) for all k ∈ N and P(E|Y_k^+) < P(E|Y_k^−) for at least one k > 0, it will also be the case that ⟨Y⟩ > ⟨Y⟩_E.

In other words, if the likelihood of evidence E decreases with the available number of alternatives to H, then the expected number of alternatives will be smaller a posteriori than it was a priori.

The condition of the theorem can be satisfied by empirical as well as non-empirical evidence. For contrastive empirical evidence, that is, data which confirm a particular set of theories and disconfirm others, this is straightforward. And this covers most of everyday data in scientific experiments.
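The inequality in Theorem 4 can be checked on a small finite example. The sketch below is illustrative only: the prior `y` and the likelihoods `lik` are hypothetical numbers, and the condition is checked for k ≥ 1 only, since P(E|Y_k^−) is undefined when Y_k^− has prior probability zero (as it does for k = 0).

```python
# Hypothetical finite-support prior and likelihood; P(Y = infinity) = 0 by construction.
y = [0.10, 0.20, 0.30, 0.25, 0.15]      # y[k] = P(Y = k), k = 0..4
lik = [0.9, 0.7, 0.5, 0.3, 0.1]         # lik[k] = P(E | Y_k), decreasing in k


def cond_likelihood(ks):
    """P(E | Y in ks) for a collection of values ks with positive prior mass."""
    ks = list(ks)
    mass = sum(y[k] for k in ks)
    return sum(y[k] * lik[k] for k in ks) / mass


def theorem4_condition_holds():
    """Check P(E | Y >= k) <= P(E | Y <= k-1) for k = 1..4, strictly at least once."""
    n = len(y)
    pairs = [(cond_likelihood(range(k, n)), cond_likelihood(range(0, k)))
             for k in range(1, n)]
    return (all(up <= down for up, down in pairs)
            and any(up < down for up, down in pairs))


p_e = sum(yk * lk for yk, lk in zip(y, lik))
prior_mean = sum(k * yk for k, yk in enumerate(y))
post_mean = sum(k * yk * lk for k, (yk, lk) in enumerate(zip(y, lik))) / p_e
print(theorem4_condition_holds(), round(prior_mean, 3), round(post_mean, 3))
```

Here the likelihood falls with the number of alternatives, the condition of the theorem is met, and the posterior expectation of Y comes out below the prior expectation.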
But even non-empirical evidence such as F_A := “the scientists have not yet found an alternative to H” may satisfy the conditions of Theorem 4 since such an observation supports H only via an inference about the number of suitable alternatives.

Finally, we also ask the question of whether the belief dynamics of underdetermination are unidirectional or bidirectional. That is, can an agent who believes that the expected number of alternatives is finite (i.e., that ⟨Y⟩ < ∞) come to the belief that ⟨Y⟩_E = ∞ for some evidence E? Interestingly, this is impossible. No evidence is able to overturn the verdict that the expected number of alternatives to H is finite:

Proposition 2. If ⟨Y⟩ < ∞, then for any evidence E (empirical or non-empirical) with P(E) ≠ 0, ⟨Y⟩_E < ∞.

This means that the belief that the expected number of alternatives is finite is not responsive to evidence: once you believe it, you will always believe it, independently of which evidence you receive. This points to an interesting asymmetry: evidence can change the belief that the expected number of alternatives is infinite, but it cannot reverse the belief that the expected number of alternatives is finite. The asymmetry between Theorem 3 and Proposition 2 confirms the suspicion that empirical evidence usually lowers the expected number of alternatives. This finding agrees with the observation that convergence in scientific research occurs more frequently than divergence.

That said, it is still unclear what determines our beliefs on the number of alternatives in the first place. What should our priors for the Y_k look like? This question is hard to answer in general, but we would like to sketch a reasoning procedure, called the Meta-Inductive Argument (MIA), that might help us to get a grip on the distribution of the number of alternative theories. The gist of MIA is best illustrated by a special case.
It is notoriously difficult to find a theory that makes the correct predictions, rather than one that merely accommodates existing data (Kahn et al. [1992]; Hitchcock and Sober [1994]). But remarkably, scientists have often succeeded at identifying such a theory. Now, if there are a lot of alternative solutions to a given problem, then there is no reason to assume that the scientists identified the one theory which will prevail in the future. Thus, repeated predictive success within a particular scientific research program suggests a particular explanation, namely that there are few suitable alternative theories in the given theoretical context. This argument resembles the no miracles argument in the debate about scientific realism.

Now, assume that a novel theory H shows similarities to theories H_1, H_2, etc., in the same scientific research program. The joint feature of these theories may be a certain theoretical approach, a shared assumption, or any other relevant characteristic. Let us assume that a substantial share of the theories to which H is similar have been empirically confirmed. Assume further that for those theories, we have empirically grounded posterior beliefs about the number of alternatives. Then, it seems reasonable to use these posteriors as priors for the number of alternatives to H. After all, H is quite similar to H_1, H_2, etc. In statistics, such a way of grounding “objective” prior beliefs in past experience is referred to as the empirical Bayes method (Carlin and Louis [2000]). If this move is accepted, then one is in a much better position to evaluate the significance of NAA, due to agreement on the prior probabilities of the Y_k.

Admittedly, our account of MIA remains informal and provides at best a partial justification for the practical significance of NAA. On the other hand, formalizing MIA and strengthening the link between both arguments strikes us as a promising route for further research.
Be this as it may, we would like to stress that even without MIA, the validity of NAA is a surprising and substantial philosophical result, and that the degree of confirmation this argument provides can in principle be large.

6 Conclusions

In this paper, we have completed three tasks: (i) we have formalised the No Alternatives Argument and explored under which conditions non-empirical evidence confirms a scientific theory H; (ii) we have studied the problem of theoretical underdetermination from the angle of how beliefs about the number of alternatives to H change in the light of evidence; and (iii) we have sketched the Meta-Inductive Argument for assessing the number of alternatives to H.

We conclude this paper by sketching future research projects. From a normative point of view, a rigorous formalization of MIA would be desirable. Also, we plan to relate the formal argument of this paper more closely to case studies from scientific practice. Here we are particularly interested in the case of string theory and in reasoning strategies employed in fields such as palaeontology and anthropology, where contingent evolutionary details have to be reconstructed based on scarce and highly incomplete evidence. We will explore what role NAA plays in these fields, and how good the argument actually is.

Finally, the scope of NAA needs to be determined more exactly. In particular, we venture an outlook on whether NAA can be applied in philosophy, too. Two potential applications come to mind. First, Inference to the Best Explanation (Lipton [2004]; Douven [2011]) can, to a certain extent, be explicated in terms of NAA. Inasmuch as the notion “best explanation” is understood as “the only genuinely satisfactory explanation”, the fact that no other genuinely satisfactory explanation has been found can play the role of the claim of no alternatives in our argument, supporting the empirical adequacy of the currently best explanation.
The structure of the argument and the formal result would be similar; only the interpretation would change from “empirically adequate” to “best explanation”. We conjecture that the validity of IBE can sometimes be analyzed in terms of an NAA.

Second, one may ask whether NAA could also play a role in confirming general philosophical theories. The reputation of a philosophical theory is often based on the understanding that no other consistent answer has been found, or that none is perhaps even conceivable. Can reasoning of this kind be supported by NAA? In principle, the answer to this question is yes, but there is a problem: philosophical theories do not have a record of empirical testing. Hence, we will be unable to quantify the significance of NAA with empirical data. Philosophy thus provides us with a neat example of the promises and limits of non-empirical theory confirmation beyond scientific contexts.

Funding

Austrian Research Fund (FWF) (P22811-G17 to R.D.); Netherlands Organisation for Scientific Research (NWO) (016.104.079 to J.S.).

Acknowledgements

We are indebted to audiences in Brisbane, Canberra, Copenhagen, Konstanz, Lund, Munich, Nancy, Singapore, Stockholm, Tilburg and Vienna for useful discussions and feedback. We would also like to thank Simon Friedrich, two anonymous referees of this journal and in particular Frederik Herzberg, who commented on this paper at FEW 2012, for providing us with critical remarks and useful suggestions for improving the manuscript.

A Proof of the No Alternative Theorems

Proof of Theorem 1: $F_A$ confirms T if and only if $P(T \mid F_A) - P(T) > 0$, that is, if and only if $\Delta := P(T, F_A) - P(T)\,P(F_A) > 0$.
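We now apply the theory of Bayesian Networks to the structure depicted in Figure 2, using assumptions A1 ($T \perp\!\!\!\perp F_A \mid Y$) and A2 ($D \perp\!\!\!\perp Y$):

\[
\begin{aligned}
P(F_A) &= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} P(F_A \mid Y_i, D_j)\,P(Y_i, D_j) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, f_{ij},\\
P(T) &= \sum_{k=0}^{\infty} P(T \mid Y_k)\,P(Y_k) = \sum_{k=0}^{\infty} t_k\, y_k,\\
P(T, F_A) &= \sum_{i=0}^{\infty} P(F_A, T \mid Y_i)\,P(Y_i) = \sum_{i=0}^{\infty} y_i\, P(F_A \mid Y_i)\,P(T \mid Y_i)\\
&= \sum_{i=0}^{\infty} y_i\, t_i \Bigl(\sum_{j=0}^{\infty} P(F_A \mid Y_i, D_j)\,P(D_j \mid Y_i)\Bigr) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, t_i\, f_{ij}.
\end{aligned}
\]

Hence, we obtain, using $\sum_{k \in \mathbb{N}} y_k = 1$,

\[
\begin{aligned}
\Delta &= \Bigl(\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, t_i\, f_{ij}\Bigr) - \Bigl(\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, f_{ij}\Bigr)\Bigl(\sum_{k=0}^{\infty} y_k\, t_k\Bigr)\\
&= \Bigl(\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, t_i\, f_{ij}\Bigr)\Bigl(\sum_{k=0}^{\infty} y_k\Bigr) - \Bigl(\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, f_{ij}\Bigr)\Bigl(\sum_{k=0}^{\infty} t_k\, y_k\Bigr)\\
&= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \bigl(d_j\, y_i\, y_k\, t_i\, f_{ij} - d_j\, y_i\, y_k\, t_k\, f_{ij}\bigr)\\
&= \sum_{j=0}^{\infty} d_j \sum_{i=0}^{\infty} \sum_{\substack{k=0\\ k \neq i}}^{\infty} y_i\, y_k\, f_{ij}\,(t_i - t_k)\\
&= \sum_{j=0}^{\infty} d_j \sum_{i=0}^{\infty} \sum_{k > i} \bigl(y_i\, y_k\, f_{ij}\,(t_i - t_k) + y_k\, y_i\, f_{kj}\,(t_k - t_i)\bigr)\\
&= \sum_{j=0}^{\infty} d_j \sum_{i=0}^{\infty} \frac{1}{2} \sum_{\substack{k=0\\ k \neq i}}^{\infty} y_i\, y_k\, \bigl(f_{ij}\,(t_i - t_k) + f_{kj}\,(t_k - t_i)\bigr)\\
&= \frac{1}{2} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \sum_{\substack{k=0\\ k \neq i}}^{\infty} d_j\, y_i\, y_k\,(t_i - t_k)\,(f_{ij} - f_{kj}) > 0
\end{aligned}
\]

because of A3-A5 taken together: A3 entails that the difference $(f_{ij} - f_{kj})$ is non-negative for $i < k$, and A4 does the same for $(t_i - t_k)$. For $i > k$ both differences change sign, so their product remains non-negative. A5 entails that these differences are strictly positive for at least one pair $(i, k)$. Hence, the entire sum is strictly positive.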
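The identity at the heart of this proof can be checked numerically on a small finite case. The following is a minimal sketch, not part of the original proof: it reads the symbols as $y_i = P(Y_i)$, $d_j = P(D_j)$, $t_i = P(T \mid Y_i)$ and $f_{ij} = P(F_A \mid Y_i, D_j)$, truncates Y to {0, 1, 2} and D to two values, and uses made-up parameter values chosen so that $t_i$ and $f_{ij}$ decrease in i. The function names are hypothetical.

```python
from fractions import Fraction as F

def delta_direct(y, d, t, f):
    """Delta = P(T, F_A) - P(T) P(F_A), computed from the three sums."""
    n, m = len(y), len(d)
    p_fa = sum(d[j] * y[i] * f[i][j] for i in range(n) for j in range(m))
    p_t = sum(t[k] * y[k] for k in range(n))
    p_t_fa = sum(d[j] * y[i] * t[i] * f[i][j] for i in range(n) for j in range(m))
    return p_t_fa - p_t * p_fa

def delta_symmetrised(y, d, t, f):
    """Last line of the derivation: half the sum over all pairs i != k."""
    n, m = len(y), len(d)
    total = F(0)
    for i in range(n):
        for k in range(n):
            if k == i:
                continue
            for j in range(m):
                total += d[j] * y[i] * y[k] * (t[i] - t[k]) * (f[i][j] - f[k][j])
    return total / 2

# Illustrative (assumed) parameters; rows of f are indexed by i, columns by j.
y = [F(1, 2), F(1, 4), F(1, 4)]   # prior over the number of alternatives
d = [F(1, 2), F(1, 2)]            # prior over the difficulty of the problem
t = [F(1), F(1, 2), F(1, 3)]      # decreasing in i
f = [[F(1), F(1)],
     [F(1, 2), F(3, 4)],
     [F(1, 4), F(1, 2)]]          # decreasing in i, increasing in j

direct = delta_direct(y, d, t, f)
symmetrised = delta_symmetrised(y, d, t, f)
```

For these values both routes give $\Delta = 5/64 > 0$, with $P(F_A) = 3/4$, $P(T) = 17/24$ and $P(T, F_A) = 39/64$. Exact fractions are used so that the two expressions can be compared for equality rather than up to rounding.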
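Proof of Theorem 2: We perform essentially the same calculations as in the proof of Theorem 1 and additionally include the possibility $\{Y = \infty\}$.8 Defining $f_{\infty j} := P(F_A \mid D_j, Y_\infty)$ leads us to the equalities

\[
\begin{aligned}
P(F_A) &= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, f_{ij} + \sum_{j=0}^{\infty} d_j\, y_\infty\, f_{\infty j},\\
P(T) &= \sum_{k=0}^{\infty} t_k\, y_k + t_\infty\, y_\infty,\\
P(F_A, T) &= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, y_i\, t_i\, f_{ij} + \sum_{j=0}^{\infty} d_j\, t_\infty\, y_\infty\, f_{\infty j},
\end{aligned}
\]

from which it follows, using $\lim_{K \to \infty} \sum_{k=0}^{K} y_k = 1 - y_\infty$, that

\[
\begin{aligned}
P(F_A)\,P(T) &= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_k\, y_i\, y_k\, f_{ij} + \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, t_\infty\, y_i\, y_\infty\, f_{ij}\\
&\quad + \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_k\, y_k\, y_\infty\, f_{\infty j} + \sum_{j=0}^{\infty} d_j\, t_\infty\, y_\infty^2\, f_{\infty j},\\
P(F_A, T) &= \frac{1}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_i\, y_i\, y_k\, f_{ij} + \sum_{j=0}^{\infty} d_j\, t_\infty\, y_\infty\, f_{\infty j}.
\end{aligned}
\]

With the definition

\[
\Delta := \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_i\, y_i\, y_k\, f_{ij} - \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_k\, y_i\, y_k\, f_{ij}
\]

we observe that $\Delta > 0$, as shown above in the proof of Theorem 1 (the parameter values satisfy the relevant conditions A3-A5). It then follows that

\[
\begin{aligned}
P(F_A, T) - P(T)\,P(F_A) &= \Delta + \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_i\, y_i\, y_k\, f_{ij} + \sum_{j=0}^{\infty} d_j\, t_\infty\, y_\infty\,(1 - y_\infty)\, f_{\infty j}\\
&\quad - \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, t_\infty\, y_i\, y_\infty\, f_{ij} - \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} d_j\, t_i\, y_i\, y_\infty\, f_{\infty j}\\
&= \Delta + \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_i\, y_i\, y_k\, f_{ij} + \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_\infty\, y_i\, y_k\, f_{\infty j}\\
&\quad - \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_\infty\, y_i\, y_k\, f_{ij} - \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, t_i\, y_i\, y_k\, f_{\infty j}\\
&= \Delta + \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, y_i\, y_k\,\bigl(t_i f_{ij} + t_\infty f_{\infty j} - t_\infty f_{ij} - t_i f_{\infty j}\bigr)\\
&= \Delta + \frac{y_\infty}{1 - y_\infty} \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} d_j\, y_i\, y_k\,(t_i - t_\infty)\,(f_{ij} - f_{\infty j}) > 0
\end{aligned}
\]

since the extensions of A3 and A4 imply $f_{ij} \geq f_{\infty j}$ and $t_i \geq t_\infty$ (equations (7) and (8)), independent of the values of i and j.

8 The notation suggests that $\infty$ is already included in the summation index, but the infinity sign on top of the sum is just the shortcut for the limit of the sequence of all natural numbers. Thus, the case $Y = \infty$ has to be treated separately.

B Proof of the results in Section 5

Proof of Proposition 1: The proof proceeds by construction. For instance, let $P(Y \leq N) = 1 - \varepsilon$, let $P(Y_k) = C/k^2$ for all $k > N$, and choose C such that $\sum_{k > N} P(Y_k) = \varepsilon$ is satisfied. (The series $\sum_k 1/k^2$ converges.)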
Then, it is easy to check that

\[
\langle Y \rangle \geq \sum_{k=N+1}^{\infty} k\, P(Y_k) = C \sum_{k=N+1}^{\infty} \frac{1}{k} = \infty.
\]

Proof of Theorem 3: Note first that, since $P(E) \neq 0$,

\[
\begin{aligned}
\langle Y \rangle_E &= \sum_{k \in \mathbb{N} \cup \{\infty\}} k\, P(Y_k \mid E) = \frac{1}{P(E)} \sum_{k \in \mathbb{N} \cup \{\infty\}} k\, P(Y_k)\, P(E \mid Y_k)\\
&= \frac{1}{P(E)} \Bigl(\sum_{k=1}^{\infty} k\, P(Y_k)\, P(E \mid Y_k) + \lim_{K \to \infty} K\, P(Y_\infty)\, P(E \mid Y_\infty)\Bigr).
\end{aligned}
\]

All four conditions of the theorem require, either explicitly or implicitly, that $P(E \mid Y_\infty)$ be equal to zero. Therefore, the second summand above vanishes and we can focus our analysis of $\langle Y \rangle_E$ on $k < \infty$.

We begin by proving the sufficiency of the first condition. Assume that the sequence $(k \cdot P(E \mid Y_k))_{k \in \mathbb{N}}$ is bounded, that is, there is a $B > 0$ such that $k \cdot P(E \mid Y_k) < B$ for all k. Then it will be the case that

\[
\langle Y \rangle_E = \frac{1}{P(E)} \sum_{k=1}^{\infty} k\, P(Y_k)\, P(E \mid Y_k) \leq B \cdot \frac{1}{P(E)} \sum_{k=1}^{\infty} P(Y_k) < \infty,
\]

proving the sufficiency of the first condition.

Related to this is the case that $k^\alpha \cdot P(E \mid Y_k) \leq A_\alpha$ and $k^\beta \cdot P(Y_k) \leq A_\beta$ for all $k \in \mathbb{N}$ and some constants $A_\alpha, A_\beta > 0$, with the additional constraints $\alpha, \beta > 0$ and $\alpha + \beta > 2$. Then we have

\[
\langle Y \rangle_E = \frac{1}{P(E)} \sum_{k=1}^{\infty} k^{1 - \alpha - \beta}\, \bigl(k^\alpha P(E \mid Y_k)\bigr)\, \bigl(k^\beta P(Y_k)\bigr) \leq \frac{1}{P(E)}\, A_\alpha A_\beta \sum_{k=1}^{\infty} k^{1 - (\alpha + \beta)} < \infty
\]

because by assumption, $1 - (\alpha + \beta) < -1$, ensuring the convergence of the series.

In the remainder of the proof we will focus on the properties of the series

\[
\sum_{k=1}^{\infty} k\, P(Y_k)\, P(E \mid Y_k) \tag{12}
\]

which is sufficient for examining the convergence properties of $\langle Y \rangle_E$.

We now proceed to proving the sufficiency of the third condition. We assume that $\sum_{k=1}^{\infty} P(E \mid Y_k) < \infty$ and that there is an $N_0 \in \mathbb{N}$ such that $P(Y_k) \geq P(Y_{k+1})$ for all $k \geq N_0$. By Dirichlet’s criterion (Knopp [1964], p. 324), $\sum_{k=1}^{\infty} k\, P(Y_k)\, P(E \mid Y_k)$ converges if (i) $\sum_{k=1}^{\infty} P(E \mid Y_k) < \infty$ and (ii) $k\, P(Y_k) \to 0$ monotonically. The first condition is fulfilled by assumption. The second clause of the criterion can, without loss of generality, be replaced by demanding that for some $N_0 \in \mathbb{N}$, $(k\, P(Y_k))_{k \in \mathbb{N}}$ be monotonically decreasing for all $k \geq N_0$. Assume that the second clause of the criterion is not satisfied, and that there is a sequence of natural numbers $n_k$ such that $n_k\, P(Y_{n_k}) < n_{k+1}\, P(Y_{n_{k+1}})$.
(13)

Then the (sub)sequence $(n_k\, P(Y_{n_k}))_k$ would not converge to zero, and consequently, $(k\, P(Y_k))_k$ would not converge to zero. However, for all $k \geq N_0$, $P(Y_k)$ is by assumption a monotonically decreasing sequence. Furthermore, for such sequences, if $\sum_k P(Y_k)$ converges (which is the case here), then also $k\, P(Y_k) \to 0$ (Knopp [1964], p. 125). Hence, a subsequence $(n_k\, P(Y_{n_k}))_k$ with property (13) cannot exist and the second part of the Dirichlet criterion is satisfied. Thus, the third condition of Theorem 3 is indeed sufficient.

Finally, we demonstrate the joint sufficiency of (i) $P(E \mid Y_k) \to 0$ and (ii) there is an $\alpha > 0$ such that

\[
\limsup_{k \to \infty}\; k^{2 + \alpha}\, |P(E \mid Y_k) - P(E \mid Y_{k-1})| < \infty.
\]

In particular, there exists a $C > 0$ such that

\[
k^{2 + \alpha}\, |P(E \mid Y_k) - P(E \mid Y_{k-1})| \leq C \quad \forall k \in \mathbb{N}.
\]

Moreover, let $C' := 2C \sum_{k=1}^{\infty} 1/k^{1 + \alpha}$. By Abel’s formula (Knopp [1964], p. 322), we can rewrite the partial sums of the series $\sum_{k=1}^{\infty} k\, P(Y_k)\, P(E \mid Y_k)$ in the following way:

\[
\sum_{k=1}^{N} k\, P(Y_k)\, P(E \mid Y_k) = \sum_{k=1}^{N} \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) \bigl(P(E \mid Y_k) - P(E \mid Y_{k+1})\bigr) + \Bigl(\sum_{j=1}^{N} j\, P(Y_j)\Bigr) P(E \mid Y_{N+1}).
\]

Note that the re-ordering of the terms does not affect the convergence properties since (12) has only non-negative terms. It is now sufficient to show that both summands on the right side are uniformly bounded in N since this would mean that (12) has bounded partial sums and is thus convergent.

We begin by showing that the first summand is uniformly bounded:

\[
\begin{aligned}
\Bigl|\sum_{k=1}^{N} \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) \bigl(P(E \mid Y_k) - P(E \mid Y_{k+1})\bigr)\Bigr| &\leq \sum_{k=1}^{N} \Bigl(\sum_{j=1}^{k} \frac{j}{k}\, P(Y_j)\Bigr) \frac{1}{k^{1 + \alpha}}\, k^{2 + \alpha}\, |P(E \mid Y_k) - P(E \mid Y_{k+1})|\\
&\leq C \sum_{k=1}^{N} \Bigl(\sum_{j=1}^{k} P(Y_j)\Bigr) \frac{1}{k^{1 + \alpha}} \leq C \sum_{k=1}^{\infty} \frac{1}{k^{1 + \alpha}} \leq C',
\end{aligned}
\]

and the resulting bound is independent of N. For the second term, because of $P(E \mid Y_k) \to 0$, there is, for any $k \in \mathbb{N}$, an $N_0(k)$ such that

\[
\Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) P(E \mid Y_{N_0(k)}) \leq C'/2. \tag{14}
\]

Then we can calculate

\[
\Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) P(E \mid Y_{k+1}) \leq \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) |P(E \mid Y_k) - P(E \mid Y_{k+1})| + \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) P(E \mid Y_{k+1}) \leq \dots
\]
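\[
\begin{aligned}
\dots &\leq \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) \Bigl(\sum_{l=k}^{N_0(k) - 1} |P(E \mid Y_l) - P(E \mid Y_{l+1})|\Bigr) + \Bigl(\sum_{j=1}^{k} j\, P(Y_j)\Bigr) P(E \mid Y_{N_0(k)})\\
&\leq \Bigl(\sum_{j=1}^{k} \frac{j}{k}\, P(Y_j)\Bigr) \Bigl(\sum_{l=k}^{N_0(k) - 1} \frac{k}{l^{2 + \alpha}}\; l^{2 + \alpha}\, |P(E \mid Y_l) - P(E \mid Y_{l+1})|\Bigr) + C'/2\\
&\leq \Bigl(\sum_{j=1}^{k} P(Y_j)\Bigr) \Bigl(\sum_{l=k}^{N_0(k) - 1} \frac{C}{l^{1 + \alpha}}\Bigr) + C'/2\\
&\leq C \Bigl(\sum_{l=1}^{\infty} \frac{1}{l^{1 + \alpha}}\Bigr) + C'/2 \leq C',
\end{aligned}
\]

proving the uniform boundedness of the second summand and thereby the sufficiency of the fourth and last condition for $\langle Y \rangle_E < \infty$.

Proof of Theorem 4: Let us define

\[
Y_k^+ := \{Y \geq k\}, \qquad Y_k^- := \{Y < k\}.
\]

We have assumed that $P(E \mid Y_k^+) \leq P(E \mid Y_k^-)$ for all $k \in \mathbb{N}$, with strict inequality for at least one $k > 0$. Since $Y_k^+$ and $Y_k^-$ form an exhaustive partition of the probability space, this entails that $Y_k^+$ and E are negatively relevant to each other, and that

\[
P(Y_k^+ \mid E) \leq P(Y_k^+) \quad \forall k \in \mathbb{N}, \tag{15}
\]

with strict inequality for at least one $k > 0$. Since $P(Y_k) = P(Y_k^+) - P(Y_{k+1}^+)$, we obtain by a simple diagonalization trick

\[
\begin{aligned}
\langle Y \rangle &= \sum_{k=0}^{\infty} k\, P(Y_k) = \sum_{k=0}^{\infty} \bigl(k\, P(Y_k^+) - k\, P(Y_{k+1}^+)\bigr)\\
&= 0 \cdot P(Y_0^+) + \sum_{k=1}^{\infty} \bigl(k\, P(Y_k^+) - (k - 1)\, P(Y_k^+)\bigr) = \sum_{k=1}^{\infty} P(Y_k^+),
\end{aligned} \tag{16}
\]

and similarly

\[
\langle Y \rangle_E = \sum_{k=1}^{\infty} P(Y_k^+ \mid E). \tag{17}
\]

Combining (16) and (17), we conclude

\[
\langle Y \rangle_E = \sum_{k=1}^{\infty} P(Y_k^+ \mid E) < \sum_{k=1}^{\infty} P(Y_k^+) = \langle Y \rangle
\]

because $P(Y_k^+ \mid E) \leq P(Y_k^+)$ for all $k \in \mathbb{N}$ (see (15)), and because we have assumed strict inequality for at least one $k > 0$.

Proof of Proposition 2: By a straightforward application of Bayes’ Theorem, and since $P(E \mid Y_k) \leq 1$:

\[
\langle Y \rangle_E = \sum_{k=0}^{\infty} k\, P(Y_k \mid E) = \frac{1}{P(E)} \sum_{k=0}^{\infty} k\, P(Y_k)\, P(E \mid Y_k) \leq \frac{1}{P(E)} \sum_{k=0}^{\infty} k\, P(Y_k) = \frac{1}{P(E)}\, \langle Y \rangle < \infty.
\]

Richard Dawid, Department of Philosophy and Institute Vienna Circle, University of Vienna, Universitätsstr. 7, 1010 Vienna, Austria, http://homepage.univie.ac.at/richard.dawid/, richard.dawid@univie.ac.at

Stephan Hartmann, Munich Center for Mathematical Philosophy, Ludwig-Maximilians-Universität München, Ludwigstr.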
31, 80539 München, Germany, http://www.stephanhartmann.org, Stephan.Hartmann@lrz.uni-muenchen.de

Jan Sprenger, Tilburg Center for Logic and Philosophy of Science, Tilburg University, PO Box 90153, 5000 LE Tilburg, The Netherlands, http://www.laeuferpaar.de, j.sprenger@uvt.nl

References

Bovens, L., and Hartmann, S. [2003]: Bayesian Epistemology, Oxford: Oxford University Press.

Carlin, B., and Louis, T. [2000]: Bayes and Empirical Bayes Methods for Data Analysis, London: Chapman & Hall.

Cushing, J. [1994]: Quantum Mechanics: Historical Contingency and the Copenhagen Hegemony, Chicago: The University of Chicago Press.

Dawid, R. [2006]: ‘Underdetermination and Theory Succession from the Perspective of String Theory’, Philosophy of Science, 73, pp. 298–322.

Dawid, R. [2009]: ‘On the Conflicting Assessments of the Current Status of String Theory’, Philosophy of Science, 76, pp. 984–96.

Douven, I. [2011]: ‘Abduction’, in E. Zalta (ed.), The Stanford Encyclopedia of Philosophy (Spring 2011 Edition), <plato.stanford.edu>.

Fraassen, B. van [1980]: The Scientific Image, Oxford: Oxford University Press.

Hahn, U., and Oaksford, M. [2007]: ‘The Rationality of Informal Argumentation: A Bayesian Approach to Reasoning Fallacies’, Psychological Review, 114, pp. 704–32.

Hájek, A., and Hartmann, S. [2010]: ‘Bayesian Epistemology’, in J. Dancy et al. (eds), 2010, A Companion to Epistemology, Oxford: Blackwell, pp. 93–106.

Hartmann, S., and Sprenger, J. [2010]: ‘Bayesian Epistemology’, in S. Bernecker and D. Pritchard (eds), 2010, Routledge Companion to Epistemology, London: Routledge, pp. 609–20.

Hitchcock, C., and Sober, E. [2004]: ‘Prediction Versus Accommodation and the Risk of Overfitting’, British Journal for the Philosophy of Science, 55, pp. 1–34.

Howson, C., and Urbach, P. [2006]: Scientific Reasoning: The Bayesian Approach, third edition, La Salle: Open Court.

Kahn, J., Landsburg, S., and Stockman, A.
[1992]: ‘On Novel Confirmation’, British Journal for the Philosophy of Science, 43, pp. 503–16.

Knopp, K. [1964]: Theorie und Anwendung der unendlichen Reihen, Berlin: Springer.

Lipton, P. [2004]: Inference to the Best Explanation, second edition, London: Routledge.

Sober, E. [2009]: ‘Absence of Evidence and Evidence of Absence: Evidential Transitivity in Connection with Fossils, Fishing, Fine Tuning and Firing Squads’, Philosophical Studies, 143, pp. 63–90.

Stanford, K. [2006]: Exceeding Our Grasp: Science, History, and the Problem of Unconceived Alternatives, New York: Oxford University Press.

Walton, D. [1995]: Arguments from Ignorance, Philadelphia: Penn State University Press.