ELIMINATIVE ABDUCTION: EXAMPLES FROM MEDICINE

Forthcoming in Studies in History and Philosophy of Science

Alexander Bird

Abstract

Peter Lipton argues that inference to the best explanation (IBE) involves the selection of a hypothesis on the basis of its loveliness. I argue that in optimal cases of IBE we may be able to eliminate all but one of the hypotheses. In such cases a form of eliminative induction takes place, which I call ‘Holmesian inference’. I argue that Lipton’s example in which Ignaz Semmelweis identified a cause of puerperal fever better illustrates Holmesian inference than Liptonian IBE. I consider in detail the conditions under which Holmesian inference is possible and conclude by considering the epistemological relations between Holmesian inference and Liptonian IBE.

keywords: inference to the best explanation; Peter Lipton; abduction; Holmesian inference; eliminative induction.

1 Introduction

Many, probably most, scientific realists believe that inference to the best explanation (IBE), broadly construed, is at the heart of science. There is a plethora of techniques, methods, rules of thumb, heuristics and so forth that are used to generate scientific knowledge and which do not fit the IBE mould. Nonetheless, many of our most interesting theoretical discoveries have been made with the application of IBE, including our discoveries concerning unobservable entities and processes. In the light of this, it is an extraordinary achievement that Peter Lipton has given us the authoritative account of what IBE is and how it contributes to theory choice and confirmation. Given the capacity of philosophers for disagreement and for generating theories, one might have thought that Lipton would have had a number of rivals concerning this absolutely crucial topic. But that just does not seem to be the case. Lipton’s Inference to the Best Explanation seems truly to be a Kuhnian paradigm in the philosophy of science.
For those working on inference to the best explanation, it is the text that sets our agenda, that lays out the problems we must contend with, and which, in many dimensions, is an exemplar of how we should carry out our work. In this paper I pursue some normal (philosophy of) science in the Liptonian tradition. While not seeking any revolutionary change to that paradigm, I do want to suggest that there is an important respect in which Lipton’s picture of IBE needs supplementing.

I will start by outlining Lipton’s conception of IBE. I’ll then mention an anomaly that arises in his discussion of his central illustrative case, that of Ignaz Semmelweis and puerperal fever. Outlining the relevant features of that case, and of another case, the discovery of the cause of AIDS, will give an indication of why that anomaly arises, and what supplement to Liptonian IBE is thereby required. That supplement states that in some cases of IBE our evidence permits us to select just one potential explanation as the explanation, because it is the only potential explanation consistent with the evidence. This I call Holmesian Inference.

2 Inference to the best explanation

Inference to the best explanation is about choosing among explanations. It is a matter of choosing among potential explanations of some phenomenon the one that is the best by certain criteria. If there is a suitable best explanation, IBE says that we may infer that it is the actual explanation. If some hypothesis provides the actual explanation of a phenomenon, then that hypothesis is true.

How do we choose among potential explanations? According to Lipton, IBE is a two-stage process, where both stages are filters of potential explanations (?: 56–64):

Stage 1: The first stage filters out the implausible explanations. The imaginative capacity of scientists generates all the plausible potential explanations and just leaves the remainder unconsidered.
Stage 2: At the second stage, scientists investigate the live potential explanations that have passed through the first filter, and ultimately rank them according to their explanatory goodness, in order to select the top ranking explanation as the explanation.

Lipton explains that explanatory goodness, what he calls ‘loveliness’, must be distinguished from likeliness, since the aim of IBE is to guide our estimates of likeliness on the basis of loveliness. In Lipton’s view loveliness is a matter of potential understanding: a lovely explanation is one that would give us a high degree of understanding of the relevant phenomena were it to be true (and, I would add, known to be true).

Two qualifications need to be made concerning the second stage:

(Q1) For the best explanation to be inferred it must be significantly better than its nearest rival. If two competing explanations are both good enough, and one is slightly better than the other, our faith in that slightly better one must be slim. While Lipton does not mention this, it is a clear corollary of his account.

(Q2) For the best explanation to be inferred it should normally, considered on its own, be a sufficiently good explanation of enough evidence. If our best explanation is a weak explanation even of a large quantity of data (?: 63, 154), or explains only a limited amount of evidence well, then that is some reason to doubt that it is the actual explanation.

(Later I shall consider amendments to both qualifications.)

Both stages in IBE raise important philosophical questions. A crucial question concerns the first stage. Since it filters out so many logically possible explanations, what confidence can we have that the actual explanation is allowed through? Why should the imagination of scientists have the capacity to pick out the true explanation among those it creates? The problem here is one that Lipton (?: 152) calls ‘Underconsideration’.
The stage 2 ranking is no good at all if the actual explanation hasn’t made it through stage 1 on account of the scientists’ failure to think of it.

Assuming that the actual explanation is among those investigated at stage 2, two problems immediately raise their heads, which Lipton calls ‘Hungerford’s objection’ and ‘Voltaire’s objection’. The former borrows Margaret Hungerford’s line in Molly Bawn, that beauty is in the eye of the beholder, to raise the worry that the loveliness of explanations may be too subjective to have any relationship to the truth. Voltaire’s objection suggests that the IBE enthusiast has an unjustified Panglossian faith that the actual world is the loveliest of all possible worlds. Even if loveliness is objective, there will be many worlds where it does not correlate with truth. So why think that truth and loveliness correlate in ours?

In passing I shall mention a hypothesis formulated by David ?), that all these problems have a Kuhnian answer. The fundamental idea is that our standards of goodness are set by Kuhnian exemplars. It is similarity, in the relevant respect, to the paradigms of good science that govern the field in question, that makes for goodness of explanation. That answers Hungerford’s problem. Note that the exemplars are themselves selected on grounds that extend beyond loveliness alone. It is empirical success in solving scientific puzzles where other paradigms have failed that is the principal driver behind the selection of new paradigms. Despite the problems of incommensurability, the development of science is progressive, it is a history of increasing puzzle-solving power. An answer to Voltaire’s objection can build on this, albeit in a non-Kuhnian way. Let’s say for sake of argument that an exemplar has not only puzzle-solving power but also a high truth-content. Then one might expect puzzle-solutions modelled on that exemplar to have at least a better than random chance of latching on to the truth also.
The standards of similarity, the qualities that make for explanatory goodness, will then be truth-tropic, even if they are not fully general and sempiternal. Such standards may be local to a particular field at a particular stage in its development, but that does not prevent them from being truth-friendly in their locality. Of course, this depends on starting with an exemplar with high truth-content. But that’s not a problem for two reasons. First, the problem was to show that truth and goodness could be correlated, not that they must be. This answer shows how they can be without the world being in any way special. Voltaire’s problem is that we set our standards of loveliness first, and then expect the world to live up to them. This exemplar-based response says that the world itself can play a part in setting the appropriate standards. Secondly, the fact that empirical data and often the puzzles themselves are generated by the world means that as long as there is a genuine puzzle-solving tradition in place, it will have a component that favours truth over falsity; it would not be a surprise that well-established puzzle-solving traditions have exemplars that have high truth-content.

According to this view, explanatory goodness resides in something like Kuhn’s five values, values whose application is determined by exemplars. This differs from Lipton’s conception of loveliness as potential understanding. I don’t intend to adjudicate between these views of explanatory goodness, since I shall argue that in some cases at least we do not need any explanatory goodness at all. That is because, in some cases, inference to the best explanation is inference to the only explanation, the problem of Underconsideration notwithstanding.

3 The Semmelweis case (again)

I’ll now move on to the first medical case I wish to discuss, the well-known history of Ignaz Semmelweis and puerperal fever.
This case is central to Lipton’s defence of IBE, having previously been discussed by Hempel and by others. The heuristic advantage is clear: by comparing different accounts of inference and confirmation against a common case, their relative merits can more easily be judged.

There is in Lipton’s discussion what seems to me to be an anomaly. Since this is his most detailed case study of IBE, in which various hypotheses are considered that might explain a phenomenon, from which one is selected as being the explanation, one might expect some discussion of why the selected explanation is lovelier than the others. We should be told what lovely-making features this explanation has that its rivals lack or possess in lesser degree. But in fact Lipton does not present us with such a discussion. And this suggests to me that the application of IBE in this case does not depend on loveliness or goodness at all.

The Semmelweis case is well-known, and so I shall not spend too much time on the principal facts. In 1844 Ignaz Semmelweis graduated from the Vienna Medical School and decided to study obstetrics. He was appointed assistant to the professor of obstetrics, Johann Klein, first in 1846 and then again in 1847. Klein was responsible for one of the two labour wards at the Allgemeine Krankenhaus, the General Hospital in Vienna. Many poorer women came to the hospital to give birth and of these women a large proportion, up to one sixth in some years, contracted puerperal (or childbed) fever, which was almost always fatal. It was widely known that the death rates were considerably higher in Klein’s ward, Division I, than in the other ward, Division II, run by Professor Franz Xavier Bartsch. Semmelweis sought some feature of Division I that would explain its high rate of mortality. These are the principal hypotheses he considered initially:

(S1) Overcrowding in Division I.
(S2) Epidemic influences and climate.
(S3) Rough examinations by the medical students in Division I.
(S4) Psychological effect of the priest passing through the ward on his way to deliver extreme unction to dying women.
(S5) Women in Division I delivered on their backs.

In assessing these explanations, we must be careful in deciding what the explanandum is. The explanandum could be, among others:

(A) The existence of puerperal fever in Division I (and by extension the existence of puerperal fever elsewhere).
(B) The greater prevalence of puerperal fever in Division I.

The character of the inference is very different depending on which the explanandum is taken to be. In my view it is important to focus on explanandum (B), the difference in rates of puerperal fever and consequent mortality between the wards.¹ The explananda are connected: the principal explanations of the existence of puerperal fever in general might supply explanations of the difference between the two wards; and conversely a successful explanation of that difference might well provide insight into the cause of puerperal fever in general. But these are further inferences, and fraught ones at that, as I shall mention.

¹ Our explananda concern the existence and rates of puerperal fever, but the data concern rates of mortality from puerperal fever. The mortality rates are good proxies for the morbidity rates since the disease was almost always fatal.

In the light of this we should consider the explanations (S1)–(S5) as shorthand for explanations of the form ‘X is the cause of the positive difference between the rates of Division I and Division II’, e.g. (S1) should be understood as asserting that overcrowding in Division I is the cause of the greater mortality rate in Division I when compared to Division II.

?: 65–7, 69) noted that hypotheses (S1) and (S2) refer to features that were common to both Division I and Division II.
Indeed, because of the desire of expectant mothers to be admitted to Division II rather than Division I, the former was even more crowded than the latter.² Lipton remarks, however, that the similarity between the wards is nonetheless consistent with one or other of those hypotheses being true. Since no-one thought such factors to be sufficient for puerperal fever, those who maintained such hypotheses would think that they are only part of the explanation as to why any particular woman contracted the fever; a full explanation would refer to other factors as well, such as general state of health. Note, though, that this point holds only if the explanandum is (A) rather than (B). But as we have seen and will continue to see, Semmelweis’s principal evidence concerns the differences between the two wards. Since Division II had puerperal fever, which could also affect women giving birth at home, Semmelweis was not in a position to directly infer the cause of puerperal fever tout court. Naturally, he was indeed interested in the cause of puerperal fever, as the title of his book (?) on the subject demonstrates. But, as we shall see, the inference from an explanation of (B) to an explanation of (A) makes difficulties for Semmelweis.

According to Lipton’s view of contrastive explanation, to explain the difference between the two wards, we must seek a feature in the history of Division I that is absent from the history of Division II. But hypotheses (S1) and (S2) do not identify such a difference (ignoring the lesser degree of crowding in Division I). Therefore they simply cannot be explanations of (B). Those hypotheses, construed as potential explanations of the difference between the two wards, are simply inconsistent with the evidence.
The same goes for a number of other potential causal factors in a case of puerperal fever that are not mentioned in the list above: inadequate ventilation, excess blood in the circulation, stagnant circulation, disturbances caused by pregnant uterus, decrease in weight caused by emptying of the uterus, protracted labour, wounding of the inner surface of the uterus in delivery, imperfect contractions, faulty involutions of the uterus during maternity, the volume of the secreted milk, and death of the foetus (?: 47).

² In which case, one might ask, why would (S1) even have been raised? The principal answer is that while women were admitted to the two wards on alternating days Sunday through Friday, from Friday to Sunday afternoon, women were admitted to Division I. Furthermore, Division II was instituted in order to relieve overcrowding in Division I. So historically there had been a problem of overcrowding in Division I, until the difference in mortality became widely known. An additional reason is that overcrowding was a widespread problem in European hospitals, with several patients sharing a single bed being a common occurrence. In Vienna, however, one patient per bed was the rule. Nonetheless, the relationship between overcrowding and puerperal fever was a natural one for doctors to consider.

Hypotheses (S3)–(S5) do mark a difference between the wards, at least at the beginning of Semmelweis’s investigations. (S3), though, was hardly a difference. For as Semmelweis pointed out, the roughness of the handling by the students was negligible compared to the trauma of childbirth itself, and the difference in roughness between the students and midwives would have been even smaller in comparison.³ Hence hypothesis (S3) seeks to explain a large difference between the two divisions, the fact that the mortality rate in Division I was three times that in Division II, by appeal to what is at most a tiny marginal difference.
Certainly, in some set-ups, incremental changes can have significant effects; but no doctor would suppose this to be such a case. In my view such a hypothesis is not merely implausible: Semmelweis could rule it out as inconsistent with what he knew about how trauma affects disease.⁴

With respect to hypotheses (S4) and (S5) Semmelweis pursued the policy of seeking to eliminate the differences between the wards referred to in a given hypothesis. Thus the priest agreed to take a different route, avoiding Division I; and women in that ward delivered on their sides; again in both cases without any diminution in deaths. Semmelweis was thus able to generate evidence inconsistent with (S4) and (S5) and thereby eliminate them from his enquiries.

At this point, discussions of the Semmelweis case mention the fact that while Semmelweis was on holiday in Venice in early 1847, his colleague Jakob Kolletschka died of a wound incurred during a post-mortem examination. In his illness Kolletschka showed the same symptoms and on autopsy the same lesions as found in women who suffered and died from puerperal fever. This led Semmelweis to his final hypothesis, that the parturient women were being infected with cadaveric matter transmitted by medical students from the autopsies that they had been carrying out beforehand. I divide Semmelweis’s hypothesis into two components:

(S6a) Women in Division I were infected during examination by medical students.
(S6b) The infectious agent was ‘cadaveric matter’ imported by the students after carrying out autopsies.

Kolletschka’s death is often presented as a key piece of evidence, one that (S6) can explain whereas the other hypotheses cannot. Consequently (S6) is, in this respect at least, a better explanation than the others. I believe, however, that the importance of Kolletschka’s death lies elsewhere.
At this time, the leading explanation offered of puerperal fever, along with many other diseases, was the miasma theory, according to which diseases are often caused by bad airs that are themselves effects of geography and climate, and can be caused by stagnating water, rotting organic material, overcrowding and the like. This is the theory covered by (S2). Note first that in terms of being able to explain other facts, supporters of the miasma theory of disease would argue that their theory explains a huge amount of data, such as the fact that some diseases, such as malaria, are common in low-lying marshy areas, why diseases such as cholera are more prevalent at sea-level than at altitude, why many diseases, such as typhoid and cholera, are more prevalent in crowded, unsanitary cities than elsewhere, why improvements in sanitation lead to diminution in disease.

³ Lest one should imagine that the midwives were particularly gentle, consider the comment of Semmelweis’s colleague Jakob Kolletschka, “It is here no uncommon thing for midwives, especially in the commencement of their practice, to pull off legs and arms of infants, and even to pull away the entire body and leave the head in the uterus. Such occurrences are not altogether uncommon; they often happen.” (Lancet 2 (1855): 503. Quoted in ?: 126, fn. 5.) In mitigation of the midwives, one should note, as ?) do, that many of their patients were women from impoverished backgrounds who had suffered from rickets as children. Rickets can often lead to a malformed pelvis, resulting in difficulties in childbirth when adult.

⁴ As it was, Semmelweis sought to minimize the difference by excluding foreign students from Division I, who, he thought, would be the least gentle in their examining. That, of course, had no effect on the rate of infection in that ward.

With respect to the evidence concerning puerperal fever in particular, at the Vienna General Hospital and elsewhere, the miasma theory would explain why puerperal fever comes in epidemic waves and varies seasonally, being worse in winter than in summer. Against this mass of evidence, the fact of Kolletschka’s death counts for very little. So if we are considering explanations of (A), Kolletschka’s death carries little weight. But if we are considering (B), Kolletschka’s death is evidentially otiose. For the rival (S2), as an explanation of the difference between the two wards, is already refuted by the evidence, as we saw above. And, more importantly, Semmelweis generated the crucial piece of evidence: when he insisted on the students washing their hands in chlorinated water before examining the women, the mortality rate in Division I fell to roughly that in Division II. The headline figures are these: the percentage mortality rates for the six years 1841–1846 were Division I 9.92, Division II 3.88; and for the twelve years 1847–1858 they were Division I 3.57, Division II 3.05 (figures from ?: 159–81). This crucial fact clinches the argument in favour of (S6a) independently of the evidence concerning Kolletschka.

The significance of Kolletschka’s death is that it drew Semmelweis’s attention to a difference between the midwives and the students that might otherwise have gone unnoticed: the fact that the students attended autopsies and carried out dissections before performing examinations in the maternity wards. While he could not eliminate this difference, since he didn’t control the students’ timetable, he could isolate it causally, which amounts to the same thing, by insisting on hand-washing.

The evidence concerning Kolletschka enabled Semmelweis to do something else, to formulate a specific hypothesis concerning the infection, that it was due to ‘cadaveric matter’ being transferred from a dissected body to the uterus of an unfortunate mother via the hands of the students.
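
The headline mortality figures quoted above can be compared directly. What follows is a minimal sketch in Python that uses only the four percentages reported in the text; the names `RATES`, `compare` and `SUMMARY` are illustrative assumptions, and no significance test is attempted because the text gives no case counts.

```python
# Percentage mortality rates quoted in the text: (Division I, Division II).
RATES = {
    "1841-1846": (9.92, 3.88),  # the six years before the hand-washing regime
    "1847-1858": (3.57, 3.05),  # the twelve years reported after it
}

def compare(div1, div2):
    """Return (ratio, excess in percentage points) of Division I over Division II."""
    return div1 / div2, div1 - div2

# Hypothetical summary table built from the quoted figures only.
SUMMARY = {period: compare(*pair) for period, pair in RATES.items()}

for period, (ratio, excess) in SUMMARY.items():
    print(f"{period}: ratio {ratio:.2f}, excess {excess:.2f} percentage points")
```

On these figures the excess mortality in Division I falls from about six percentage points to about half a percentage point, which is the contrast that explanandum (B) concerns.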
Above I divided (S6) into a less specific claim (S6a), that some kind of infection from the medical students is responsible; and a more specific claim (S6b), that cadaveric matter from autopsies is responsible. This distinction is important because only (S6a) is verified by the evidence. Although (S6b) is rendered plausible, it is far from verified (and indeed it is strictly false). To be precise, the evidence verifies the claim that the explanation of (B) is some property of the hands of the medical students that is removed when they are washed. It strongly supports the claim that this property is related to and a causal consequence of the presence of the students at the autopsies, but without verifying this claim, and it lends some support, but much less, to the claim that the property in question is the presence of cadaveric matter. I should make clear that the notion of ‘infection’ I am using here is a very weak one, and does not imply any commitment to a modern germ theory. Rather it is intended to capture an idea that would have been familiar to Semmelweis’s contemporaries, that of contagion, an idea that goes back to Fracastoro in the sixteenth century. The core of the idea is that diseases can be spread from individual to individual by the conveyance of some material medium between them. Fracastoro hypothesized the medium to be ‘seminaria’ (seeds), but tells us little about them, which is why I say that the core idea is that there is some material medium of transmission. Semmelweis’s insistence on ‘cadaveric matter’ is a specific version of this theory.

I make these distinctions, even though Semmelweis did not, in order to make two points. First, as I shall go on to explain, making these distinctions will allow me to demonstrate my principal thesis, that the evidence can lead us not simply to the best explanation of the evidence, but also, on occasion, to the only explanation of the evidence. (S6a) is a hypothesis of which this is true, but (S6b) is not.
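
The eliminative pattern claimed here for (S6a) can be sketched in a few lines of Python. The encoding below is an assumption of this illustration, not a formalism from the paper: each candidate explanation of (B) is paired with the evidence that the discussion above reports as inconsistent with it, and the names `REFUTED_BY` and `holmesian_inference` are hypothetical.

```python
# Illustrative encoding (an assumption of this sketch): each hypothesis maps to
# the evidence reported as inconsistent with it. An empty list means "unrefuted".
REFUTED_BY = {
    "S1 overcrowding": ["marks no difference between the wards"],
    "S2 epidemic influences and climate": ["marks no difference between the wards"],
    "S3 rough examinations": ["difference in roughness is negligible"],
    "S4 priest passing through": ["priest rerouted, mortality unchanged"],
    "S5 delivery on backs": ["women delivered on their sides, mortality unchanged"],
    "S6a infection via students' hands": [],
}

def holmesian_inference(refuted_by):
    """Return the sole unrefuted hypothesis, or None if zero or several survive."""
    survivors = [h for h, evidence in refuted_by.items() if not evidence]
    return survivors[0] if len(survivors) == 1 else None

print(holmesian_inference(REFUTED_BY))
```

The function returns nothing when more than one candidate survives; in that situation a ranking such as Lipton’s second stage would still be needed. The sketch also takes the list of candidates as given, so it is silent on Underconsideration.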
Secondly, I suggest that one reason why Semmelweis failed to get his views accepted is that he did not distinguish between (S6a) and (S6b), and argued strongly in favour of (S6b) which was only partly supported by the evidence. Furthermore, Semmelweis did not clearly distinguish between explananda (A) and (B). Although Semmelweis’s most effective evidence concerned the difference between the two wards, his ultimate aim was to explain the causes of puerperal fever tout court: all the cases in Division I, and in Division II, and elsewhere. This was because he insisted on a single cause for all cases. But his evidence did not support such a view. For example, it was unclear how cadaveric matter could explain the deaths in Division II and elsewhere. Semmelweis’s explanation was that in such cases the women were self-infecting, due to internally decaying matter. Such an explanation seemed ad hoc. And while ?: 81) noted that street births showed a lower rate of mortality than Division I, he could not explain why home births showed a significantly lower rate of mortality (circa 0.5%) than even Division II (over 3%). If the cadaveric hypothesis implied that the deaths in Division II were unavoidable self-infections, then one would expect a comparable rate of self-infection among mothers giving birth at home.

Furthermore, the cadaveric hypothesis was not even novel. A commonly held alternative to the miasma theory of puerperal fever was the view that it is caused by internal putrescence, the rotting of the patient’s own internal flesh and organs. For example, Dr John Clarke (cited in ?: 43) held that tight stays and petticoats and the weight of the baby in the uterus detained faeces in the intestine causing putrescence. Getting people to believe a new theory may be difficult enough, but it is often even more difficult to get them to believe an old theory they regard as having been refuted.
The principal piece of evidence against such a theory is the fact that puerperal fever was an epidemic disease which could afflict a population particularly severely for a number of years. Additionally it was seasonal, with winters being particularly bad. ?: 122) explained the latter by reference to the greater diligence of the student doctors in winter months, and while that may have been an exacerbating factor, this seasonal variation was not limited to teaching hospitals (which is another reason to focus on explanandum (B) rather than (A)).

To conclude: if we, unlike Semmelweis, restrict our hypothesis to (S6a) and our explanandum to (B), then we see that the evidence forces us to that conclusion by eliminating all potential alternatives. In this case, inference to the best explanation reveals an important kind of limiting case: inference to the only explanation.

In Lipton’s model of inference to the best explanation, the loveliness of the hypotheses is central to their epistemic status: the rank order of their epistemic credibility should follow the rank order of their explanatory loveliness. But the epistemology of the aetiology of puerperal fever is not like this. Semmelweis considered six hypotheses, but he did not rank these according to their loveliness. It wasn’t that infection via the doctors’ and students’ hands was a lovelier hypothetical cause of the difference in level of puerperal fever than the presence of a dolorous son of the church. The evidence didn’t simply show the priest hypothesis to be unlovely, it showed it to be outright false. Likewise for all the other hypotheses considered by Semmelweis, with the exception of the infection hypothesis (S6a). Thus Semmelweis had no need to consider the loveliness of these hypotheses, and so it is no surprise that Lipton does not discuss their loveliness either.

4 HIV and AIDS

I shall now turn to a more recent case in medical history, the story of the discovery of HIV and the cause of AIDS.
The initial phase involved the identification of a syndrome that needed explaining. In June 1981 a report was published concerning the appearance of a rare form of pneumonia, Pneumocystis carinii, in five homosexual Californian men. Pneumocystis carinii had otherwise only been observed in individuals who had undergone medical therapies involving immunosuppression. The following month a second report appeared, discussing the cases of twenty-six young homosexual men with Kaposi’s sarcoma, an unusual form of skin cancer, normally found only in men in their 70s and then usually only those of Mediterranean origin. Moreover, four of these had Pneumocystis also. Shortly thereafter a further ten cases of Pneumocystis were revealed in California. As the Centers for Disease Control (CDC) commented, “The apparent clustering of both Pneumocystis carinii and Kaposi’s sarcoma among homosexual men suggests a common underlying factor” (?: 14). The clustering of symptoms in a manner indicative of a common cause is a syndrome, in this case initially called GRIDS, Gay-Related Immune Deficiency Syndrome, and then AIDS, Acquired Immune Deficiency Syndrome.

What explains the existence of this syndrome? What causes AIDS? Researchers considered four hypotheses as follows:

(A1) Recreational drugs. Initially a contaminated batch of ‘poppers’ (amyl nitrite) was suspected. And then it was considered that excessive use of certain recreational drugs, even if not contaminated, might depress the immune system.
(A2) Some researchers hypothesized that the very high incidence of familiar sexually transmitted diseases among certain sexually very active men might overload the immune system and cause it to fail. This might also explain the appearance of AIDS among intravenous drug users who shared dirty needles: the repeated taxing of the immune system by foreign matter and infections overloads it and makes it unable to fight off opportunistic infection.
(A3) Bacterial infection: infection by a bacterium, probably hitherto unknown.
(A4) Viral infection: infection by a virus, probably hitherto unknown.

To these we may add:

(A0) There is no common cause: the clustering is entirely accidental.

(A0) is the null hypothesis. Abductive inference assumes that there is something in need of explanation. If there is nothing to explain, nothing counts as the best explanation of it. Individual events or facts typically do need explanation. If someone falls ill with red pustules over arms, chest, and legs, that needs explanation. As I shall discuss later, that may not be true for all individual events, and certainly not for population level events. For what might appear to be a population level phenomenon of interest may after all be nothing of the sort, just a chance coincidence. Why did I get six sixes in a row? I might have been using a loaded die. But perhaps I was just lucky, which is to say, there is no explanation. Likewise the co-occurrence of certain symptoms in a small number of people might be a coincidence. The fact that the CDC said that the clustering suggested a common underlying factor indicates that for them the null hypothesis had not been ruled out. But as numbers rise, the chances of a coincidence recede rapidly. In Semmelweis’s case, the null hypothesis is that there was no medical difference between the wards. By chance the women assigned to Division I were individually more susceptible to puerperal fever than those assigned to Division II. However, Semmelweis’s statistics covered so many women and such a continued and dramatic difference between the two wards that the chance of that difference being mere chance was absolutely tiny. Semmelweis’s intuition is confirmed not only by common sense, admittedly unreliable as concerns matters of statistics and probability, but also by modern statisticians (cf. ?).
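
The dice example above can be made concrete. The following is a minimal sketch, assuming a fair six-sided die and independent throws (both are assumptions of this illustration, which the text does not state); the names `prob_all_sixes` and `P_SIX_SIXES` are hypothetical.

```python
from fractions import Fraction

def prob_all_sixes(n_throws):
    """Probability that n independent throws of a fair die all show a six."""
    return Fraction(1, 6) ** n_throws

# Chance of the run of six sixes under the "just lucky" (null) hypothesis.
P_SIX_SIXES = prob_all_sixes(6)
print(P_SIX_SIXES, float(P_SIX_SIXES))
```

The probability is 1/46656. Each further six divides it by six, which mirrors the point that as numbers rise the chance of a coincidence recedes rapidly.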
Likewise the number of cases of rare symptoms, often overlapping, all related to an impoverished immune system, and in many cases found amongst homosexual men, means that one can conclude in the AIDS case that the null hypothesis is false. There is indeed a genuine syndrome needing explanation.

The key piece of evidence which refuted the lifestyle-related hypotheses (A1) and (A2) was the discovery of AIDS among haemophiliacs. In 1982 several haemophiliacs were found to be suffering from the syndrome, as were a number of people, both men and women, who had received blood transfusions, including a twenty-month-old baby. Among the donors of the blood received by the baby was one man who developed AIDS less than a year after donating. While such evidence points to a blood-borne infection, it also serves to exclude the hypotheses (A1) and (A2), since now numerous individuals were beginning to be diagnosed with AIDS who simply did not participate in drug-taking or very active sex. Indeed, this evidence serves to refute pretty well any lifestyle-related hypothesis, since there are no habits shared by the haemophiliacs, the gay men, and the transfusion recipients that are not shared also by pretty well everyone else.

To my mind, it is difficult to think of any hypothesis compatible with the evidence of the haemophiliacs and transfusion recipients that does not take the cause of AIDS to be an infectious agent. If instead of distinct hypotheses (A3) and (A4) we had a more general hypothesis, that AIDS is caused by an infectious agent, then that hypothesis is confirmed, by refuting the null hypothesis and all other hypotheses inconsistent with this one. Having established that AIDS is an infectious disease, the next task is to identify the kind of infection. The two obvious candidates are bacterial and viral. The evidence already obtained rules out bacterial infection.
This is because the blood product used by haemophiliacs, the clotting agent factor VIII, is obtained from donated blood by a process that involves, among other things, filtration. Filtration removes bacteria, and so the bacterial hypothesis can be excluded.

With the bacterial hypothesis refuted, it is natural to turn to the viral hypothesis. However, one might wonder whether some other infectious agent could be responsible: not every infection is bacterial or viral; the other possibilities include fungi, protozoa, and multicellular parasites. In fact filtration removes all of these agents also. Arguably it is conceivable that some hitherto undiscovered kind of filterable agent could be responsible. We now know that there are such agents, although most are virus-like, such as satellite viruses and viroids, and typically these require the presence of a true virus, a helper virus, to replicate. However, the first research into prions was being carried out at about the same time as the cause of AIDS was being investigated, and so such a possible cause would not have been considered. Like a virus, a prion, being simply a protein, is filterable. It remains contentious, however, whether prion-related disease is caused by a protein-only agent rather than by protein-plus-virus or some other mechanism. Indeed, one of the controversial features of the prion hypothesis is that it appears to be inconsistent with the central dogma of molecular biology. The latter says that information can be passed only from nucleic acid (DNA, RNA) to nucleic acid or to protein, but never from protein to protein or from protein to nucleic acid. But prions are proteins, and so if prions are both the causes and effects of prion-related disease (such as CJD), then there is information transfer from protein to protein.
If this objection from the central dogma holds good, then it looks as if only a virus, or virus-like organism (satellite virus or viroid), could be the AIDS vector, since such a vector must contain DNA or RNA, but to be filterable it must be smaller than a cell. Being non-cellular, the agent cannot carry the means of its own replication, but must depend upon some external mechanism. That is tantamount to a definition of a virus. Nonetheless, I do not think that such an argument suffices to give us knowledge that AIDS is caused by a virus, since the epistemic status of the central dogma is not sufficiently well established that it amounts to knowledge. Crick’s point in calling the claim a dogma was that he felt that despite its importance it was not well supported by the evidence.5 Correspondingly it would not be safe to conclude that only a virus is consistent with the evidence mentioned so far regarding the cause of AIDS. What did establish that a virus causes AIDS was the isolation of a particular virus, LAV (lymphadenopathy-associated virus), by Luc Montagnier in 1983, renamed HIV three years later. In due course HIV was shown to satisfy Koch’s postulates with respect to AIDS (Koch’s postulates being principles used to establish that a certain infectious agent is the cause of a given disease).6

What the AIDS case shows is, again, that the identification of the correct explanatory hypothesis proceeds by the refutation of principal rivals. While the methodology may have a Popperian flavour, the epistemology does not. For the process of elimination raises the likeliness of the remaining hypotheses. The epidemiological evidence ruled out hypotheses such as the overloading of the immune system by drugs or commonplace STDs; indeed it refuted any hypothesis other than those permitting blood-borne infection. That raised the probability that the cause of AIDS is a bacterium or virus.
The fact that the infectious agent is filterable rules out bacterial infection, raising the probability further that the cause is a virus. That does not establish the viral hypothesis with certainty, since that evidence is consistent with subviral agents as a cause. Nonetheless, the fact that such agents are rare means that the virus hypothesis had a high probability, which encouraged Montagnier to search for a virus directly.

5 Note, however, that this, Crick’s informational version of the central dogma from 1958, is not refuted by the many criticisms directed at Watson’s pathway account of 1965, popular in textbooks, according to which DNA generates RNA (or more DNA) and RNA generates proteins.
6 That HIV does satisfy Koch’s postulates with respect to AIDS was disputed in some quarters, most notably by Peter ?). But there is now little mainstream doubt on this point.

5 Eliminative abduction and Holmesian inference

Above I have argued that a key part of understanding Semmelweis’s reasoning must be the fact that he refuted the competing hypotheses. No doubt one large part of the popularity of Popper’s philosophy among scientists is the fact that they recognize the role that refutation does play in science. But the Semmelweis case also shows that the sceptical side of Popper’s philosophy, usually ignored by scientists, is unwarranted, at least as a description of scientific practice. For the verdict of subsequent scientists is that Semmelweis did have very good reason for believing his conclusions, at least when framed in a suitably circumspect manner; indeed his evidence allowed him to know that the cause of the differential mortality rate was as stated in hypothesis (S6a).

If that is right, then Semmelweis’s reasoning bears a close relation to what I have called ‘Holmesian inference’ (?), in recognition of the famous dictum, “Eliminate the impossible, and whatever remains, however improbable, must be the truth.” (?: 94, 118).
(For eliminative induction, and Sherlock Holmes, see also ????.) Spelt out more systematically, Holmesian inference has the following structure:

(i) the fact $e_s$ has an explanation (Determinism);
(ii) $h_1, \ldots, h_n$ are the only hypotheses that could explain $e_s$ (Selection);
(iii) $h_1, \ldots, h_{n-1}$ have been falsified by the evidence (Falsification);
therefore
(iv) $h_n$ explains $e_s$.7

Holmesian inference is clearly deductive, which is why Conan Doyle correctly describes his hero as a master of deduction. Therefore in considering whether Holmesian inference can ever lead to knowledge, we need to know whether we can ever be in a position to know that the premises of a Holmesian argument are true.

Falsification ought not present a problem. Aficionados of the Duhem-Quine thesis might have their doubts, but I regard these as exaggerated, especially in the hands of Quine. I shall not pursue that more general issue here. To give just one example, it seems clear that the hypothesis that the priest’s presence is a cause of the higher mortality rate is simply refuted by the fact that his absence is not marked by any reduction in mortality.8

Determinism is the denial of the null hypothesis, which states that there is no explanation of the phenomenon in question. For individual macroscopic events, the null hypothesis will usually be false and so Determinism will be true, and knowable. However there are cases where Determinism might fail. It may fail for individual atomic or subatomic events; not all such events have an explanation. For example the decaying of a fissile nucleus does not itself bear an explanation, since that is an intrinsically indeterministic occurrence; it just happens. (Nevertheless, we can explain related facts, e.g. that it was possible for it to decay, or that its chance of decay was p, etc.)

When we move to statistical phenomena, we may find that there are borderline cases. The proportion of the population at large which is left-handed is about 12%.
Let us imagine that a survey of a lecture audience showed that its proportion of left-handers is 16%. It is not immediately obvious whether this fact has an explanation. This difference from the national average could be just a statistical fluctuation. Classical significance testing aims to quantify this, by telling us what the chance would be of a group of just this size having a proportion at least as high as 16% when it is chosen from the population at large in a manner independent of any possible causal factor (i.e. ‘randomly’). The null hypothesis is the hypothesis that the group can be regarded as chosen independently of left-handedness.

7 For discussions of Determinism and Selection see ?: 131).
8 I do note, however, Alex Broadbent’s (?) response on behalf of the Liptonian model, that the removal of the priest without change in mortality does not refute the priest hypothesis but just makes it incredibly unlovely: “. . . it seems to me that Semmelweis did not refute the priest-hypothesis when the priest was re-routed. Maybe the effects of religion are delayed, and Semmelweis did not wait long enough for the level in the two wards to equalize. Maybe there was some confounding factor: maybe the sudden absence of the priest caused concern among the inmates, as alarming as they had found his presence. Hypotheses of this sort can be devised that are consistent with the claim that the priest’s route caused the difference between the two wards. But they are incredibly unlovely.”

Selection is more controversial. It is often contended that for any set of evidence there is an unlimited number of hypotheses consistent with that evidence. Note that Selection requires the hypotheses not merely to be consistent with the evidence but also to explain the crucial phenomenon in question. And one ought to construe ‘explanation’ in a reasonably robust way (as does Lipton).
Merely deducing e from some proposition h plus certain conditions c does not amount to an explanation of e, for example when h = ¬c ∨ e. Even so, a sceptic may insist that any phenomenon may have an unending range of mutually inconsistent potential explanations. But for Falsification to retain its plausibility, the range of explanations needing to be considered must be finite, indeed finite and reasonably small.

In some circumstances one can engineer an experimental setup so that all potential explanations are excluded bar one: we get to know Falsification and Selection simultaneously. Much laboratory science is like this: one varies one factor at a time and so one is able to infer from an observed difference that it is caused by that single factor. This is Mill’s method of difference. Much medical science is based on the method of difference. The Randomized Controlled Trial is intended to be the method of difference writ large. If there is a sufficiently large difference in outcome in a sufficiently large trial so that the null hypothesis can be eliminated (i.e. Determinism holds in this case), then one can infer that the treatment is the only possible explanation of that difference and hence is the explanation of that difference. Sometimes at least, we can be in a position to know that there is only one explanation of the evidence. Arguably Semmelweis ended up in something like this position.

In many cases we will not be able to engineer a position in which Mill’s method applies. The problem is that there may be potential explanations that are sufficiently out of the ordinary that they never get considered, but which are not refuted by our evidence. This is just the problem of Underconsideration mentioned already. And insofar as it is a problem, it is one that afflicts not only Holmesian inference but also Lipton’s model of IBE.
Whether one chooses a favoured explanation at stage 2 by ranking or by refutation of competitors, the choice will fail to be known to be the explanation if one has failed to consider the actual explanation or indeed false potential explanations that ought to have been considered.

The answer to the problem of Underconsideration is to appeal to general externalist epistemology.9 In order to know that an instance of Selection is true it is not required that one have considered every possible hypothesis consistent with one’s initial evidence. It will normally be sufficient that one has considered all the potential explanations that are true in nearby possible worlds. The sense of ‘could’ in Selection is not the philosopher’s liberal one meaning ‘in some possible world’ but a more restrictive one, exemplified by the true statement, as uttered just before the 2008 US election, ‘either McCain or Obama could win the election on Tuesday, but Ralph Nader could not’. For example, one may take a central feature of knowledge to be the fact that it is closely related to safe believing. One believes p safely if p is true in all conditions that are in fact similar to the actual condition; in terms of possible worlds, S believes p safely if p is true in all nearby possible worlds. While it is too simplistic to say that safe believing suffices for knowing, we may nonetheless employ the general idea that to know that p does not require that one’s evidence rule out all possibilities, however remote. The issue of underconsideration is not problematic so long as all the hypotheses that are not considered, even if they are consistent with our evidence, are true only in remote possible worlds. If that condition is met Selection comes out as true and knowable.

9 David ?) makes a related appeal to externalist/naturalized epistemology, but instead of requiring Selection to be known, he regards it as sufficient that one has a reliable disposition to infer (iv) from (iii) alone. For my response see ?: 15.

Having articulated Holmesian inference, I now turn to the relationship between that inference pattern and the two cases we have examined above. Determinism holds in Semmelweis’s case because the statistical difference between the wards was large enough to be clearly no accident. Likewise, when the number of cases of GRIDS/AIDS, a distinctive and hitherto unusual combination of symptoms, was sufficiently high, researchers could know that they had a new disease on their hands with a specific cause; the null hypothesis, (A0), can be dismissed. Falsification holds with respect to the set of hypotheses (S1)–(S5) discussed: Semmelweis refuted hypotheses by gathering evidence inconsistent with them. Likewise, the evidence that AIDS researchers possessed allowed them to refute hypotheses (A1)–(A3).

Does Selection hold for our two cases? It is not obvious that the six hypotheses considered by Semmelweis are all the potential explanations there could be, or even that they include all the hypotheses that could be true in nearby possible worlds. Above I listed a range of hypothetical causes of puerperal fever that Semmelweis’s contemporaries proposed and which might have marked a difference between the two wards. Consequently, the elimination of the other hypotheses did not itself establish the truth of the infection hypothesis (S6a). Nonetheless, the method employed by Semmelweis to eliminate specific hypotheses could be used to eliminate all hypotheses except his favoured one. If the two wards are made to be identical in every causally relevant respect except one, and the difference in mortality remains as before, then the remaining respect must be the cause. As it happens, the elimination of (S1)–(S5) made (S6a) highly likely and allowed Semmelweis to test it by direct intervention.
Likewise, as explained, the elimination of various hypothetical causes of AIDS made the viral hypothesis very likely, but did not directly establish it. But by making the viral hypothesis likely it made the difficult investigation of a viral hypothesis epistemically more profitable.

Does this suggest that the Holmesian model in fact does not, strictly speaking, ever apply, because we may always reasonably doubt Selection to be true? I don’t believe so. Some scientific investigations are too complex for them to be summarized as exemplifying any single argument form. Ernst Mayr characterized Darwin’s Origin of Species as ‘one long argument’, where this one long argument is made up of many arguments some of which may individually lend only partial support to their conclusions. Semmelweis could not use Selection to come to know the truth of (S6a); nonetheless the elimination of rival hypotheses made (S6a) likely. This led to Semmelweis being able to use Mill’s method of difference, which is itself a special case of Holmesian inference, to come to know that (S6a) is true, not merely that it is likely. Something similar may be said in the identification of HIV as the cause of AIDS, since there again, I conjecture, the use of Koch’s postulates can be considered a specific case of Holmesian inference.

6 Eliminative abduction and inference to the best explanation

If Holmesian inference depends on the elimination of hypotheses by refutation, does explanatory power play no epistemic role? My case for Holmesian inference notwithstanding, I agree with most of what Lipton says about the confirmational power of inference to the best explanation, even when the best explanation is not the only one consistent with the total evidence. In discussing the epistemology of some reasoning process one must distinguish what is sufficient to produce knowledge from what is sufficient to produce good reasons to believe.
Philosophy of science has tended to ignore knowledge as a category and focus solely on degrees of confirmation. One reason for this is a residual scepticism that affects even many scientific realists, that we never get to know any theoretical truths since none of them are strictly true, a view I think is badly mistaken. Epistemologists outside the philosophy of science do make the kind of distinction I have made. Thus it is commonly thought that one cannot know that one’s lottery ticket will lose, but the evidence that it is only one ticket in a thousand gives one a good reason to believe that it will lose. Since I don’t think that even statistical cases can be treated like a lottery, I shall not discuss this analogy. Its purpose is simply to point out that a discussion that tells us a great deal about when our beliefs are well-supported by the evidence may well leave unanswered important questions about when our beliefs amount to knowledge. In my view Lipton’s IBE is our best account of confirmation in theory choice, but needs to be supplemented by Holmesian inference if we are to understand when we can gain knowledge of the truth of an explanatory hypothesis.

Clearly there must be a connection between the correct account of confirmation and the correct account of conditions for knowledge. A key condition is that the conditions for knowledge, in the case of knowledge gained by inference from evidence (as opposed to non-inferential knowledge, such as perception), are such that knowing entails a high degree of confirmation. How do IBE and Holmesian inference relate? The answer is clear: while IBE sorts hypotheses according to their degree of ability to explain the evidence, Holmesian inference corresponds to the case where the sorting is extreme: the best explanation comes to the top because all its competitors are rejected as explanations of the evidence since they are inconsistent with it. The only explanation is the best by default.
Lipton proposes that IBE should be regarded as a surrogate or heuristic stand-in for Bayesian conditionalization. So, for example, our estimate of the likelihood, $P(e \mid h)$, is often given by considering the potential explanatory relationship between h and e. Now consider a Holmesian case where all explanatory hypotheses have been falsified bar one. In such circumstances the falsified hypotheses have zero loveliness and zero posterior probability.10 And the remaining hypothesis is now the only hypothesis that has any loveliness. Its posterior probability must be one, since the posterior probabilities of all its competitors sum to zero. In such a case our original qualifications, (Q1) and (Q2), no longer apply. To be worth inferring a hypothesis need not meet a minimum threshold of loveliness (other than zero loveliness) in every case. In the Holmesian case it can have quite a low loveliness and still be inferable, since its posterior probability is one. The point of Holmes’s dictum (‘Eliminate the impossible, and whatever remains, however improbable, must be the truth.’) is that a low prior and low loveliness may be translated into a posterior of 1, without increasing loveliness at all, so long as all competitors are refuted.

10 Cannot a hypothesis be lovely, even if known to be false, such as Newtonian mechanics? Some aspects of loveliness are independent of the evidence, such as simplicity. But other aspects are evidence-relative, such as whether they provide a unified explanation of the evidence. And it is difficult to see how the latter can ignore the incompatibility of hypothesis and evidence. If, however, one does have a conception of loveliness that is compatible with known falsity, then Lipton’s model of IBE will need a further intermediate stage where plausible and lovely but falsified hypotheses are filtered out.
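
The Bayesian reading of the Holmesian case can be checked with a small numerical sketch. The code below is illustrative only: the hypothesis labels, priors, and likelihoods are invented for the example and are not drawn from the Semmelweis or AIDS cases, and the function name `posterior` is hypothetical. It assumes a finite set of mutually exclusive hypotheses that jointly exhaust the possibilities (that is, Determinism and Selection are taken to hold), and it models a falsified hypothesis as one whose likelihood on the evidence is zero.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Conditionalize on the evidence over a finite, exhaustive set of rivals.

    priors[h] is P(h); likelihoods[h] is P(e | h). Returns P(h | e) for each h.
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(e), by the law of total probability
    if total == 0:
        raise ValueError("the evidence has probability zero on every hypothesis")
    return {h: joint[h] / total for h in joint}


# Invented priors: the eventual survivor h3 starts out improbable.
priors = {"h1": Fraction(3, 5), "h2": Fraction(7, 20), "h3": Fraction(1, 20)}

# Holmesian case: h1 and h2 are falsified, so P(e | h) = 0 for both.
# The survivor explains e only weakly (a stand-in for low loveliness).
falsified = {"h1": Fraction(0), "h2": Fraction(0), "h3": Fraction(1, 10)}
holmesian = posterior(priors, falsified)
print(holmesian["h3"])  # 1

# Ranking case: h2 is unlovely but not refuted, so it keeps some probability.
ranked = {"h1": Fraction(0), "h2": Fraction(1, 100), "h3": Fraction(1, 10)}
ranking = posterior(priors, ranked)
print(ranking["h3"])  # 10/17
```

In the first case the surviving hypothesis has posterior 1 despite a prior of 1/20 and a low likelihood; in the second, a single unrefuted rival holds it at 10/17. Exact fractions are used so that the contrast does not depend on floating-point rounding.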
7 Conclusion

Lipton’s account of IBE is still our best theory of how scientists distribute their credences among theories on the basis of their explanatory power. Taking our cue from his discussion of the Semmelweis case, where loveliness does not play a role in theory choice, we can identify a special case of IBE where the evidence leaves us with only one hypothesis from among those generated at Stage 1 of IBE. Inference to the only explanation is a limiting case of inference to the best explanation.

If my discussion of Holmesian inference is correct, we can see how in this limiting case at least IBE can lead to knowledge: we have a simple deductive inference from known premises. This raises the question, whether IBE can lead to knowledge in cases that fall short of Holmesian inference. That is a question for another time. But it seems to me that in a case where we can only rank because not all considered hypotheses have been refuted, then we cannot know that the loveliest hypothesis is true, since we do not yet know that an unlovely but unrefuted competitor is false. If such considerations are correct, then Holmesian cases are not simply special cases of IBE, they are very special cases, since they are precisely the cases where IBE leads to knowledge.

8 Acknowledgments

As is exemplified by this paper, my own thinking about problems in the epistemology of science is not simply deeply indebted to Peter Lipton, but has developed within a framework that Peter constructed. Peter’s work and the way in which he wrote philosophy and indeed the way in which he approached the subject, will remain exemplary for me and for many others for a long time to come. Less apparent to others is the way in which Peter promoted the ideas and careers of younger philosophers. I, with many others, benefitted hugely from Peter’s kindness, his advice, and his unending willingness to engage in philosophical discussion. We have so much to be grateful to him for.
As regards this paper, I am grateful for very helpful comments to audiences at the Universities of Bristol, Edinburgh, Exeter, Hertfordshire, and Stirling, and at the Peter Lipton memorial conference held in Cambridge on 1 November 2008. I am grateful also to Alex Broadbent, Mark Sprevak, and Anjan Chakravartty for stimulating conversations on the topics raised in this paper, and again to Anjan Chakravartty for comments on a draft.

References

Bird, A. 2005. Abductive knowledge and Holmesian inference. In T. S. Gendler and J. Hawthorne (Eds.), Oxford Studies in Epistemology, pp. 1–31. Oxford: Oxford University Press.
Bird, A. 2007. Inference to the only explanation. Philosophy and Phenomenological Research 74: 424–32.
Broadbent, A. 2008. Holmesian elimination and Liptonian loveliness. Peter Lipton Memorial Conference, 1 November 2008.
Buyse, M. 1997. Opening address: A statistical tribute to Ignaz Philip Semmelweis. Statistics in Medicine 16: 2767–72.
Carter, K. C. and B. R. Carter 2005. Childbed Fever: A Scientific Biography of Ignaz Semmelweis. Edison, NJ: Aldine Transaction.
Conan Doyle, A. 1953. The sign of four. In The Complete Sherlock Holmes, Volume I. New York: Doubleday.
Connor, S. and S. Kingman 1989. The Search for the Virus. Harmondsworth: Penguin Books.
Duesberg, P. 1996. Inventing the AIDS Virus. Washington D.C.: Regnery.
Earman, J. 1992. Bayes or Bust? Cambridge MA: Bradford.
Gillies, D. 2005. Hempelian and Kuhnian approaches in the philosophy of medicine: the Semmelweis case. Studies in History and Philosophy of Biological and Biomedical Sciences 36: 159–181.
Kitcher, P. 1995. The Advancement of Science. New York: Oxford University Press.
Lipton, P. 2004. Inference to the Best Explanation (2nd ed.). London: Routledge.
Nuland, S. B. 2003. The Doctors’ Plague. New York: Norton.
Papineau, D. 1993. Introduction to Philosophical Naturalism. Oxford: Blackwell.
Semmelweis, I. 1983. The Etiology, Concept, and Prophylaxis of Childbed Fever (K. Codell Carter, trans. and ed.). Madison, WI: University of Wisconsin Press.
von Wright, G. 1951. A Treatise on Induction and Probability. London: Routledge and Kegan Paul.
Walker, D. 2009. A Kuhnian Defence of Inference to the Best Explanation. Ph.D. thesis, University of Bristol.