The Problem of Intransigently Biased Agents

Bennett Holman (University of California, Irvine) & Justin Bruner (The Australian National University)

Philosophy of Science (Forthcoming)

Note: This draft has been accepted with minor revisions, please do not quote.

Abstract: In recent years the social nature of scientific inquiry has generated considerable interest. We examine the effect of an epistemically impure agent on a community of honest truth-seekers. Extending a formal model of network epistemology pioneered by Zollman, we conclude that an intransigently biased agent prevents the community from ever converging to the truth. We explore two solutions to this problem, including a novel procedure for endogenous network formation in which agents choose who to trust. We contend our model nicely captures aspects of current problems in medical research, and gesture at some morals for medical epistemology more generally.

1. Introduction

The emergence of social epistemology has provided both a new range of philosophic problems and new formal tools to address these questions. A salient aspect of this new approach has been the examination of how information is shared between agents within a group. Surprisingly, features that intuitively would seem to be epistemic virtues, such as free exchange of information, can turn out to inhibit the group from acquiring true beliefs (Zollman, 2007). More generally, instead of one optimal communication structure, it turns out that epistemic virtue depends crucially on the particular problem confronting the group (Zollman, 2013).

This paper will consider a formal model of a problem that increasingly confronts diverse areas of scientific inquiry: the problem of intransigently biased agents. Previous studies have assumed that research is conducted by agents who, broadly speaking, are interested in discovering the truth (e.g. Alexander, 2013).
But there are broad swaths of science where those who are financially backing research do so with the express aim of promoting a claim regardless of underlying facts. Tobacco companies funded work that delayed the establishment of a causal link between second-hand smoke and lung cancer. The potential consequences of regulation and taxes for the energy sector led the fossil fuel industry to fund studies that controverted the reality of anthropogenic climate change (Oreskes & Conway, 2010). Chemical companies fund research that minimizes the effects of exposure to toxic substances in order to reduce their legal liability (Elliott, 2011). And so on. The presence of financial interests in these domains fundamentally alters the incentives that drive scientific inquiry. Specifically, epistemically motivated inquirers in these areas must contend with intransigently biased agents.

In this paper we will first examine the use of Diethylstilbestrol (DES) as an illustration of such a problem. Next we will review the bandit problem as a model of social learning. The fixed nature of communication in these models exacerbates the problem of intransigently biased agents, suggesting that if agents were allowed to choose who to trust they might be able to avoid manipulation. We show that such freedom can render biased agents ineffective.

2. DES: Four Decades of Intransigence

In the decades before antibiotics revolutionized medical care, endocrinology was in ascendance. But like many advances, endocrinology brought with it excess enthusiasm for a host of legitimate products, along with similar-sounding quack remedies that exploited the irrational exuberance. It is somewhere in the penumbra of legitimacy that we find DES.

The synthetic estrogen began with an excellent pedigree. In 1939, the British Medical Research Council reported favorably for its use in several conditions related to menstruation and menopause.
Because no patent was sought, any drug company that wished could manufacture and market DES.¹ In 1941, twelve companies gained FDA approval to use DES to ameliorate the symptoms of menopause. Yet not all of the research on DES was favorable. Pervasive side effects and animal studies demonstrating that DES was carcinogenic led the American Medical Association to recommend that it not be recognized for general use, characterizing actual use as “overzealous… indiscriminate and excessive” (Stoddard, 1945, quoted in Dutton, 1988, p. 47).

¹ This account is heavily indebted to the excellent scholarship of Dutton (1988).

Amongst the early proponents were Harvard professors Olive and George Smith, whose research formed the substantive basis for FDA approval in 1948. Though well respected, the work was not without its critics, and within five years four separate (methodologically superior) studies had shown that DES was ineffective. Unfortunately, the FDA had come to the conclusion that it lacked the legal authority to remove ineffective products from the market, and although the FDA was explicitly awarded such legal authority in 1962, it would take until 1971 before officials concluded that DES was contraindicated for pregnant mothers. Meanwhile, roughly 100,000 prescriptions/year were written throughout the 1960s. By the end of its use at least 3% of the nation’s children had been exposed to DES in utero, in addition to the millions of mothers that had ingested it (Meyers, 1983). Ultimately, it fell from favor in large part due to the actions of patients who brought public attention to the increase in cancer, deformed genitalia, and fertility problems.

But given that many of these problems were known or suspected from the start, a perennial question has been what explains the continued use of an ineffective and dangerous drug.
Amongst the possible candidates are: a doctor’s own experience and the experience of their colleagues, expert opinion, studies published in medical journals, and information provided by pharmaceutical companies. We have already seen that DES was not supported by the medical literature. By 1954, over 2,000 women had already participated in four randomized clinical trials, all of which failed to support efficacy; the largest showed that DES increased miscarriages.² As for experts, while the Smiths never recanted, their position was increasingly isolated. Internal memos document that the companies themselves were aware that use of DES was rejected by the medical elite (Dutton, 1988).

² See Bamigboye & Morris (2003) for a retrospective analysis of the available literature.

Finally, there is the experience of the doctors themselves. Given that DES exacerbated the problems it was prescribed to ameliorate, a doctor’s experience should lead her to the conclusion that DES was ineffective at best. This is too fast for two reasons. First, a doctor might simply encounter a random string of live births and mistake it for drug efficacy. Secondly, the doctor might be so sold on an intervention that failures are perceived as successes. Yet the former would not explain such widespread use, and the latter possibility only pushes the question back to where such enthusiasm came from. While some fervor might be attributed to the progress of medicine in general, the lion’s share can be found in the information provided by pharmaceutical companies.

One common and influential source of information for doctors was the Physicians’ Desk Reference (PDR). The information contained in the PDR was submitted by the manufacturer and then sent out to doctors free of charge. In 1960, over half of busy doctors consulted it daily (Dutton, 1988).
Beginning in 1947, DES was listed as indicated for “habitual or threatened abortions” with no mention of any disconfirming evidence until 1969, when the indication was dropped and a strong warning against use in pregnancy was added.

Pharmaceutical marketing was both passive and active, and all of it sang the praises of DES. Magazine ads ranged from relatively subdued pieces listing only the positive claims approved by the FDA to garish ads recommending DES for all pregnancies (see Langston, 2010). More active marketing involved company spokesmen (detailers) whose job was to visit doctors and keep them up-to-date on the companies’ products. Corporate memos clarified the approach detailers were to take to doctors: “Tell ’Em Again and Again and Again – Tell ’Em Till They’re Sold and Stay Sold” (quoted in Dutton, 1988, p. 58).

Many doctors would deny that such sources affect them, claiming they are men and women of science, moved by reason, not the same tricks used to sell soaps. Yet in the case of DES, no other source besides advertising appears to be a viable candidate for explaining such widespread and enduring use. In the face of detailers telling doctors and telling them again, many were sold. Moreover, it seems that nothing that the doctors told the detailer could change their mind. Even a biased agent would have been moved to reconsider their position if they were in search of the truth, but detailers and other sources of information like the PDR were not just biased, they were intransigently biased.

3. Network Structure and the Bandit Problem

We now move to a precise formal framework that can help us better understand the influence that biased agents have on a group of epistemically pure agents. In particular, we examine a network of individuals all confronted by the so-called bandit problem, a situation in which one is presented with two slot machines and must determine which to play.
Zollman suggests this is analogous to a doctor determining which of two medications to administer to a patient. Doctors are modeled as Bayesian learners, who update their belief when presented with new evidence, and are myopic in the sense that they simply administer the drug they believe is more efficacious. Moreover, there is no guarantee an individual doctor will correctly identify the more efficacious drug. Consider the following scenario: a doctor has observed 5,566 successes upon administering drug B 10,000 times, and only 10 successes upon administering drug A 20 times. In this case our agent will believe drug B is superior, but clearly since comparatively little is known about drug A, the optimal long-run strategy may include prescribing it to gain more information. The myopic doctors considered in the course of this paper, however, will only begin to prescribe A if the success rate of B falls under 50%.³

³ Myopia is plausible in several cases including where a doctor feels ethically prohibited from giving a patient a drug perceived to be worse just to increase the doctor’s confidence the drug is inferior.

Figure 1: Each node is an agent and each line represents a two-way communication channel between the agents. We refer to these three canonical structures as “the cycle” (left), “the wheel” (middle), and “the complete graph” (right).

In our model, the doctors do not know the true success rates of drugs A and B. In each interval doctors administer the drug they believe to be superior to their N patients (each of whom recovers with probability pA or pB, depending on the drug) and record what percent recover. Doctors are embedded in a social network and treat results obtained by their neighbors on par with their own experience.
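The myopic choice in the scenario above can be sketched in a few lines. This is only an illustration under an assumption not stated in the text: each belief is taken to be a Beta distribution whose parameters are the observed success and failure counts, so the expected success rate is simply successes divided by trials. The function names are hypothetical.

```python
def expected_success(successes, trials):
    """Mean of a Beta belief whose parameters are the observed counts
    (assumption: alpha = successes, beta = failures)."""
    return successes / trials


def myopic_choice(record):
    """Pick the drug with the highest expected success rate.

    record maps a drug name to a (successes, trials) pair.
    """
    return max(record, key=lambda drug: expected_success(*record[drug]))


# The doctor from the text: little data on A, a great deal on B.
record = {"A": (10, 20), "B": (5566, 10000)}
print(myopic_choice(record))
```

A myopic doctor never prescribes A merely to learn more about it, which is why the thin evidence on A (20 trials) is never supplemented unless B's observed rate drops below A's.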
As Figure 1 indicates, epistemic agents are represented as nodes in a graph and those nodes connected by an edge are said to be “neighbors.”

With the society of knowers in view, we can now ask some interesting questions; chief amongst them, how should the group communicate in order to maximize the likelihood that every member will learn which drug is superior? While agents in the maximally connected graph reach consensus more quickly, the agents in the cycle are more likely to reach a true consensus (Zollman, 2007). This counterintuitive finding occurs because, as connection density increases, the entire group is likely to be converted from the superior option by a chance wave of bad results. By contrast, the cycle promotes situations in which the group as a whole stays undecided for longer and there is at least one member collecting data on each option, a phenomenon Zollman (2010) calls “transient epistemic diversity.”

We find that these results only hold so long as agents are epistemically pure. Loosely, an epistemically impure agent, such as a pharmaceutical company, is an agent who attempts to encourage other doctors to use a drug irrespective of which drug is more efficacious. An epistemically impure agent only runs tests on their favored drug and samples from a biased distribution. So, if the actual probability of success is 56%, the pharmaceutical company’s reported data come from a binomial distribution with a success probability of 56% + b, where b is the strength of the bias. This is our attempt to capture, in our idealized model of medical epistemology, the fact that pharmaceutical manufacturers find numerous ways to subtly bias their results.⁴ We focus primarily on the “worst case scenarios” in which the pharmaceutical company promotes the inferior drug to all doctors.⁵

Briefly, the exact set-up is as follows. Agents are randomly assigned beliefs regarding the two available drugs. Doctors, as well as the biased agent, administer the drug they believe to be most efficacious.
Each agent thereby generates N data points and shares them with everyone they are connected to. Doctors then update their beliefs in a fashion outlined by Zollman (2010) and then repeat this process.

⁴ For a non-exhaustive list see Safer (2002).

⁵ Given manufacturers’ ability to organize “educational events” and fund “key opinion leaders,” this worst case scenario is a reasonable approximation of reality (Elliott, 2010; cf. Krimsky, 2003).

4. The Impossibility of Sustained Convergence to the Truth

Consider the case in which drug A is successful with probability 0.51 and the pharmaceutical’s drug (B) is slightly inferior (pB = .5). Assume all doctors begin with true beliefs regarding both drugs. Given this belief profile all will immediately begin administering A to their patients. We have convergence in the short run, but not in the long run. This is due to the bias of the pharmaceutical company (assume the bias is .03). Since the pharmaceutical company is the only one conducting research on B, they alone influence the doctors’ perceptions about it. Eventually, one of the doctors will “crossover” and begin to administer B. By doing so, she is now running her own unbiased experiment. This in turn helps to mitigate the influence the pharmaceutical has on everyone she is connected to, including herself. Thus if she and two of her neighbors all switch over to the pharmaceutical drug, the combined results of their experiments are sufficient to mitigate the influence of the pharmaceutical company. Yet when none of the doctors investigate B, the only information they receive about the drug, once again, comes from the biased pharmaceutical company. We now see why convergence to the superior drug for a sustained amount of time is impossible.

To try to quantify this effect we look to the last 1,000 rounds of a 2,000 round simulation and determine the proportion of doctors that experiment with the pharmaceutical drug.
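A simulation of this kind might be sketched as follows. It is a rough reconstruction under assumptions the text does not settle: beliefs are Beta distributions updated by simple counting, the strength of the initial true beliefs (`prior_weight`) and the number of patients per round (`n`) are illustrative values, the pharmaceutical company reports to every doctor, and all function names are hypothetical. It measures the share of doctor-rounds in the second half of the run that use the superior drug A, and is not guaranteed to reproduce the figures reported in this paper.

```python
import random


def binomial(rng, n, p):
    """Number of successes in n independent trials with success chance p."""
    return sum(rng.random() < p for _ in range(n))


def wheel(d):
    """Neighbor sets for a wheel: doctor 0 is the hub, 1..d-1 form the rim."""
    rim = list(range(1, d))
    nbrs = {0: set(rim)}
    for k, i in enumerate(rim):
        nbrs[i] = {0, rim[k - 1], rim[(k + 1) % len(rim)]}
    return nbrs


def complete(d):
    """Neighbor sets for the complete graph on d doctors."""
    return {i: set(range(d)) - {i} for i in range(d)}


def simulate(nbrs, p_a=0.51, p_b=0.50, bias=0.03, n=10,
             rounds=2000, prior_weight=100, seed=0):
    """Share of doctor-rounds in the second half of the run that use drug A."""
    rng = random.Random(seed)
    d = len(nbrs)
    # beliefs[i][drug] = [alpha, beta]; every doctor starts with true beliefs.
    beliefs = [{"A": [p_a * prior_weight, (1 - p_a) * prior_weight],
                "B": [p_b * prior_weight, (1 - p_b) * prior_weight]}
               for _ in range(d)]
    used_a = counted = 0
    for r in range(rounds):
        reports = []
        for i in range(d):
            mean = {k: a / (a + b) for k, (a, b) in beliefs[i].items()}
            drug = "A" if mean["A"] >= mean["B"] else "B"  # myopic choice
            reports.append((drug, binomial(rng, n, p_a if drug == "A" else p_b)))
        # The biased agent only ever tests B and inflates its success rate.
        pharma = ("B", binomial(rng, n, p_b + bias))
        for i in range(d):
            heard = [reports[i]] + [reports[j] for j in nbrs[i]] + [pharma]
            for drug, s in heard:  # every source counts equally
                beliefs[i][drug][0] += s
                beliefs[i][drug][1] += n - s
        if r >= rounds // 2:
            used_a += sum(drug == "A" for drug, _ in reports)
            counted += d
    return used_a / counted


print(simulate(wheel(6)), simulate(complete(6)))
```

Because every report is weighted equally, the only way the inflated record on B is corrected is for some doctor to switch to B and generate unbiased data, which is the cycle of crossover and return described above.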
We find that six doctors arranged on the wheel use the superior drug 42% of the time (see Figure 2). Interestingly, this number increases as we add more connections to the network. In the complete network, doctors utilize the superior drug 63% of the time. Thus, contra Zollman (2007), the more connections the more likely the network as a whole is to adopt the more efficacious treatment. The reasons for this should be obvious. When doctors are better connected to each other, fewer doctors have to spend their time debunking the biased results because the unbiased results are more widely broadcast.

Figure 2: Proportion of times the more efficacious drug is administered in the last 1,000 rounds (y-axis) as connectivity (x-axis) varies.

Though more connected networks provide a defense against intransigently biased agents, nothing short of eternal vigilance is required of the community: the community must constantly devote members to investigate the less successful drug. This keeps the biased agent at bay, but is surely a second-best solution.

We argue that what is primarily driving this phenomenon is the fact that experimental results from one agent are taken just as seriously as experimental results from another. If individual doctors could learn that the pharmaceutical is severely biased, they might begin to discount its results. Our model forces epistemic agents to listen to everyone they are connected to. If this assumption is relaxed, our doctor may learn to ignore the pharmaceutical. We now turn our attention to endogenous network formation and see that if individuals have some control over who they listen to, then for a wide variety of parameters, the pharmaceutical is unlikely to draw doctors away from the most efficacious drug.

5. Choosing Your Neighbors: Endogenous Network Formation

Modeling network formation is an active area of research in a number of disparate fields.⁶ Unfortunately, none of the canonical models can be appropriately applied to our epistemic community because our agents are continuously generating data. As in our earlier model, we say a doctor i is connected to doctor j if i is somehow influenced by the experimental findings of j. Network connections now vary continuously, and are no longer symmetric, meaning that i can be strongly connected to j, while j is only weakly connected to i. In this case, i is strongly influenced by j while j is only slightly influenced by i. Similar arrangements no doubt do occur, as when the work of a senior scientist is very influential on a junior scientist, but this influence is not reciprocal.

⁶ See Jackson (2005) for an overview.

In general, j strengthens her connection to i if i’s experimental findings are somehow in line with j’s subjective beliefs. Likewise, j weakens her connection to i the more i’s experimental findings seem to clash with j’s beliefs. Making this precise is difficult and highlights why many models of endogenous network formation used in economics and sociology are not applicable when thinking about our epistemic network. We instead present a novel model of endogenous network formation that replicates basic hypothesis testing inside an epistemic community of agents continuously engaged in experimentation.

Consider a network of D doctors. Each doctor has D+1 bins which initially have anywhere from zero to 100 balls in them. Let $B_i$ be the vector $\langle b_{i1}, b_{i2}, \ldots, b_{i,D+1} \rangle$, where $b_{i1}$ is the number of balls in agent i’s first bin. How strongly connected agent i is to agent j is determined by the proportion $b_{ij} / \sum_{k=1}^{D+1} b_{ik}$. This connectedness determines how much weight i puts on the experimental findings of j (call this $w_{ij}$). Agent i updates her beliefs regarding drug A in the following way:

\[ P(\text{drug A works} \mid \text{agent } j \text{ has } s \text{ successes in } N \text{ trials}) = \frac{\alpha + w_{ij}\, s}{\alpha + \beta + w_{ij}\, N} \]

where $\alpha$ and $\beta$ are the agent’s values from the previous round.
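As a rough illustration of the weighting scheme just described, the sketch below computes the weights from one agent's bins and applies the weighted belief update. The function names are hypothetical, and splitting the update into new values of α and β (adding the weighted successes to α and the weighted failures to β) is an assumption chosen so that the resulting expectation matches the formula above.

```python
def weights(bins):
    """Turn agent i's ball counts into weights w_ij that sum to one."""
    total = sum(bins)
    return [b / total for b in bins]


def update_belief(alpha, beta, w, successes, n):
    """Fold in j's report of `successes` out of `n`, discounted by weight w."""
    return alpha + w * successes, beta + w * (n - successes)


def expected_rate(alpha, beta):
    """Agent's current probability that the drug works."""
    return alpha / (alpha + beta)


# Agent i has three bins; the third source receives 60% of her trust.
w = weights([30, 10, 60])
alpha, beta = update_belief(2.0, 2.0, w[2], successes=6, n=10)
print(w, expected_rate(alpha, beta))
```

With a weight of zero the report leaves the belief unchanged, which is how an agent can come to ignore a source entirely.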
Ceteris paribus, the more balls agent i has in her jth bin, the more connected she is to agent j and thus the larger impact agent j has on i’s beliefs.

Individuals adjust their connections in the following fashion. Upon receipt of N data points from agent j, agent i conducts a one-sample t-test based on her subjective beliefs. Let $t_{ij}$ be the t-score agent i assigns to agent j’s experimental results in round r. The number of balls in $b_{ij}$ is then updated by the following equation:

\[ b_{ij}(r+1) = b_{ij}(r) + f(t_{ij}), \quad \text{where } f(x) = \Lambda\,(1.96 - |x|) \text{ and } \Lambda > 0, \]

where $b_{ij}(r)$ is the number of balls in agent i’s jth bin at round r. Thus a t-score with an absolute value less than 1.96 results in an increase in the number of balls in the bin, while a t-score with an absolute value exceeding 1.96 results in a decline. How the strength of connection to j is affected can of course only be determined if we take into account the change in all bins. One intuitive property this update rule satisfies is the following: if you are connected to two individuals and they repeatedly provide you the same evidence then in the long run you should expect to be equally connected to these two individuals. One’s initial connectivity “washes out” in the end.

We find that the inclusion of network formation has drastic effects. One common outcome is for all doctors in the community to heavily discount the pharmaceutical’s experimental data. In this case, none of the doctors administer the inferior pharmaceutical drug and all have minimal connections to the drug company. The biased agent is effectively squelched, thereby allowing doctors to converge on the superior drug.

Less desirable arrangements are also possible. In some scenarios a minority of agents listen to both their fellow doctors and the pharmaceutical company. The level of connection these agents have to the company does not completely dissipate because the company had a hand in shaping their perception of the drug.
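The bin update rule from this section can be sketched as follows. The text does not spell out how the t-score is computed, so this sketch assumes a standard one-sample t statistic on N binary outcomes, tested against the agent's currently believed success rate; flooring the ball count at zero is likewise an assumption, and the function names and the default value of `lam` (standing in for Λ) are hypothetical.

```python
import math


def t_score(successes, n, believed_rate):
    """One-sample t statistic for n binary outcomes against a believed mean."""
    p_hat = successes / n
    var = p_hat * (1 - p_hat) * n / (n - 1)  # unbiased sample variance of 0/1 data
    if var == 0:
        # All outcomes identical: no spread to scale by.
        if p_hat == believed_rate:
            return 0.0
        return math.copysign(math.inf, p_hat - believed_rate)
    return (p_hat - believed_rate) / math.sqrt(var / n)


def update_bin(balls, t, lam=1.0):
    """b_ij(r+1) = b_ij(r) + lam * (1.96 - |t|), floored at zero (assumption)."""
    return max(0.0, balls + lam * (1.96 - abs(t)))


# A report running 8 points above a believed rate of 0.5, at two sample sizes.
for successes, n in [(58, 100), (290, 500)]:
    t = t_score(successes, n, 0.5)
    print(n, round(t, 2), update_bin(10.0, t))
```

Under these assumptions the same 8-point gap passes unnoticed at N = 100 (the bin grows) but is flagged at N = 500 (the bin shrinks), which illustrates why larger samples make a given bias easier to detect.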
The company’s biased experimental results are thus not seen as particularly unusual, since they are in some sense already reflected in these doctors’ subjective beliefs. By and large, however, a fluid network helps the community better identify the superior drug. For example, in the fixed network with pA = .51, pB = .50 and b = .08, doctors almost never come to prescribe the superior drug. In contrast, the superior drug is prescribed 80% of the time in the fluid network. In general, dynamic networks are much more resistant to the influence of intransigently biased agents than static networks and Figure 3 drives this point home quite nicely.

Two variables are primarily responsible for ensuring that the more effective drug is taken up by the population: N and b. As N increases, the community becomes more likely to converge on the better drug. Surprisingly, convergence on the superior drug is also more probable when the company is highly biased. All else being equal, if the bias is outlandish, then even a small number of trials will be able to alert the community that something is awry. Introducing biased data can influence honest agents, but lies have to be subtle enough to go undetected. In a dynamic network, agents can simply stop listening if the bias becomes apparent.

Figure 3: Proportion of times the more efficacious drug is administered in the last 1,000 rounds (y-axis) for different levels of pharmaceutical bias, ranging from zero to .15 (x-axis) as well as various levels of N. Legend: Static network; Dynamic, 500; Dynamic, 250; Dynamic, 100; Dynamic, 50.

6. The Problem of Intransigently Biased Agents and Epistemic Clarity

The problem posed by intransigently biased agents can be alleviated if agents learn to identify and trust good informants. We have seen this is not possible in a static network, since by decree individuals cannot come to ignore their neighbors, thereby allowing a biased agent to mislead the community.
Furthermore, Zollman’s finding that “in small finite groups, the best graphs are minimally connected,” (2013, p.25) fails to obtain with the introduction of biased agents. Instead of promoting a virtuous transient epistemic diversity, the lack of communication forces sparsely connected agents to duplicate the debunking work—if they are able to resist the biased agent at all. The introduction of our network formation rule yields desirable results. While other update rules may be superior, this simple rule prevents agents from being manipulated by a highly biased pharmaceutical. It creates a point at which increasing the bias in one’s results merely makes it easier to be identified as untrustworthy. Even in cases where the pharmaceutical company retains some influence with most doctors, groups virtually never converge to the wrong drug, and under most circumstances reviewed here, prescribe the right drug more often than not. Indeed, one common result is that every doctor gives no weight to the pharmaceutical company and roughly equal weight to everyone else. Returning to the DES case, doctors most closely approximate agents in the fixed wheel. Each doctor was in contact with a limited number of colleagues, but maintained contact with the pharmaceutical company via advertisements, the PDR, and interactions with detailers. Thus, despite the experimental evidence, elite opinion, and the doctor’s own experiences, use of DES continued apace. The models considered here suggest two possible responses: increase the number of connections or learn to ignore biased agents. It might be suggested that doctors could also have learned to pay attention to research as is currently recommended by the Evidence-Based Medicine movement. Note however that this just pushes the problem back. As doctors have become more influenced by research, pharmaceutical companies have spent an increasing amount of their marketing budget on biased trials (Angell, 2004). 
A number of meta-analyses have found a large correlation between positive results and industry funding (Bekelman, Li, & Gross, 2003). Rochon et al. (1994) found that 56/56 comparison trials funded by manufacturers of nonsteroidal anti-inflammatory drugs for arthritis concluded the funder’s product was as good or better than the comparison drug. While this was particularly egregious, it is estimated that between 89% and 98% of trials yield results favorable to the company that funded the research (Cho and Bero, 1996).

Given the severity of this problem some commentators have suggested that pharmaceutical companies be prohibited from conducting such research. An alternative to such a fundamental change in the structure of scientific practice is to better exercise epistemic discrimination. Though it is rare, official bodies have occasionally considered devaluing the epistemic weight accorded to industry funded studies, a proposal the British advisory agency NICE considered, but ultimately rejected. The present analysis suggests that something like our network formation rule may be preferable to the current practice of treating all equally well-designed trials as equivalent regardless of their source.

Alexander, Jason. 2013. Preferential Attachment and the Search for Successful Theories. Philosophy of Science 80: 269-282.

Bamigboye, Anthony, and Jonathan Morris. 2003. Oestrogen supplementation, mainly diethylstilbestrol, for preventing miscarriages and other adverse pregnancy outcomes. Cochrane Database of Systematic Reviews 3: Art. No.: CD004353.

Bekelman, Justin, Yan Li, and Cary Gross. 2003. Scope and impact of financial conflicts of interest in biomedical research. JAMA 289: 454-65.

Cho, Mildred, and Lisa Bero. 1996. The quality of drug studies in symposium proceedings. Annals of Internal Medicine 124: 485–489.

Dutton, Diana. 1988. Worse than the Disease: Pitfalls of Medical Progress. Cambridge: Cambridge University Press.

Elliott, Carl. 2010. White Coat Black Hat: Adventures on the Dark Side of Medicine. Boston: Beacon Press.

Elliott, Kevin. 2011. Is a Little Pollution Good for You?: Incorporating Societal Values in Environmental Research. New York: Oxford University Press.

French, John. 1956. A Formal Theory of Social Power. Psychological Review 63: 181–94.

Goldman, Alvin. 2011. “Systems Oriented Social Epistemology.” In Social Epistemology: Essential Readings, eds. Alvin Goldman and Dennis Whitcomb, 11-37. Oxford: Oxford University Press.

Jackson, Matthew. 2005. “A Survey of Models of Network Formation: Stability and Efficiency.” In Group Formation in Economics: Networks, Clubs and Coalitions, eds. Gabrielle Demange and Myrna Wooders, 11-57. Cambridge: Cambridge University Press.

Krimsky, Sheldon. 2003. Science in the private interest: Has the lure of profits corrupted medical research? Lanham: Rowman and Littlefield.

Langston, Nancy. 2010. Toxic Bodies. New Haven: Yale University Press.

Meyers, Robert. 1983. DES: The Bitter Pill. New York: Seaview/Putnam.

Oreskes, Naomi and Erik Conway. 2010. Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming. New York: Bloomsbury Press.

Rochon, Paula, Jerry Gurwitz, Robert Simms, Paul Fortin, David Felson, Kenneth Minaker, and Thomas Chalmers. 1994. A study of manufacturer-supported trials of nonsteroidal anti-inflammatory drugs in the treatment of arthritis. Archives of Internal Medicine 154: 157-163.

Safer, Daniel. 2002. Design and Reporting Modifications in Industry-Sponsored Comparative Psychopharmacology Trials. Journal of Nervous and Mental Disease 190: 583–592.

Zollman, Kevin. 2007. The Communication Structure of Epistemic Communities. Philosophy of Science 74: 574–87.

Zollman, Kevin. 2010. The Epistemic Benefit of Transient Diversity. Erkenntnis 72: 17–35.

Zollman, Kevin. 2013. Network Epistemology: Communication in Epistemic Communities. Philosophy Compass 8: 15–27.