On the Significance of the Absolute Margin [1]

Christian List

7 November 2002

Abstract. Consider the hypothesis H that a defendant is guilty (a patient has condition C), and the evidence E that a majority of h out of n independent jurors (diagnostic tests) have voted for H and a minority of k:=n-h against H. How likely is the majority verdict to be correct? By a formula of Condorcet, the probability that H is true given E depends only on each juror’s competence and on the absolute margin between the majority and the minority h-k, but neither on the number n, nor on the proportion h/n. This paper reassesses that result and explores its implications. First, using the classical Condorcet jury model, I derive a more general version of Condorcet’s formula, confirming the significance of the absolute margin, but showing that the probability that H is true given E depends also on an additional parameter: the prior probability that H is true. Second, I show that a related result holds when we consider not the degree of belief we attach to H given E, but the degree of support E gives to H. Third, I address the implications for the definition of special majority voting, a procedure used to capture the asymmetry between false positive and false negative decisions. I argue that the standard definition of special majority voting in terms of a required proportion of the jury is epistemically questionable, and that the classical Condorcet jury model leads to an alternative definition in terms of a required absolute margin between the majority and the minority. Finally, I show that the results on the significance of the absolute margin can be resisted if the so-called assumption of symmetrical juror competence is relaxed.

1. Introduction

Suppose there are two states of the world: x = 1 (e.g. the defendant is guilty), and x = 0 (e.g. the defendant is innocent).
An n-member jury has to decide whether or not to convict the defendant. The jury’s aim is to ‘track the truth’, i.e. to find the defendant guilty if and only if the defendant is in fact guilty. Given the state of the world x, each juror has the same probability (competence) p > ½ of voting for x. The value of p is the same for x = 1 and x = 0. Further, the votes of different jurors are independent from each other, conditional on the state of the world. This is the classical Condorcet jury model (see, amongst many others, Grofman, Owen and Feld 1983).

Although the model is usually interpreted in terms of jury decisions, it also applies to other situations. It applies whenever the aim is to ‘track’ the true state of the world on the basis of multiple independent identically distributed signals, where each signal is not perfectly reliable but biased towards the truth. For example, suppose a doctor has to determine whether or not a patient has a particular condition C. There are two states of the world: x = 1 (the patient has condition C), and x = 0 (the patient does not have condition C). The doctor performs a sequence of diagnostic tests, where the different tests (possibly repetitions of the same test) are independent from each other, but all have the same probability p > ½ of producing verdict x, given the state of the world x. [2]

[1] I wish to express my gratitude to Robert Goodin and David Estlund for first drawing my attention to the questions addressed in this paper; and to Luc Bovens, Franz Dietrich, Branden Fitelson and Iain McLean for very helpful discussions. Earlier versions of this paper were presented at seminars at Nuffield College, Oxford (January 2002), at the Department of Philosophy, Logic and Scientific Method at the London School of Economics (March 2002), and at an International Summer School on Philosophy and Probability, held at the University of Konstanz (September 2002). I am grateful to the participants at these occasions for comments and discussion. Address for correspondence: Christian List, Nuffield College, Oxford OX1 1NF, U.K.; E-mail christian.list@nuffield.oxford.ac.uk.

Consider two situations:

Situation A. In a 12-member jury (a sequence of 12 tests), 12 jurors (tests) vote for ‘guilty’ (‘condition C’) and 0 jurors (tests) vote for ‘innocent’ (‘no condition C’).

Situation B. In a 1000-member jury (a sequence of 1000 tests), 507 jurors (tests) vote for ‘guilty’ (‘condition C’) and 493 jurors (tests) vote for ‘innocent’ (‘no condition C’).

In both situations p is the same, and we attach the same prior probability r to the proposition that the defendant is guilty (the patient has condition C) (where 0 < r < 1). In which of the two situations is the given majority verdict more likely to be correct? More precisely, which of the following two probabilities is the greater one: (i) the probability that the defendant is guilty (the patient has condition C), given that we have situation A, or (ii) the probability that the defendant is guilty (the patient has condition C), given that we have situation B? In situation A there is a 100% majority for ‘guilty’ (or ‘condition C’) (12 out of 12 jurors), whereas in situation B there is only a 50.7% majority for ‘guilty’ (or ‘condition C’) (507 out of 1000 jurors). This might lead us to think that probability (i) is greater than probability (ii).

However, according to a formula by Condorcet, if a majority of h jurors have voted for x and a minority of k jurors against x, the probability that the verdict of the majority is correct, i.e. that x is the true state of the world, is

$$\frac{p^{h-k}}{p^{h-k} + (1-p)^{h-k}}.$$

Condorcet's formula has a striking implication, as summarized by McLean and Hewitt (1994, p. 37): "... the probability of [a correct majority judgment] can be improved by increasing h-k. Note h minus k, not h+k nor (h-k)/(h+k).
What matters is the absolute size of the majority, not the size of the electorate, nor the proportion of the majority size to electorate size. If the jury theorem is applicable, we should talk about 'a majority of 8', 'a majority of 20', etc., not 'a two-thirds majority' or 'a three-quarters majority'. Condorcet did so in most of his later work."

[2] The assumption that the value of p is the same for x=0 and x=1 is a strong simplification here. As discussed in more detail below, there are situations in which it is plausible to assume that p depends on x.

In the example of situations A and B above, since the absolute margin between the majority and the minority in situation B (507-493=14) is greater than the one in situation A (12-0=12), it follows that probability (ii) is greater than probability (i).

This paper reassesses this result and explores its implications. Let me summarize my argument.

First, I derive a more general version of Condorcet's formula, by applying Bayes's theorem to the classical jury model. I show that Condorcet's formula is a simplification which leaves out an important complication. According to Condorcet’s formula, the probability that the verdict of the majority (for x) is correct, given the size of the majority, is a function of two parameters: the absolute margin between the majority and the minority (h-k), and the competence of each juror (p). I show that it is in fact a function of three parameters, the third parameter being the prior probability (r) that x is the true state of the world.

Second, I show that the more general formula still confirms Condorcet’s basic insight.
If we fix the prior probability r, the posterior probability that x is the true state of the world given the size of the majority depends only on the absolute margin between the majority and the minority (h-k) and is an increasing function of that margin. [3] The posterior probability that x is the true state of the world is invariant under changes of h and k (and consequently of h/n) that leave h-k fixed. I show that a similar result holds when we consider not the degree of belief we attach to the hypothesis that x is the true state of the world, given the evidence that a majority of h out of n jurors have voted for x, but the degree of support the evidence gives to the hypothesis.

Third, I argue that the result has an implication for how special majority voting should be defined from the classical Condorcet jury perspective. In many jury decisions, special majorities of at least 10 out of 12 jurors are required for a ‘guilty’ verdict. Special majority voting is usually defined in terms of the proportion of the jury required for a positive decision (e.g. conviction). Suppose we use special majority voting for ‘epistemic’ reasons, i.e. because of a concern for tracking a true state of the world. [4] Then the standard definition of special majority voting not only has the wrong focus, but may even be counterproductive, given the assumptions of the classical Condorcet jury model. Alternatively, special majority voting can be defined in terms of the absolute margin between the majority and the minority required for a positive decision. I show that this alternative definition is ‘epistemically’ sound, and that it should thus be recommended from the perspective of the classical Condorcet jury model.
It follows that, if we nonetheless want to defend special majority voting under the standard definition, we must either defend it for reasons other than ‘epistemic’ ones or reject the classical Condorcet jury model and find an alternative model that avoids the present results on the significance of the absolute margin.

Finally, I show that such an alternative model can be obtained by relaxing the assumption of symmetrical competence. Specifically, suppose each juror's probability of voting for 'guilty' given guilt ($p_1$) differs from the probability of voting for 'not guilty' given innocence ($p_0$). Then it is no longer true that the probability that the defendant is guilty given the size of the majority for 'guilty' depends only on the absolute margin between the majority and the minority (h-k). [5] For any fixed absolute margin h-k, that probability is now a monotonic function of the total number of jurors n. If $p_1 < p_0$, it is an increasing function, which converges to 1 as n tends to infinity. If $p_0 < p_1$, it is a decreasing function, which converges to 0 as n tends to infinity.

[3] Provided that p > ½.

[4] For a discussion of epistemic and procedural justifications of decision procedures, see, amongst others, Cohen (1986); Dahl (1979); Coleman and Ferejohn (1986); Estlund (1993, 1997); List and Goodin (2001).

2. The classical Condorcet jury model and the Condorcet jury theorem

We use the labels 1, 2, …, n to denote the n jurors. We represent the state of the world by a binary variable X which takes the value 1 for ‘guilty’ and 0 for ‘not guilty’. We represent the votes of the jurors by the binary random variables $V_1, V_2, \ldots, V_n$, where each $V_i$ takes the value 1 for a ‘guilty’ vote and 0 for a ‘not guilty’ vote. The vote of juror i is correct if and only if the value of $V_i$ coincides with the value of X. Capital letters are used to denote random variables and small letters to denote particular values. The classical Condorcet jury model assumes:

Competence.
For all jurors i = 1, 2, …, n, $p_1 := P(V_i=1 \mid X=1) > \tfrac{1}{2}$ and $p_0 := P(V_i=0 \mid X=0) > \tfrac{1}{2}$.

Symmetrical competence. We have $p_1 = p_0 =: p$.

Independence. For each x∈{0, 1}, $V_1, V_2, \ldots, V_n$ are independent from each other, given the state of the world x.

The model’s most famous implication is the Condorcet jury theorem. Given the state of the world x, the probability that a majority of jurors will vote for x is greater than the probability that a majority will vote against x, and the first of these two probabilities converges to 1 as the number of jurors tends to infinity. [6] In the medical example, the Condorcet jury theorem can be interpreted as follows. Given the state of the world x, the probability that a majority of tests will produce verdict x is greater than the probability that a majority of tests will produce the opposite verdict, and the first of these two probabilities converges to 1 as the number of tests tends to infinity. [7]

[5] Holding the competence parameters $p_0$ and $p_1$ and the prior probability r fixed.

[6] In the same model, if p < 1/2, the probability that a majority of jurors will vote for x, given the state of the world x, converges to 0 as the number of jurors tends to infinity.

[7] The result is robust to certain relaxations of the assumptions. A version of it still holds in certain cases where different jurors have different competence levels, but where the average competence is greater than ½ (e.g. Grofman, Owen and Feld 1983; Borland 1989), and in cases where there are certain dependencies between different jurors’ votes (ibid.; Ladha 1992; Estlund 1994; but see Dietrich and List 2002). See also Hawthorne (2001). We see below that the Condorcet jury theorem itself does not require the assumption of symmetrical competence, whereas Condorcet’s formula on the significance of the absolute margin does.

Let us state the Condorcet jury theorem more formally. For each x∈{0, 1}, let $N_x := |\{i \in \{1, 2, \ldots, n\} : V_i = x\}|$. Then $N_x$ is the random variable whose value is the number of jurors voting for x (where x is 0 or 1). Here $N_x > n/2$ means that there is a simple majority for x. Since $N_x$ is binomially distributed (with parameters n and p), we have:

Lemma 1.
(a) For each h = 0, 1, 2, …, n,
$$P(N_x = h \mid X=x) = \binom{n}{h} p^h (1-p)^{n-h};$$
(b)
$$P(N_x > n/2 \mid X=x) = \sum_{h>n/2} \binom{n}{h} p^h (1-p)^{n-h}.$$

Theorem 1. (Condorcet jury theorem) For each x∈{0, 1}, $P(N_x > n/2 \mid X=x)$ converges to 1 as n tends to infinity.

Thus the Condorcet jury theorem concerns

(iii) the probability that x is the verdict of a majority of jurors, given that the state of the world is x.

Determining this probability is useful for assessing the epistemic properties of majority voting from a ‘global’ perspective. A decision procedure (such as simple majority voting) tracks the truth if the following two subjunctive conditionals are true:
• If x = 1 were the true state of the world, then x = 1 would be chosen.
• If x = 0 were the true state of the world, then x = 0 would be chosen. [8]

For any given values of n and p, the Condorcet jury framework allows us to determine the probability that x=1 is chosen under simple majority voting, given that x=1 is the true state of the world, and the probability that x=0 is chosen, given that x=0 is the true state of the world. If both probabilities are close to 1, then this suggests that simple majority voting performs well at ‘tracking the truth’. Motivated by this consideration, we say that a decision procedure tracks the truth in the limit if it satisfies the following condition:

Truth-tracking in the limit (T).
• P(1 is chosen|X=1) converges to 1 as n tends to infinity, and
• P(0 is chosen|X=0) converges to 1 as n tends to infinity.

By the Condorcet jury theorem, for each x∈{0, 1}, the probability that x is chosen under simple majority voting, given that x is the true state of the world, converges to 1 as n tends to infinity. Simple majority voting thus satisfies condition (T), given the assumptions of the classical Condorcet jury framework.
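As a small numerical illustration of lemma 1(b) and theorem 1, the following Python sketch evaluates the probability of a correct simple majority exactly for a few jury sizes. The function name `majority_prob`, the competence value p = 0.6 and the jury sizes are illustrative assumptions, not part of the model.

```python
from math import comb


def majority_prob(n: int, p: float) -> float:
    """P(N_x > n/2 | X = x) for n independent jurors with competence p (lemma 1(b))."""
    # A strict majority needs at least n // 2 + 1 votes, for odd and even n alike.
    return sum(
        comb(n, h) * p**h * (1 - p) ** (n - h)
        for h in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # Illustrative (assumed) competence and jury sizes.
    for n in (1, 11, 101, 501):
        print(n, majority_prob(n, 0.6))
```

For p > ½ the printed values grow towards 1 as n increases, which is the behaviour theorem 1 describes. The exact sum is used rather than a simulation so that the result does not depend on random sampling.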
Although probability (iii) is useful for assessing the epistemic properties of majority voting from a ‘global’ perspective, it is only of limited use for the epistemic problem we are faced with ‘locally’, for instance when we put ourselves into the perspective of a court or doctor or when we compare situations A and B above. Probability (iii) is conditional on the state of the world, and that state of the world is precisely the unobserved parameter we typically want to estimate. What we can observe is whether a majority of jurors have voted for or against x (and how large that majority is). So what we are interested in from a ‘local’ perspective is not probability (iii) but the converse conditional probability, namely

(iv) the probability that the state of the world is x, given that x is the verdict of a majority of jurors.

The hypothesis that we want to test is that the state of the world is x, and the evidence is that a majority of jurors (or, specifically, h out of n jurors) have voted for x. In this language, (iii) is the probability of the evidence given that the hypothesis is true, whereas (iv) is the probability that the hypothesis is true given the evidence. Probabilities (i) and (ii) in section 1 are both instances of (iv), not of (iii).

[8] This definition of truth-tracking is motivated by Nozick's definition of knowledge in terms of truth-tracking. See Nozick (1981).

3. The significance of the absolute margin for the degree of belief we attach to the hypothesis given the evidence

Our hypothesis H is that X=x. We attach the prior probability r:=P(X=x) to the truth of H, where 0 < r < 1. In the case of a jury decision, r might be the (typically low) probability that a randomly chosen member of the population is guilty of the relevant charge. We need to specify what evidence we use to test H. The posterior probability of H given the evidence depends on the informational content of the evidence.
We might use evidence of the following kinds:

E : A majority of precisely h out of n jurors have voted for x, i.e. $N_x = h$, where h > n/2.

E* : A majority of jurors have voted for x, i.e. $N_x > n/2$.

Equations (a) and (b) in lemma 1 above give us P(E|H) and P(E*|H), respectively. Let ¬H denote the negation of the hypothesis. By Bayes’s theorem, we have

$$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)} = \frac{P(H)\,P(E \mid H)}{P(H)\,P(E \mid H) + P(\neg H)\,P(E \mid \neg H)},$$

and a similar result holds for P(H|E*). Using Bayes’s theorem, we can thus derive P(H|E) and P(H|E*) from equations (a) and (b) in lemma 1. The derivation of P(H|E*) (theorem 2) is straightforward; the derivation of P(H|E) (theorem 3) is given formally in the appendix.

Theorem 2. For each x∈{0, 1},

$$P(H \mid E^*) = P(X=x \mid N_x > n/2) = \frac{r \sum_{h>n/2} \binom{n}{h} p^h (1-p)^{n-h}}{r \sum_{h>n/2} \binom{n}{h} p^h (1-p)^{n-h} + (1-r) \sum_{h>n/2} \binom{n}{h} (1-p)^h p^{n-h}}.$$

An implication worth noting is the following. In the special case where n is odd and r = ½, we have

$$P(X=x \mid N_x > n/2) = \sum_{h>n/2} \binom{n}{h} p^h (1-p)^{n-h},$$

i.e. $P(X=x \mid N_x > n/2) = P(N_x > n/2 \mid X=x)$, and thus P(H|E*) = P(E*|H).

Theorem 3. Suppose h > n/2. For each x∈{0, 1},

$$P(H \mid E) = P(X=x \mid N_x = h) = \frac{r\,p^m}{r\,p^m + (1-r)(1-p)^m} = \frac{r}{r + (1-r)(1/p - 1)^m},$$

where m = 2h-n.

Note that E contains more information than E*; E implies E*, whereas the converse does not hold. To test hypothesis H it is desirable to use as much evidence as we have, and hence we will now be concerned with evidence of the kind E rather than the kind E*.

By theorem 3, if p > ½, $P(H \mid E) = P(X=x \mid N_x = h)$ is an increasing function of m. Here m is precisely the absolute margin between the majority of jurors who have voted for x (h) and the minority who have voted against x (k:=n-h).
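The following Python sketch checks theorem 3 numerically on situations A and B from section 1. It computes the posterior once from the margin formula and once by applying Bayes's theorem directly to the binomial likelihoods of lemma 1(a). The function names and the values p = 0.6 and r = 0.5 are assumptions chosen for illustration only.

```python
from math import comb


def posterior_margin(m: int, p: float, r: float) -> float:
    """Theorem 3: P(X=x | N_x=h) as a function of the margin m = 2h - n."""
    return r / (r + (1 - r) * (1 / p - 1) ** m)


def posterior_bayes(h: int, n: int, p: float, r: float) -> float:
    """The same posterior, from Bayes's theorem and lemma 1(a)."""
    like_h = comb(n, h) * p**h * (1 - p) ** (n - h)      # P(E | H)
    like_not_h = comb(n, h) * (1 - p) ** h * p ** (n - h)  # P(E | not-H)
    return r * like_h / (r * like_h + (1 - r) * like_not_h)


if __name__ == "__main__":
    p, r = 0.6, 0.5  # assumed competence and prior
    print("A:", posterior_bayes(12, 12, p, r), posterior_margin(12, p, r))
    print("B:", posterior_bayes(507, 1000, p, r), posterior_margin(14, p, r))
```

Under these assumed values the two routes agree up to floating-point error, and situation B (margin 14) yields a higher posterior than situation A (margin 12), even though the majority in B is only 50.7%.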
Therefore, for any fixed prior probability r that the hypothesis H is true, the posterior probability that H is true given the evidence E depends only on the absolute margin between the majority (for x) and the minority (against x) and is an increasing function of that margin. In particular, $P(X=x \mid N_x = h)$ is invariant under any changes of n and h that preserve m. Condorcet’s own formula is a special case of the formula in theorem 3 for r = ½.

The posterior probability P(H|E) captures the degree of belief we assign to the hypothesis H after seeing the evidence E. Of course, P(H|E) depends on the prior probability r=P(H) we assign to the hypothesis before seeing the evidence. However, with regard to the evidence E itself, all that matters is the absolute margin between the majority of jurors voting for x and the minority voting against x (and the competence parameter p). This confirms Condorcet’s basic insight.

4. The significance of the absolute margin for the degree of support the evidence gives to the hypothesis

The main result of the previous section suggests that the absolute margin has a special significance for testing the hypothesis H on the basis of the evidence E. Is this result an artefact of our particular method of testing H by considering the degree of belief we assign to H given E? Or does a similar result hold if we use a different method of testing H on the basis of E?

Recent work in Bayesian confirmation theory has advocated the use of the likelihood ratio [9] as a measure of the degree of support some evidence E gives to some hypothesis H (Royall 1997, Fitelson 2001). The likelihood ratio is defined as

$$l(H, E) := \frac{P(E \mid H)}{P(E \mid \neg H)}.$$

Unlike the probability P(H|E) discussed in section 3, the likelihood ratio does not depend on the prior probability we assign to the hypothesis H, nor does it refer to the degree of belief we attach to H given the evidence E.
Nonetheless, the likelihood ratio has some implications for how the prior and posterior probabilities of H are related. It has the following property:

$$l(H, E) \begin{cases} > 1 & \text{if } P(H \mid E) > P(H), \\ = 1 & \text{if } P(H \mid E) = P(H), \\ < 1 & \text{if } P(H \mid E) < P(H). \end{cases}$$

In other words, the likelihood ratio is greater than 1 if observing the evidence E would increase our degree of belief in H, it equals 1 if observing E would leave our degree of belief in H unchanged, and it is less than 1 if observing E would decrease our degree of belief in H, regardless of what degree of belief we actually attach to H. Moreover, let R(H) := P(H)/P(¬H) be the ratio of our degree of belief in H to our degree of belief in the negation of H before observing E, and let R(H|E) := P(H|E)/P(¬H|E) be the corresponding ratio after observing E. Then the likelihood ratio satisfies

$$R(H)\, l(H, E) = R(H \mid E).$$

In other words, the likelihood ratio can be interpreted as the factor by which observing the evidence E would change the ratio of our degree of belief in H to our degree of belief in the negation of H, regardless of what that ratio is. Further, by Bayes’s theorem, the likelihood ratio also satisfies

$$P(H \mid E) = \frac{P(H)}{P(H) + P(\neg H) \cdot \frac{1}{l(H, E)}}.$$

Fitelson (2001) has shown that the (logarithm of the) likelihood ratio has some other properties that make it particularly suitable as a measure of the incremental support some evidence gives to some hypothesis. [10]

[9] Or the logarithm of the likelihood ratio.

[10] Specifically, Fitelson (2001) shows that the logarithm of the likelihood ratio satisfies the Peirceian additivity condition, the negation symmetry condition and the urn condition.

So, in terms of the likelihood ratio, let us ask what degree of support the evidence E in our model (i.e. a majority of precisely h out of n jurors have voted for x) gives to the hypothesis H (i.e. the true state of the world is x). The following theorem is proved in the appendix:

Theorem 4. Suppose h > n/2. Then

$$l(H, E) = \left(\frac{p}{1-p}\right)^m,$$

where m = 2h-n.
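To illustrate theorem 4 together with the odds relation R(H) l(H, E) = R(H|E), here is a minimal Python sketch. The function names and the numerical values (p = 0.6, r = 0.001, m = 14) are assumptions made for the example.

```python
def likelihood_ratio(m: int, p: float) -> float:
    """Theorem 4: l(H, E) = (p / (1 - p))^m, with m = 2h - n."""
    return (p / (1 - p)) ** m


def odds(prob: float) -> float:
    """Ratio of a probability to the probability of its negation."""
    return prob / (1 - prob)


def posterior_from_lr(r: float, lr: float) -> float:
    """P(H|E) = P(H) / (P(H) + P(not-H) / l(H, E))."""
    return r / (r + (1 - r) / lr)


if __name__ == "__main__":
    p, r, m = 0.6, 0.001, 14  # assumed values for illustration
    lr = likelihood_ratio(m, p)
    post = posterior_from_lr(r, lr)
    # Prior odds times the likelihood ratio should equal posterior odds.
    print(lr, post, odds(r) * lr, odds(post))
```

The likelihood ratio depends only on m and p, whereas the posterior still depends on the assumed prior r. This mirrors the distinction between degree of support and degree of belief drawn in the text.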
Theorem 4 shows that, like the posterior probability P(H|E), the likelihood ratio l(H, E) is an increasing function of m, where m is the absolute margin between the majority of jurors who have voted for x (h) and the minority who have voted against x (k:=n-h). [11] In particular, like P(H|E), l(H, E) is invariant under any changes of n and h that preserve m. So, if we determine the degree of support the evidence E gives to the hypothesis H, again all that matters is the absolute margin between the majority of jurors voting for x and the minority voting against x. [12] This further supports the claim that in the classical Condorcet jury model the absolute margin between the majority and the minority has a special significance for testing the hypothesis H on the basis of the evidence E. In theorem 10 in the appendix, it is shown that a similar result holds for two other methods of measuring the degree of support E gives to H, namely for the so-called difference and ratio measures.

5. An implication for the definition of special majority voting

Let us fix x (where x=1 or x=0), and consider again the hypothesis H that X=x (e.g. the defendant is guilty; the patient has condition C). Suppose the aim is to make a positive decision (e.g. to convict the defendant, or to treat the patient) if and only if H is true. There are two possible types of error:
• False positives: a positive decision is made even though H is false.
• False negatives: a negative decision is made even though H is true.

In many decision problems, there is an asymmetry between false positives and false negatives. Often one type of error is considered worse than the other. In a jury decision, false positives are usually considered worse than false negatives: it is considered worse to convict the innocent than to acquit the guilty.
In a medical context, by contrast, false negatives are sometimes worse than false positives: it may be worse to fail to treat an ill person than to mistakenly treat a healthy one (provided that the treatment has no negative side-effects). We may therefore look for a decision procedure that respects that asymmetry, and that makes the “weightier” decision (that decision with respect to which an error is worse) only if the correctness of the decision is beyond any reasonable doubt. Many decision making bodies use the method of special (as opposed to simple) majority voting for this purpose. [13] Special majority voting is usually defined in terms of the proportion of the jury, e.g. 2/3 or 5/6, that is required for a positive decision, where that proportion is strictly greater than ½. The formal definition is the following:

A proportion rule with parameter $q_{\min}$. For any n, a positive decision is made if and only if the number of votes for a positive decision divided by the total number n exceeds (alternatively: is at least) $q_{\min}$.

The limiting case $q_{\min}$ = 1/2 is the case of simple majority rule. A proportion rule where the parameter $q_{\min}$ is strictly greater than ½ makes it harder to reach a positive decision (e.g. to convict) than a negative one (e.g. to acquit). In a minimal sense, the rule therefore appears to respect the asymmetry between the two possible types of error. But I will also consider an alternative definition of special majority voting. The alternative definition focuses not on the proportion of the majority in the jury, but rather on the absolute margin between the majority and the minority.

An absolute margin rule with parameter $m_{\min}$. For any n, a positive decision is made if and only if the difference between the number of votes for a positive decision and the number of votes against a positive decision exceeds (alternatively: is at least) $m_{\min}$.

The limiting case $m_{\min}$ = 0 is the case of simple majority rule. For any fixed number of jurors n, the two definitions of special majority voting can be made equivalent by setting $q_{\min} := \frac{1}{2}(m_{\min}/n + 1)$. What distinguishes the two definitions is that each definition holds its relevant parameter ($q_{\min}$ for the proportion rule and $m_{\min}$ for the absolute margin rule) fixed for all values of n. The condition for a positive decision under the proportion rule is that a proportion exceeding (or at least) $q_{\min}$ (e.g. 2/3) of the jury supports a positive decision, regardless of whether this corresponds to 8 out of 12 jurors or to 667 out of 1000. The condition for a positive decision under the absolute margin rule, by contrast, is that the difference between the number of votes for a positive decision and the number of votes for a negative decision is greater than (or at least) $m_{\min}$ (e.g. 12), regardless of whether this corresponds to 12 out of 12 jurors or to 506 out of 1000.

Now the crucial question is this: Which of the two types of special majority rules is more suitable for respecting the asymmetry between false positives and false negatives, and for making decisions that track the truth in the limit? In subsection 5.1 we address the question about respecting the asymmetry, and in subsection 5.2 we address the one about truth-tracking in the limit.

[11] So long as p > ½.

[12] And the competence parameter p.

[13] Although there is a vast literature on the Condorcet jury theorem, not much of that literature addresses special majority voting. The technical results most closely related to the present ones are Nitzan and Paroush (1984) and Ben-Yashar and Nitzan (1997). Fey (2001) provides an extension of these results. All of these papers are concerned with determining the ‘optimal’ size of a special majority, given various specifications of a Condorcet jury framework. However, they do not explore the possibility of defining special majority voting in terms of a required absolute margin. Feddersen and Pesendorfer (1998), Coughlan (2000), Gerardi (2000), and Guarnaschelli, McKelvey and Palfrey (2000) all discuss the Condorcet jury theorem in relation to unanimous jury verdicts. Kanazawa (1998) provides a Condorcet jury theorem for special majority voting with high individual competence.

5.1 Making positive decisions if and only if the truth of the hypothesis is beyond any reasonable doubt

As before, x is fixed (where x=1 or x=0), and H is the hypothesis that X=x. We also fix the parameters r and p. Suppose our goal is to avoid false decisions in favour of x as much as we can. Then we may want to find a decision procedure which satisfies the following condition. Let us choose a certain threshold c, where 0 < c < 1, typically close to 1, e.g. c = 0.95.

No reasonable doubt (D). In any given situation (where precisely h jurors have voted for x and n-h jurors against x), a positive decision is made if and only if our degree of belief in H is greater than or equal to the threshold c.

Suppose the threshold c is so close to 1 that we consider any proposition with a probability greater than or equal to c to be true beyond any reasonable doubt. Then condition (D) can be interpreted as the requirement that a positive decision should be made if and only if we believe H to be true beyond any reasonable doubt.

Is it possible to specify a single fixed parameter $q_{\min}$ (for a proportion rule) or a single fixed parameter $m_{\min}$ (for an absolute margin rule) such that condition (D) is satisfied for any number of jurors n? Suppose we have observed the evidence E that precisely h out of n jurors have voted for x. Condition (D) requires us to make a positive decision if and only if P(H|E) equals or exceeds c. When does P(H|E) (i.e. $P(X=x \mid N_x=h)$) equal or exceed c? The following theorem answers that question.

Theorem 5. [14] Let c be a fixed threshold such that 0 < c < 1.
For each x∈{0, 1},
(i) $P(X=x \mid N_x=h) \ge (>)\ c$ if and only if
(ii) $$m \ge (>)\ \frac{\log\left(\frac{r-cr}{c-cr}\right)}{\log(1/p - 1)} \quad (=: m_{\min}),$$
where m = 2h-n.

A proof is given in the appendix. By theorem 5, to implement condition (D), we have to make a positive decision if and only if the absolute margin between the number of jurors who have voted for x and the number of jurors who have voted against x is greater than or equal to the fixed parameter $m_{\min}$. The parameter $m_{\min}$ is a function of c, r and p. In particular, $m_{\min}$ is invariant under changes of n, the size of the jury. Thus we can implement condition (D) by using an absolute margin rule with parameter $m_{\min}$. Table 1 reports some sample calculations of the required values of $m_{\min}$, for different values of p, r and c.

[14] For a related result, see the ‘convincing majorities theorem’ in Hawthorne (2001). Hawthorne also shows that the proportion of the electorate required to obtain a ‘convincing majority’ converges to ½ as the number of individuals increases and proves a Condorcet jury theorem on the likelihood of obtaining a ‘convincing majority’. At the cost of more complicated mathematics, Hawthorne’s framework, unlike the classical Condorcet jury framework, allows different competence levels for different individuals and dependencies between the choices of different individuals. In another related paper, Goodin (2002) addresses the question of when a majority can convince an agent of the negation of a proposition, where the agent initially assigns a high prior probability to the truth of that proposition.

Table 1.
Values of $m_{\min}$ corresponding to different values of p, r and c (rounded up to the nearest integer)

p       c        r=0.001  r=0.01  r=0.25  r=0.4  r=0.5  r=0.6  r=0.75
0.51    0.5      173      115     28      11     0      0      0
0.51    0.75     201      143     55      38     28     18     0
0.51    0.99     288      230     143     125    115    105    88
0.51    0.999    346      288     201     183    173    163    146
0.55    0.5      35       23      6       3      0      0      0
0.55    0.75     40       29      11      8      6      4      0
0.55    0.99     58       46      29      25     23     21     18
0.55    0.999    69       58      40      37     35     33     29
0.6     0.5      18       12      3       1      0      0      0
0.6     0.75     20       15      6       4      3      2      0
0.6     0.99     29       23      15      13     12     11     9
0.6     0.999    35       29      20      19     18     17     15
0.75    0.5      7        5       1       1      0      0      0
0.75    0.75     8        6       2       2      1      1      0
0.75    0.99     11       9       6       5      5      4      4
0.75    0.999    13       11      8       7      7      6      6
0.9     0.5      4        3       1       1      0      0      0
0.9     0.75     4        3       1       1      1      1      0
0.9     0.99     6        5       3       3      3      2      2
0.9     0.999    7        6       4       4      4      3      3

From table 1, we can infer the value of $m_{\min}$ that we need to choose in an absolute margin rule in order to implement condition (D), given the values of p, r and c. For example, suppose juror competence is p = 0.6, our prior probability that the defendant is guilty is r = 0.001, and we require a degree-of-belief threshold of c = 0.99 for conviction. Then a margin of at least 29 is required for conviction. [15]

Can we implement condition (D) in terms of a proportion rule as well? In answer to this question, first note that, once we fix the values of p, r and c, a certain minimal jury size is required to enable a positive decision in accordance with condition (D). In the present example (p = 0.6, r = 0.001 and c = 0.99), if the total number of jurors is n = 29, unanimous support is required to secure the required margin of 29. If the total number of jurors is less than 29, a positive decision is never possible for the given values of p, r and c. If the total number of jurors is n = 600, on the other hand, 315 out of 600 votes are sufficient to secure the required margin of 29 between the majority and the minority. This corresponds to a 52.5% majority.
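The threshold in theorem 5 can be evaluated directly. The Python sketch below computes the bound, rounds it up to an integer margin (and to 0 when the bound is negative, which is an assumption about how the zero entries of table 1 arise), and then finds the smallest number of votes that secures a given margin in a jury of size n. All function names are hypothetical and introduced only for this illustration.

```python
from math import ceil, log


def margin_bound(p: float, r: float, c: float) -> float:
    """Right-hand side of theorem 5(ii), before rounding."""
    return log((r - c * r) / (c - c * r)) / log(1 / p - 1)


def m_min(p: float, r: float, c: float) -> int:
    """Bound rounded up to an integer; negative bounds are reported as 0 (assumed)."""
    return max(0, ceil(margin_bound(p, r, c)))


def min_votes(n: int, m: int):
    """Smallest h with 2h - n >= m, or None if even unanimity falls short."""
    h = ceil((n + m) / 2)
    return h if h <= n else None


if __name__ == "__main__":
    m = m_min(p=0.6, r=0.001, c=0.99)
    print(m, min_votes(29, m), min_votes(28, m), min_votes(600, m))
```

For the worked example in the text (p = 0.6, r = 0.001, c = 0.99) this gives a required margin of 29, unanimity for n = 29, no possible positive decision for n = 28, and 315 votes for n = 600.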
$^{15}$ Whenever $m_{\min}$ = 0 in table 1, this means that, given the competence parameter p and the prior probability r, any majority from 50% onwards (including a tie) will already be sufficient to ensure that the posterior probability of the correctness of a positive decision equals or exceeds c.

More generally, as noted above, given a jury of size n, a margin $m_{\min}$ between the majority and the minority is equivalent to a proportion $q_{\min} = \frac{1}{2}(m_{\min}/n + 1)$ of the jury. However, for any fixed values of p, r and c, the value of $q_{\min}$ (unlike the value of $m_{\min}$) depends on n and tends to 1/2 as the number of individuals n tends to infinity. To illustrate, if n is greater than 17300, the value of $q_{\min}$ will be less than 51% for all values of $m_{\min}$ shown in table 1. The next theorem follows immediately:

Theorem 6.$^{16}$ Let c be a fixed threshold such that 0 < c < 1. For each x∈{0, 1},

(i) $P(X=x \mid N_x=h) \ge (>)\ c$

if and only if

(ii) $q \ge (>)\ \dfrac{1}{2}\left(\dfrac{\log\left(\dfrac{r-cr}{c-cr}\right)}{n \log(1/p - 1)} + 1\right) \ (=: q_{\min})$,

where q = h/n.

Table 2 reports some sample calculations of $q_{\min}$, for different values of p, n and c, but with a fixed r = 0.001.

Table 2.
Values of $q_{\min}$ corresponding to different values of p, n and c, with r = 0.001 (rounded up to the nearest number with one decimal place)

                       n = 12   n = 50   n = 100   n = 300   n = 500   n = 1000   n = 10000
p = 0.51   c = 0.5       n/a      n/a      n/a      78.9%     67.3%     58.7%      50.9%
           c = 0.75      n/a      n/a      n/a      83.5%     70.1%     60.1%      51.1%
           c = 0.99      n/a      n/a      n/a      98%       78.8%     64.4%      51.5%
           c = 0.999     n/a      n/a      n/a      n/a       84.6%     67.3%      51.8%
p = 0.55   c = 0.5       n/a      85%      67.5%    55.9%     53.5%     51.75%     50.2%
           c = 0.75      n/a      90%      70%      56.7%     54%       52%        50.2%
           c = 0.99      n/a      n/a      79%      59.7%     55.8%     52.9%      50.3%
           c = 0.999     n/a      n/a      84.5%    61.5%     56.9%     53.5%      50.4%
p = 0.6    c = 0.5       n/a      68%      59%      53%       51.8%     50.9%      50.1%
           c = 0.75      n/a      70%      60%      53.4%     52%       51%        50.1%
           c = 0.99      n/a      79%      64.5%    54.9%     52.9%     51.5%      50.2%
           c = 0.999     n/a      85%      67.5%    55.9%     53.5%     51.8%      50.2%
p = 0.75   c = 0.5       79.2%    57%      53.5%    51.2%     50.7%     50.4%      50.1%
           c = 0.75      83.4%    58%      54%      51.4%     50.8%     50.4%      50.1%
           c = 0.99      95.9%    61%      55.5%    51.9%     51.1%     50.6%      50.1%
           c = 0.999     n/a      63%      56.5%    52.2%     51.3%     50.7%      50.1%
p = 0.9    c = 0.5       66.7%    54%      52%      50.7%     50.4%     50.2%      50.1%
           c = 0.75      66.7%    54%      52%      50.7%     50.4%     50.2%      50.1%
           c = 0.99      75%      56%      53%      51%       50.6%     50.3%      50.1%
           c = 0.999     79.2%    57%      53.5%    51.2%     50.7%     50.4%      50.1%

$^{16}$ Theorem 6 can be interpreted as a slightly more general variant of a theorem by Nitzan and Paroush (1984). Nitzan and Paroush's result concerns the special case c = 1/2. Their result, however, focuses entirely on the definition of special majority voting in terms of proportions rather than absolute margins.

Theorem 6 implies that, even when we fix p, r and c, there exists no single fixed parameter $q_{\min}$ for a proportion rule such that condition (D) is satisfied for all numbers of jurors n. The value of $q_{\min}$ required to implement condition (D) depends on the size of the jury and converges to 1/2 as that size tends to infinity. In a sufficiently large jury, $q_{\min}$ thus approximates 1/2.
From the perspective of condition (D), the epistemic justifiability of a proportion rule, particularly with a parameter $q_{\min}$ significantly greater than 1/2, is therefore questionable.

5.2 Tracking the truth in the limit

We now turn to the perspective of the Condorcet jury theorem again. For both the proportion rule and the absolute margin rule, we will consider the probability that x is chosen, given that the state of the world is x. This is the probability that we described as being relevant for assessing the epistemic properties of a decision procedure from a ‘global’ perspective. We will see that the use of a proportion rule (as opposed to an absolute margin rule) may even be counterproductive from that perspective: while an absolute margin rule always tracks the truth in the limit (i.e. it satisfies condition (T))$^{17}$, a proportion rule may fail to track the truth even in the limit (i.e. it may violate condition (T)).

We consider the probability $P(N_x/n \ge q_{\min} \mid X=x)$ that x will be supported by a proportion of at least $q_{\min}$ of a jury of size n, given that x is the true state of the world.

Theorem 7.$^{18}$
(a) If 1/2 < p < $q_{\min}$, then $P(N_x/n \ge q_{\min} \mid X=x)$ converges to 0 as n tends to infinity.
(b) If p > $q_{\min}$, then $P(N_x/n \ge q_{\min} \mid X=x)$ converges to 1 as n tends to infinity.

A proof is given in the appendix.

Let us call a decision in favour of x=1 a positive decision. If we use a proportion rule with parameter $q_{\min}$, theorem 7 immediately implies the following:

If 1/2 < p < $q_{\min}$, then
• P(1 is chosen|X=1) converges to 0 as n tends to infinity, and
• P(0 is chosen|X=0) converges to 1 as n tends to infinity.

If p > $q_{\min}$, then
• P(1 is chosen|X=1) converges to 1 as n tends to infinity, and
• P(0 is chosen|X=0) converges to 1 as n tends to infinity.

Therefore a proportion rule with parameter $q_{\min}$ > ½ satisfies condition (T) if and only if p > $q_{\min}$. This implies that, if 1/2 < p < $q_{\min}$, the rule violates condition (T).
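The two limiting cases of theorem 7 can be made concrete by computing the relevant binomial tail probabilities for growing n. The sketch below is illustrative only: the function names (`binom_tail`, `proportion_rule`, `margin_rule`) and the parameter values p = 0.6, p = 0.75, a quota of 0.7 and a margin of 29 are assumptions chosen for the example. It sums the binomial probabilities in log space (via `math.lgamma`) so that large n does not overflow, and includes a fixed absolute margin for comparison.

```python
import math


def binom_tail(n: int, p: float, h_min: int) -> float:
    """P(N >= h_min) for N ~ Binomial(n, p), each term computed in log space."""
    total = 0.0
    for h in range(max(h_min, 0), n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
                   + h * math.log(p) + (n - h) * math.log(1 - p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def proportion_rule(n: int, p: float, q_min: float) -> float:
    """P(N_x / n >= q_min | X = x): the true state reaches the required proportion."""
    h_min = math.ceil(q_min * n - 1e-9)  # tolerance guards against float noise
    return binom_tail(n, p, h_min)


def margin_rule(n: int, p: float, m_min: int) -> float:
    """P(N_x - N_{1-x} >= m_min | X = x): the true state reaches the required margin."""
    h_min = (n + m_min + 1) // 2  # smallest h with 2h - n >= m_min
    return binom_tail(n, p, h_min)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              proportion_rule(n, p=0.6, q_min=0.7),   # case 1/2 < p < q_min
              proportion_rule(n, p=0.75, q_min=0.7),  # case p > q_min
              margin_rule(n, p=0.6, m_min=29))        # fixed absolute margin
```

With these assumed values, the first column shrinks towards 0 as n grows while the other two approach 1, which is the pattern the theorem describes.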
By contrast, we will now see that an absolute margin rule with any parameter $m_{\min}$ satisfies condition (T). We consider the probability $P(N_x - N_{1-x} \ge m_{\min} \mid X=x)$ that x will be supported by a majority with a margin of at least $m_{\min}$ between the majority and the minority, given that x is the true state of the world.

$^{17}$ So long as p > ½.

$^{18}$ For a result related to the second part of this proposition, see Kanazawa (1998).

Theorem 8. For any $m_{\min}$ > 0, if p > 1/2, then $P(N_x - N_{1-x} \ge m_{\min} \mid X=x)$ converges to 1 as n tends to infinity.

A proof is also given in the appendix.

Suppose we use an absolute margin rule with parameter $m_{\min}$. Then theorem 8 immediately implies the following:

For any p > ½ and any $m_{\min}$ > 0,
• P(1 is chosen|X=1) converges to 1 as n tends to infinity, and
• P(0 is chosen|X=0) converges to 1 as n tends to infinity.

Therefore an absolute margin rule satisfies condition (T) for all p > ½ and for any value of $m_{\min}$.

The results of this section show that absolute margin rules track the truth in the limit as soon as each individual is better than random at tracking the truth, whereas proportion rules may fail to track the truth unless individual competence is very high (i.e. p > $q_{\min}$) or the required proportion equals only ½.

5.3 Summary

Absolute margin rules are more ‘epistemically sound’ than proportion rules in the following sense:
• When we fix p, r and c, there exists a single fixed parameter $m_{\min}$ such that a corresponding absolute margin rule satisfies condition (D) for all values of n.
• An absolute margin rule satisfies condition (T) for any p > ½ and any $m_{\min}$ > 0.

By contrast:
• Even when we fix p, r and c, there exists no single fixed parameter $q_{\min}$ such that a corresponding proportion rule satisfies condition (D) for all values of n. Rather, to implement condition (D), the definition of a proportion rule would have to be modified to allow the use of different values of $q_{\min}$ for different values of n.
• A proportion rule violates condition (T) if 1/2 < p < $q_{\min}$, and satisfies condition (T) if and only if p > $q_{\min}$.

From the perspective of the classical Condorcet jury model, we should therefore advocate the definition of special majority voting in terms of the required absolute margin rather than the required proportion. If we want to resist this conclusion, we have two alternatives. Either we defend the use of a proportion rule for reasons other than ‘epistemic’ ones, or we reject the classical Condorcet jury model and find an alternative model that avoids the present results on the significance of the absolute margin.

A ‘non-epistemic’ defence of a decision procedure would be one that appeals not to the claim that the procedure tracks the truth, but rather to the claim that the procedure has certain procedural properties. There may be different views on what those procedural properties are. The relevant properties in the case of proportion rules might be giving veto power (and thus special protection) to minorities, or securing the legitimacy of a decision by ensuring the endorsement of that decision by a large proportion of the jury or electorate. In some contexts, especially ones where it is unclear whether there exists a relevant state of the world that the decision is supposed to track, those considerations might provide perfectly good reasons for the use of proportion rules rather than absolute margin rules.

But even from an epistemic perspective, it might still be possible to resist the conclusions of the present section: we would simply need to find a plausible alternative to the classical Condorcet jury model in which Condorcet’s insight about the significance of the absolute margin does not hold.

6. The jury model without the assumption of symmetrical competence

In the classical Condorcet jury model, it is assumed that each juror’s probability of voting for x, given that the state of the world is x, is the same for x = 1 and x = 0.
This assumption of symmetrical competence is rather demanding. It is plausible to assume that in many contexts a juror’s probability of making a correct decision may depend on what the state of the world is. In the jury example, detecting innocence given that the defendant is truly innocent might be easier than detecting guilt given that the defendant is truly guilty. In the medical example, it might sometimes be easier to diagnose a certain medical condition than to rule it out. Or it might even be desirable to design a diagnostic test for which the probability of a positive verdict, given that the patient has condition C, is greater than the probability of a negative verdict, given that the patient does not have condition C. Many diagnostic tests have this property. It is therefore interesting to ask whether the present results on the significance of the absolute margin are robust to a relaxation of the assumption of symmetrical competence.

As defined above, let $p_1 := P(V_i=1 \mid X=1) > 1/2$ and $p_0 := P(V_i=0 \mid X=0) > 1/2$. We keep the assumptions of competence and independence, but not the assumption of symmetrical competence. Theorem 1 does not require the assumption of symmetrical competence and therefore continues to hold. Theorems 3 and 4, however, no longer hold. Instead, we have:

Theorem 9. Suppose h > n/2. Let H be the hypothesis that X=1, and E the evidence that $N_1 = h$. Then

(a) $P(H \mid E) = P(X=1 \mid N_1 = h) = \dfrac{r\, p_1^m}{r\, p_1^m + (1-r)(1-p_0)^m \alpha^k} = \dfrac{r}{r + (1-r)\left(\dfrac{1-p_0}{p_1}\right)^m \alpha^k}$;

(b) $l(H, E) = \dfrac{p_1^h (1-p_1)^{n-h}}{(1-p_0)^h p_0^{n-h}} = \left(\dfrac{p_1}{1-p_0}\right)^m (1/\alpha)^k$,

where m = 2h-n, k = n-h, and $\alpha = \dfrac{(1-p_0)p_0}{(1-p_1)p_1}$.

The proof of theorem 9 is given in the appendix. Parts (a) and (b) of theorem 9 correspond to theorems 3 and 4, and are equivalent to these theorems for the special cases $p_0 = p_1$ or k = 0.
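The formulas of theorem 9 are straightforward to evaluate numerically. The following sketch assumes exactly the expressions stated in parts (a) and (b); the function names and the parameter values in the usage loop (r = 0.5, competences 0.6 and 0.7, margin m = 4) are hypothetical and chosen only to show how the same margin can be reached with different minority sizes k.

```python
def alpha(p0: float, p1: float) -> float:
    """The factor alpha = (1 - p0) p0 / ((1 - p1) p1) from theorem 9."""
    return (1 - p0) * p0 / ((1 - p1) * p1)


def likelihood_ratio(p0: float, p1: float, h: int, n: int) -> float:
    """l(H, E) for H: X = 1 and E: N_1 = h, as in theorem 9(b)."""
    m, k = 2 * h - n, n - h
    return (p1 / (1 - p0)) ** m * (1 / alpha(p0, p1)) ** k


def posterior(r: float, p0: float, p1: float, h: int, n: int) -> float:
    """P(H | E) for prior r = P(X = 1), as in theorem 9(a)."""
    m, k = 2 * h - n, n - h
    return r / (r + (1 - r) * ((1 - p0) / p1) ** m * alpha(p0, p1) ** k)


if __name__ == "__main__":
    # Same margin m = 4, reached with k = 0, 10 and 100 minority votes.
    for k in (0, 10, 100):
        h, n = 4 + k, 4 + 2 * k
        print(k,
              posterior(0.5, p0=0.7, p1=0.6, h=h, n=n),   # p0 > p1
              posterior(0.5, p0=0.6, p1=0.7, h=h, n=n))   # p1 > p0
```

Setting `p0 == p1` makes `alpha` equal to 1, so the k-dependent factor drops out and the functions reduce to the symmetric expressions in m alone.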
The smaller the difference between $p_0$ and $p_1$, the more closely will the equations in theorems 3 and 4 approximate equations (a) and (b) in theorem 9, particularly for small values of k.

Assuming that $p_0, p_1 > 1/2$, note that $p_0 < p_1$ implies α > 1, $p_0 = p_1$ implies α = 1, and $p_0 > p_1$ implies α < 1.$^{19}$ This means that, when $p_0 \neq p_1$, P(H|E) and l(H, E) depend not only on m (in addition to the standard parameters), but also on k (and thereby on n and h), where k is the number of jurors in the minority. In particular, suppose we hold the standard parameters (r, $p_1$, $p_0$) and the absolute margin m fixed. Then we have:
• If $p_0 > p_1$, P(H|E) and l(H, E) are strictly increasing functions of k (and thereby of n); P(H|E) converges to 1 and l(H, E) tends to infinity as n (and thereby k) tends to infinity. Both functions assume their minimum in the case of unanimity (i.e. when h=m and k=0) and increase as the proportion h/n converges to ½.
• If $p_1 > p_0$, P(H|E) and l(H, E) are strictly decreasing functions of k (and thereby of n); they both converge to 0 as n (and thereby k) tends to infinity. Both functions assume their maximum in the case of unanimity (i.e. when h=m and k=0) and decrease as the proportion h/n converges to ½.$^{20}$

This shows that, if competence is asymmetrical, the absolute margin is no longer significant by itself. In two different situations with the same absolute margin h-k but with a different proportion h/n, the values of P(H|E) and l(H, E) can be very different. Let m be any absolute margin, no matter how large or small. If $p_0 > p_1$, the value of P(H|E) is smaller if the margin m is achieved by unanimity than if the same margin m is achieved by a proportion close to ½. This means that in the present case, all other things being equal (particularly m), a larger proportion is worse than a smaller proportion. If $p_1 > p_0$, by contrast, the value of P(H|E) is larger if the margin m is achieved by unanimity than if the same margin m is achieved by a proportion close to ½.
Here, all other things being equal, a larger proportion is better than a smaller proportion.

In the case of asymmetrical competence, both the absolute margin and the proportion required for implementing condition (D) depend on n. Hence there exists neither a single fixed parameter $m_{\min}$ nor a single fixed parameter $q_{\min}$ such that a corresponding absolute margin or proportion rule satisfies condition (D) for all numbers of jurors n. Irrespective of whether we prefer to define a voting rule in terms of a required proportion or in terms of a required absolute margin, the parameter of that rule will be a function of n.

$^{19}$ This can be shown by setting $p_0 = 1/2 + \varepsilon_0$ and $p_1 = 1/2 + \varepsilon_1$. Then $\dfrac{(1-p_0)p_0}{(1-p_1)p_1} = \dfrac{(1/2-\varepsilon_0)(1/2+\varepsilon_0)}{(1/2-\varepsilon_1)(1/2+\varepsilon_1)} = \dfrac{1/4-\varepsilon_0^2}{1/4-\varepsilon_1^2}$. If $\varepsilon_0 < \varepsilon_1$, then $1/4-\varepsilon_0^2 > 1/4-\varepsilon_1^2$; if $\varepsilon_0 = \varepsilon_1$, then $1/4-\varepsilon_0^2 = 1/4-\varepsilon_1^2$; if $\varepsilon_0 > \varepsilon_1$, then $1/4-\varepsilon_0^2 < 1/4-\varepsilon_1^2$.

$^{20}$ Crucially, where m is held fixed.

7. Concluding Remarks

Condorcet’s insight on the significance of the absolute margin is no less striking today than it must have been when Condorcet first discovered it. Within the classical Condorcet jury model, the insight is valid irrespective of whether we are concerned with the degree of belief we assign to the truth of the hypothesis given the jurors’ verdicts, or with the degree of support the jurors’ verdicts give to the truth of the hypothesis. An important implication is that, if we accept the classical Condorcet jury model and we want to make decisions by special majority voting, then absolute margin rules are the appropriate types of special majority rules, while proportion rules are questionable from an epistemic perspective. However, all these results depend crucially on the assumption of symmetrical competence. The absolute margin between the majority and the minority is the uniquely significant epistemic criterion if and only if juror competence is symmetrical.

Appendix

Proof of theorem 3. Suppose h > n/2.
By Bayes’s law,

$$P(X=x \mid N_x = h) = \frac{P(X=x)\, P(N_x=h \mid X=x)}{P(N_x=h)} = \frac{P(X=x)\, P(N_x=h \mid X=x)}{P(X=x)\, P(N_x=h \mid X=x) + P(X \neq x)\, P(N_x=h \mid X \neq x)}.$$

By equation (a) in lemma 1,

$$P(X=x)\, P(N_x=h \mid X=x) = r \binom{n}{h} p^h (1-p)^{n-h}.$$

By the symmetry assumption,

$$P(X \neq x)\, P(N_x=h \mid X \neq x) = (1-r) \binom{n}{h} (1-p)^h p^{n-h}.$$

So, cancelling the binomial coefficient and dividing numerator and denominator by $p^{n-h}(1-p)^{n-h}$,

$$P(X=x \mid N_x = h) = \frac{r \binom{n}{h} p^h (1-p)^{n-h}}{r \binom{n}{h} p^h (1-p)^{n-h} + (1-r) \binom{n}{h} (1-p)^h p^{n-h}} = \frac{r\, p^{2h-n}}{r\, p^{2h-n} + (1-r)(1-p)^{2h-n}}$$

$$= \frac{r\, p^m}{r\, p^m + (1-r)(1-p)^m} = \frac{r}{r + (1-r)(1/p - 1)^m},$$

where m = 2h-n.

Proof of theorem 4. Suppose h > n/2.

$$l(H, E) = \frac{P(E \mid H)}{P(E \mid \neg H)}.$$

By equation (a) in lemma 1,

$$P(E \mid H) = \binom{n}{h} p^h (1-p)^{n-h}.$$

By the symmetry assumption,

$$P(E \mid \neg H) = \binom{n}{h} (1-p)^h p^{n-h}.$$

Hence

$$l(H, E) = \frac{\binom{n}{h} p^h (1-p)^{n-h}}{\binom{n}{h} (1-p)^h p^{n-h}} = \left(\frac{p}{1-p}\right)^{2h-n} = \left(\frac{p}{1-p}\right)^m,$$

where m = 2h-n.

Proof of theorem 5. By theorem 3, we know that

$$P(X=x \mid N_x=h) = \frac{r}{r + (1-r)(1/p - 1)^m},$$

where m = 2h-n. Hence we have

$P(X=x \mid N_x=h) \ge (>)\ c$

if and only if $\dfrac{r}{r + (1-r)(1/p - 1)^m} \ge (>)\ c$

if and only if $\dfrac{r-cr}{c-cr} \ge (>)\ (1/p - 1)^m$

if and only if $m \ge (>)\ \dfrac{\log\left(\dfrac{r-cr}{c-cr}\right)}{\log(1/p - 1)}$,

as required. The direction of the inequality is reversed in the last step because p > 1/2 implies log(1/p - 1) < 0.

Definition. A condition φ on the probability p is consistent if there exists a value of p satisfying φ. A condition φ on p is strict if, for every value of p satisfying φ, there exists an ε > 0 such that, whenever |p*-p| < ε, then p* also satisfies φ.

An example of a consistent strict condition on p is p > 1/2. The condition p = 1/2 is consistent, but not strict. The condition p < 0 is not consistent.

Lemma (Convergence Lemma). Suppose p satisfies the consistent strict condition φ. Then $P(N_x/n \text{ satisfies } \varphi \mid X=x)$ converges to 1 as n tends to infinity.

The lemma can be derived from the weak law of large numbers.

Proof of theorem 7. Let $q_{\min}$ > 1/2.
• Suppose p satisfies 1/2 < p < $q_{\min}$. As this condition is consistent and strict, the convergence lemma above implies that $P(1/2 < N_x/n < q_{\min} \mid X=x)$ converges to 1 as n tends to infinity.
The result follows.
• Suppose p satisfies p > $q_{\min}$. As this condition is consistent and strict, the convergence lemma above implies that $P(N_x/n > q_{\min} \mid X=x)$ converges to 1 as n tends to infinity. The result follows.

Proof of theorem 8. Suppose that p > 1/2. Let $m_{\min}$ > 0. Then 2p > 1, and 2p-1 > 0. In particular, there exists ε > 0 such that 2p-1 > ε. As this condition is consistent and strict, the convergence lemma above implies that $P(2N_x - n > n\varepsilon \mid X=x) = P(2N_x/n - 1 > \varepsilon \mid X=x)$ converges to 1 as n tends to infinity. But, when n > $m_{\min}$/ε, we have nε > $m_{\min}$, so that $2N_x - n > n\varepsilon$ implies $2N_x - n > m_{\min}$. Hence $P(2N_x - n > m_{\min} \mid X=x)$ converges to 1 as n tends to infinity. But $2N_x - n = N_x - N_{1-x}$ is precisely the margin between the majority and the minority. The result follows.

Proof of theorem 9. Let H be the hypothesis that X=1, and E the evidence that $N_1 = h$. We first prove part (b). By the definition of l(H, E) and the binomial distributions of $N_1$ and $N_0$, we have

$$l(H, E) = \frac{\binom{n}{h} p_1^h (1-p_1)^{n-h}}{\binom{n}{h} (1-p_0)^h p_0^{n-h}} = \frac{p_1^h (1-p_1)^{n-h}}{(1-p_0)^h p_0^{n-h}} = \left(\frac{p_1}{1-p_0}\right)^m (1/\alpha)^k,$$

where m = 2h-n, k = n-h, and $\alpha = \dfrac{(1-p_0)p_0}{(1-p_1)p_1}$.

Now part (a) can be derived straightforwardly from

$$P(H \mid E) = \frac{P(H)}{P(H) + P(\neg H)\, \dfrac{1}{l(H, E)}}.$$

Definition. The difference measure is defined as d(H, E) := P(H|E)-P(H), and the ratio measure is defined as

$$r(H, E) := \frac{P(H \mid E)}{P(H)}.$$

For a detailed discussion of these measures, see Fitelson (2001).$^{21}$

Theorem 10. Suppose h > n/2. Then

$$d(H, E) = \frac{r}{r + (1-r)(1/p - 1)^m} - r,$$

and

$$r(H, E) = \frac{1}{r + (1-r)(1/p - 1)^m},$$

where m = 2h-n.

Proof of theorem 10. The theorem follows immediately from theorem 3 and the definitions of d and r.

The theorem implies that d(H, E) and r(H, E) both depend only on the parameters p, r and m and are increasing functions of p and of m.$^{22}$

References

Ben-Yashar, R. and S. Nitzan (1997) “The optimal decision rule for fixed-size committees in dichotomous choice situations: the general result”, International Economic Review 38: 175-186.
Boland, P. J.
(1989) “Majority systems and the Condorcet jury theorem”, Statistician 38: 181-189.
Cohen, J. (1986) “An epistemic conception of democracy”, Ethics 97: 26-38.
Coleman, J., and J. Ferejohn (1986) “Democracy and social choice”, Ethics 97: 6-25.

$^{21}$ Fitelson (2001) discusses another measure, the so-called normalized difference measure, defined by s(H, E) := P(H|E) - P(H|¬E). However, he shows that the measure violates several attractive conditions and argues that it is not a plausible measure of the notion of degree of support. It turns out that, under the normalized difference measure, the absolute margin loses its special significance. But given the measure’s lack of plausibility, it is unclear how to interpret this finding.

$^{22}$ So long as p > ½.

Coughlan, P. J. (2000) “In Defense of Unanimous Jury Verdicts: Mistrials, Communication, and Strategic Voting”, American Political Science Review 94: 375-393.
Dahl, R. A. (1979) “Procedural democracy”, in Laslett, P. and J. Fishkin (eds) Philosophy, Politics & Society, 5th series, Oxford (Blackwell): 97-133.
Dietrich, F. and C. List (2002) “A Model of Jury Decisions where All Jurors have the Same Evidence”, Nuffield College Working Paper in Economics 2002-W23.
Estlund, D. (1993) “Making truth safe for democracy”, in Copp, D., J. Hampton and J. E. Roemer (eds) The Idea of Democracy, New York (Cambridge University Press): 71-100.
Estlund, D. (1994) “Opinion leaders, independence and Condorcet's jury theorem”, Theory & Decision 36: 131-162.
Estlund, D. (1997) “Beyond fairness and deliberation: the epistemic dimension of democratic authority”, in Bohman, J. and W. Rehg (eds) Deliberative Democracy, Cambridge, MA (MIT Press): 173-204.
Feddersen, T. and W. Pesendorfer (1998) “Convicting the innocent: the inferiority of unanimous jury verdicts under strategic voting”, American Political Science Review 92: 23-36.
Fey, M.
(2001) “A Note on the Condorcet Jury Theorem with Supermajority Voting Rules”, unpublished manuscript, University of Rochester.
Fitelson, B. (2001) “A Bayesian Account of Independent Evidence with Applications”, Philosophy of Science 68 (Proceedings): S123-S140.
Gerardi, D. (2000) “Jury Verdicts and Preference Diversity”, American Political Science Review 94: 395-406.
Goodin, R. E. (2002) “The Paradox of Persisting Opposition”, Politics, Philosophy, Economics 1: 109-146.
Grofman, B., G. Owen and S. L. Feld (1983) “Thirteen theorems in search of the truth”, Theory and Decision 15: 261-278.
Guarnaschelli, S., R. McKelvey and T. S. Palfrey (2000) “An Experimental Study of Jury Decision Rules”, American Political Science Review 94: 407-423.
Hawthorne, J. (2001) “Voting in Search of the Public Good: The Probabilistic Logic of Majority Judgements”, unpublished manuscript, University of Oklahoma.
Kanazawa, S. (1998) “A Brief Note on a Further Refinement of the Condorcet Jury Theorem for Heterogeneous Groups”, Mathematical Social Sciences 35: 69-73.
Ladha, K. (1992) “The Condorcet jury theorem, free speech and correlated votes”, American Journal of Political Science 36: 617-634.
List, C., and R. E. Goodin (2001) “Epistemic Democracy: Generalizing the Condorcet Jury Theorem”, Journal of Political Philosophy 9: 277-306.
McLean, I. and F. Hewitt (1994) (transl’s and eds) Condorcet: foundations of social choice and political theory, Aldershot (Elgar).
Nitzan, S., and J. Paroush (1984) “Are qualified majority rules special?”, Public Choice 42: 257-272.
Nozick, R. (1981) Philosophical explanations, Oxford (Clarendon Press).
Royall, R. (1997) Statistical Evidence: A likelihood paradigm, London and New York (Chapman and Hall).