February 14, January 16, 2006 Rev. December 5; 2006; February 18, July 12, 2007.
Disbelief as the Dual of Belief
John D. Norton
Department of History and Philosophy of Science and Center for Philosophy of Science
University of Pittsburgh
www.pitt.edu/~jdnorton
The duality of truth and falsity in a Boolean algebra of propositions is used to generate a duality of belief and disbelief. To each additive probability measure that represents belief there corresponds a dual additive measure that represents disbelief. The dual measure has its own peculiar calculus, in which, for example, measures are added when propositions are combined under conjunction. A Venn diagram of the measure has the contradiction as its total space. While additive measures are not self-dual, the epistemic state of complete ignorance is represented by the unique, monotonic, non-additive measure that is self-dual in its contingent propositions. Convex sets of additive measures fail to represent complete ignorance since they are not self-dual.
1. Introduction
A common view is that belief comes in degrees governed by the probability calculus, so that understanding belief requires an understanding of the algebraic properties assigned to it by the probability calculus. What of disbelief? Is there a natural calculus governing disbelief? What are its algebraic properties? If degrees of belief are probabilities,1 then the natural intuition is that degrees of disbelief are complements to probabilities, something like “one minus probability,” and are governed by whatever rules befit this notion.
1 If degrees of belief are the same as degrees of confirmation, then we cannot always associate these degrees with probabilities; or so I have argued in Norton (2003, 2005, 2007).
The goal of this paper is to show that this informal intuition is essentially correct, but that the full analysis brings us to a calculus of disbelief that is a little more complicated than the simple slogan suggests and a little more intriguing. The device used to arrive at this calculus is the duality of truth and falsity in a Boolean algebra. This duality induces a duality of degrees of belief and disbelief that allows us to pass from the additive measure of belief to a new measure of disbelief. This new measure is governed by a calculus that looks very different from the calculus of probabilities. It is additive, but its measures add when propositions are conjoined (“and’ed”), not when they are disjoined (“or’ed”). Conditionalization is defined; yet, a proposition conditioned on any of its logical consequences has unit disbelief. The dual Venn diagram in which the measures are represented as areas has the contradiction as its total space.
In the following, Section 2 will review briefly the duality of truth and falsity in a Boolean algebra of propositions; there is a map that switches truth and falsehood while preserving the algebra. It is shown in Section 3 that this same map applied to an additive measure on the Boolean algebra generates what I shall call a “dual additive measure” that is a measure of disbelief. We may well wonder what assurance we can have of the consistency of the odd calculus just sketched. We shall see that assurance in Section 3; it comes from the duality of belief and disbelief used to generate the calculus. Every axiom, proof and theorem of one calculus is mirrored by its dual in the other. So whatever assurance of consistency or even comfort we have with the probability calculus ought to be inherited by the new calculus.
Section 4 will develop the notion of conditionalization in the dual additive measures and Section 5 will describe a dual form of the familiar Venn diagram in which magnitudes assigned by a dual additive measure are represented by the areas of geometric shapes. Finally, in Section 6, I will suggest a philosophical application. The duality will be used to generate a representation of the epistemic state of complete ignorance. That state will be characterized by its invariance under the negation map; that is, a measure representing complete ignorance is self-dual in its contingent propositions. We shall see that additive measures individually or sets of additive probability measures, whether convex or not, all fail to be self-dual and so are not admissible as representations of complete ignorance. A larger class of monotonic measures will be defined in which the requirement of self-duality will pick out a unique measure.
2. Duality of Truth and Falsity in a Boolean Algebra
A Boolean algebra of propositions A1, A2, …, An is a set of propositions assumed closed under the familiar operations ∼ (negation), ∨ (disjunction) and & (conjunction).2 Implication ⇒ is stronger than material implication; A ⇒ B means that the propositions are so related that ∼A∨B must always be true; that is, ∼A∨B = Ω. The universally true proposition, Ω, is implied by every proposition in the algebra. The universally false contradiction, ∅, implies every proposition. Contingent propositions are defined as those that may be true or false according to the interpretation chosen. Propositions A1, A2, …, An are contingent, as are their Boolean combinations, unless they are logically equivalent to ∅ or Ω.
The following transformations of propositions comprise a dual map on the algebra.
For all propositions or propositional formulae A and B in the algebra, one carries out the substitutions recursively according to the rules:
(1a) Ω → ∅
(1b) ∅ → Ω
(1c) A&B → A∨B
(1d) A∨B → A&B
(1e) ∼A → ∼A
(1f) A ⇒ B → ∼(B ⇒ A)
For example (A∨∼B)⇒C becomes ∼(C⇒(A&∼B)) under (1f) and (1d). The rules Ω→∅ and ∅→Ω directly exchange truth and falsity. The rules (1c) and (1d), which exchange & and ∨, have the same effect, since they exchange the always true (A∨∼A) and the always false (A&∼A).
The importance of this dual map is that it preserves truths about propositions in a Boolean algebra. For example, the truth (A∨∼A)=Ω becomes the truth (A&∼A)=∅. The simplest way to see that the map preserves truths is to note that the dual map takes commonly used axioms of the algebra to axioms of the algebra. For example, the common axioms in the left column are mapped to the corresponding axioms on the right and conversely. For any propositions or propositional formulae A, B, C:
∅ ∨ A = A | → ← | Ω & A = A
A ∨ ∼A = Ω | → ← | A & ∼A = ∅
A ∨ (B & C) = (A ∨ B) & (A ∨ C) | → ← | A & (B ∨ C) = (A & B) ∨ (A & C)
So, if we have any theorem, such as one of de Morgan’s laws ∼(A∨B) = (∼A&∼B), then the dual map (1) takes it to another theorem, in this case the other of the de Morgan’s laws, ∼(A&B) = (∼A∨∼B). For if there is a proof of the first theorem from the axioms, then there is a corresponding dual proof of the second that begins with the duals of the axioms used to prove the first.3 Since the total body of axioms and theorems are mapped onto themselves, the Boolean algebra of propositions is self-dual.4
2 For a more precise characterization, including an axiom system, see Marciszewski (1981) “Algebraic Structures,” Section 6.7, pp. 8-9, and Section 6.9, pp. 9-10, for a discussion of the dualities of Boolean Algebra. For a lengthier treatment of an axiom system and its self-duality, see Goodstein, Ch. II.
3 The paired proofs are too lengthy for a footnote. See Goodstein, Ch. II.
In a simpler example, we start with the axioms A∨∼A=Ω and ∅∨A=A, substitute ∅ for A in the first and ∼∅ for A in the second to recover ∅∨∼∅=Ω and ∅∨∼∅=∼∅, so that Ω=∼∅. In the dual proof, we start with dual axioms A&∼A=∅ and Ω&A=A, substitute Ω for A in the first and ∼Ω for A in the second to recover Ω&∼Ω=∅ and Ω&∼Ω=∼Ω, to recover the dual theorem, ∅=∼Ω.
4 Massey (1992) has used this duality to argue for the indeterminacy of translation.
3. Dual Additive Measures
3.1 Duality of the Theories
Let us now assume that we have an additive measure m defined on the Boolean algebra of propositions that satisfies the standard Kolmogoroff axioms shown below (as presented in Marciszewski, 1981, p. 287, slightly augmented). We introduce a dual additive measure M by adding one transformation to the dual map (1):
m(A) → M(A)
M(A) → m(A) (1’)
If the augmented map (1), (1’) is applied to these axioms, what results is a set of axioms that we will take to define dual additive measures. Conversely, these axioms are mapped back to the original axioms by the dual map.
Axioms for additive measure m(⋅) | | Axioms for dual additive measure M(⋅)
For any propositions A, B:
(2a) m(Ω) = 1, m(∅) = 0 | → ← | (3a) M(∅) = 1, M(Ω) = 0
(2b) m(A) ≥ 0 | → ← | (3b) M(A) ≥ 0
(2c) If A&B = ∅, m(A∨B) = m(A) + m(B) | → ← | (3c) If A∨B = Ω, M(A&B) = M(A) + M(B)
(2d) For m(B) non-zero, m(A|B) =def m(A&B)/m(B) | → ← | (3d) For M(B) non-zero,5 M(A|B) =def M(A∨B)/M(B)
5 Although Hajek (2003) urges that it does not handle all cases well, I introduce conditional measures with this ratio definition. It gives us the simplest, first approach to the calculus of dual additive measures and avoids the greater complication of an axiom system for conditional measures.
Unlike the algebra of propositions, the theory of additive measures is not self-dual. The axioms (2) of the additive measure are not mapped back onto themselves by the dual map. Instead they are mapped onto the axioms (3) of the dual additive measures, which contradict (2). Otherwise, matters are not so different. We now have two isomorphic structures. Any theorem of one axiom system will have a dual theorem in the other, proved by a dual proof. Take for example the law of total probability (here, total measure) of additive measures. It is deduced from the axioms (2) for any propositions C and D by:
m(C) = m((C&D) ∨ (C&∼D)), using C = (C&D) ∨ (C&∼D)
= m(C&D) + m(C&∼D), by (2c), setting A = (C&D) and B = (C&∼D) so that A&B = (C&D) & (C&∼D) = C&D&∼D = ∅
= m(C|D)m(D) + m(C|∼D)m(∼D), by definition (2d) (4)
The law of total dual measure is derived from the axioms (3) in a proof whose individual lines are the duals of the first proof.
M(C) = M((C∨D) & (C∨∼D)), using C = (C∨D) & (C∨∼D)
= M(C∨D) + M(C∨∼D), by (3c), setting A = (C∨D) and B = (C∨∼D) so that A∨B = (C∨D) ∨ (C∨∼D) = C∨D∨∼D = Ω
= M(C|D)M(D) + M(C|∼D)M(∼D), by definition (3d) (4’)
Of course the striking axiom is (3c), which calls for the addition of the dual measures of propositions when the propositions are combined by conjunction &, with the curious proviso that this can only be done when their disjunctions are universally true. That the theory of dual additive measures is isomorphic to the theory of additive measures assures us that this rule and the entire theory is consistent, or, more cautiously, as consistent as the theory of additive measures.
Some useful results now follow. For additive measures, we have from axioms (2a) and (2c), that, for any A
1 = m(Ω) = m(A∨∼A) = m(A) + m(∼A) so that m(∼A) = 1 – m(A) (5)
The dual of this inference yields the corresponding result on dual measures. From axioms (3a) and (3c), we have, for any A,
1 = M(∅) = M(A&∼A) = M(A) + M(∼A) so that M(∼A) = 1 – M(A) (5’)
Axiom (2c) asserts the pairwise additivity of the ordinary measures under disjunction. Repeated application of the axiom yields
m(B1∨B2∨…∨Bm) = m(B1) + m(B2) + … + m(Bm) (6)
where each distinct pair Bi, Bk satisfies Bi&Bk = ∅.
The dual inference yields
M(C1&C2&…&Cm) = M(C1) + M(C2) + … + M(Cm) (6’)
where each distinct pair Ci, Ck satisfies Ci∨Ck = Ω.
If an ordinary additive measure m is used to represent degrees of belief, with m(Ω)=1 representing full belief in a universal truth and m(∅)=0 representing no belief in a contradiction, then a dual additive measure M represents degrees of disbelief. For M(Ω) = 0 represents no disbelief in a truth; M(∅) = 1 represents full disbelief in a contradiction; and intermediate degrees of disbelief are assigned to remaining propositions in a way that conforms to the structure of their Boolean algebra through the restrictions of rules (3a), (3b) and (3c).
3.2 Dualities of Individual Measures
The dual map (1), (1’) employed so far is a map on sentences about measures, dual measures and propositions. It has enabled us to define an axiom system for dual additive measures and to set up correspondences between the two axiom systems and the proofs of theorems in each. There is a second map that is useful once the space of dual additive measures has been defined. It is a map from the space of additive measures to the space of dual additive measures, and its inverse, placing the two in one-one correspondence. For all propositions A,
m(A) → M(A) = m(∼A)
M(A) → m(A) = M(∼A) (7)
That one can create a dual additive measure from a measure m(⋅) merely by forming M(⋅) = m(∼ ⋅), and conversely, is asserted as:
Proposition 1. If m(⋅) is an additive measure, then M(⋅) = m(∼ ⋅) produced by (7) is a dual additive measure; and if M(⋅) is a dual additive measure, then m(⋅) = M(∼ ⋅) produced by (7) is an additive measure.
To see the first clause, assume that m(⋅) is an additive measure. The axioms (3) follow for M(⋅) defined by (7). From axiom (2a), we have m(Ω)=1 and m(∅) = 0, which entails axiom (3a), since M(∅) = m(∼∅) = m(Ω) = 1 and M(Ω) = m(∼Ω) = m(∅) = 0. Similarly axiom (2b) asserts m(A) ≥ 0 is true for all A, including ∼A. That entails (3b), M(A) = m(∼A) ≥ 0.
Finally we have from (2c), substituting ∼A for A and ∼B for B that m(∼A∨∼B) = m(∼A) + m(∼B), when ∼A&∼B=∅. Therefore, for this case, using a de Morgan law, we have m(∼(A&B)) = m(∼A) + m(∼B) which is immediately M(A&B) = M(A) + M(B) where the condition ∅ = ∼A&∼B = ∼(A∨B) is just Ω = (A∨B) as required by axiom (3c). Since (3d) is a definition, it requires no proof. The second, converse clause is demonstrated by the dual proof.
Additive measures, construed as probabilities, have the familiar connections to relative frequencies as asserted in the various laws of large numbers. The measure of an outcome A is affiliated through them in a suitable limit to the relative frequency with which outcome A arises in many, repeated independent trials. The additivity of measures of mutually exclusive outcomes A and B is affiliated with the additivity of the corresponding relative frequencies. The map (7) allows us to make corresponding affiliations for dual additive measures. Under that map M(A)=m(∼A). So the dual additive measure M(A) of an outcome A is affiliated in a suitable limit with the relative frequency rf(∼A) of the complementary outcome ∼A in many, repeated, independent trials. The connections are summarized in the table:
Additive measure m | Dual additive measure M
Relative frequency rf(A) of outcome A among many, repeated, independent trials | Relative frequency rf(∼A) of the complement of outcome A among many, repeated, independent trials
Additivity of relative frequencies rf(A∨B) = rf(A) + rf(B) where A&B = ∅ | Additivity of relative frequencies of complements rf(∼(A&B)) = rf(∼A) + rf(∼B) where A∨B = Ω
This map (7) also vindicates the intuition that disbelief is “one minus probability,” almost. For it entails that the dual satisfies
M(A) = 1 – m(A) (8)
which is the “one minus probability” rule.
However the rule (7) is more robust since it will continue to give good results even if we relax the axioms of additivity (2c) and (3c) of the two measures (as we do in Section 6 below), whereas the “one minus probability” rule would not.
Finally, the map (7) is, to some extent, analogous to the map
A → ∼A
∼A → A (7’)
on propositions, which assigns to each proposition A another proposition with the opposite truth value. The two maps are analogous in so far as (7) assigns to each measure of belief m a dual measure of disbelief M, and conversely. The two maps (7) and (7’) are not perfectly analogous since the theory of Boolean algebras is self-dual, whereas the theory of additive measures on Boolean algebras is not. They are disanalogous in that the measure of disbelief M does not invert degrees of belief for each proposition to which belief is assigned. Rather it expresses the same information as contained in the distribution of degrees of belief m, but now as a distribution of degrees of disbelief M, governed by a different calculus. For example, full belief in Ω, expressed as m(Ω)=1, is re-expressed as the equivalent zero disbelief in Ω, that is, M(Ω)=0.
3.3 Illustration
One rapidly sees how the dual additive measures behave through an example. The outcomes of a die toss are one, two, three, four, five and six. The degrees of belief assigned to them and their disjuncts and the associated degrees of disbelief are shown in the table:
Outcome | Additive measure m | Dual additive measure M
Each of one, two, three, four, five or six | 1/6 | 5/6
Each of low (= one or two), medium (= three or four), high (= five or six) | 1/3 | 2/3
Each of even or odd | 1/2 | 1/2
Each of ∼one, ∼two, ∼three, ∼four, ∼five or ∼six | 5/6 | 1/6
The table displays the result (5’) above on negations: 1/6 = M(∼one) = 1–M(one) = 1–5/6. The atoms of the additive measure are one, two, three, four, five and six, in the sense that the measure of any (non-∅) outcome is recovered from their disjuncts.
Analogously the atoms of the dual additive measure are their negations, ∼one, ∼two, ∼three, ∼four, ∼five and ∼six, and the dual measure of any (non-Ω) outcome is recovered from their conjuncts. For example, we have the logical formula for outcomes
even = ∼one & ∼three & ∼five
so that, from (6’), we have
M(even) = M(∼one & ∼three & ∼five) = M(∼one) + M(∼three) + M(∼five) = 1/6+1/6+1/6 = 1/2
where the propositions satisfy ∼one ∨ ∼three = Ω, ∼three ∨ ∼five = Ω and ∼one ∨ ∼five = Ω. (For a graphical presentation of these relations in terms of dual Venn diagrams, see Section 5 below.)
4. Conditionalization
4.1 The Rule
The rule (3d) defining conditional dual measures at first seems strange. To see why, compare it with the rule for ordinary measures in the familiar case of the die toss.
m(two|even) = m(two & even)/m(even) = (1/6)/(1/2) = 1/3
Yet the dual definition gives
M(two|even) = M(two ∨ even)/M(even) = M(even)/M(even) = 1
More generally, if A⇒B,
m(A|B) = m(A&B)/m(B) = m(A)/m(B) ≤ 1 (9a)
M(A|B) = M(A∨B)/M(B) = M(B)/M(B) = 1 (9b)
The oddness of this difference between the two measures disappears if we attend to the natural meaning of the conditional measure. The operation m(⋅|even) creates a new measure in which we assign unit belief to even. That is, we have m(even|even)=1. The operation M(⋅|even) is the dual. It is a new dual measure created from the old by assigning unit disbelief to even. That is, the new measure in effect assigns unit belief to ∼even=odd. In that case, our belief in outcome two should be zero; that is, unit disbelief, so that M(two|even) = 1 is expected. More generally, we form M(⋅|B) by shifting unit disbelief onto B. Therefore we must also assign unit disbelief to A, if A⇒B. (These relations are displayed graphically below by means of Venn diagrams and dual Venn diagrams in Section 5.)
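The die-toss values above can be checked mechanically. The following is a minimal sketch, assuming the uniform measure of the illustration and representing each proposition as the set of die faces on which it is true, so that Ω is the full set of faces, ∅ is the empty set, & is set intersection and ∨ is set union. The names FACES, neg, m, M, m_cond and M_cond are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: propositions are sets of faces; Omega is FACES, the contradiction is the empty set.
FACES = frozenset(range(1, 7))

def neg(a):
    """Negation as set complement within FACES."""
    return FACES - a

def m(a):
    """Additive measure: uniform belief over the six faces."""
    return Fraction(len(a), 6)

def M(a):
    """Dual additive measure formed by map (7): M(A) = m(~A)."""
    return m(neg(a))

def m_cond(a, b):
    """Definition (2d): m(A|B) = m(A & B) / m(B)."""
    return m(a & b) / m(b)

def M_cond(a, b):
    """Definition (3d): M(A|B) = M(A v B) / M(B)."""
    return M(a | b) / M(b)

one, two, three, five = (frozenset({k}) for k in (1, 2, 3, 5))
even = frozenset({2, 4, 6})

# Result (5'): M(~one) = 1 - M(one) = 1/6.
assert M(neg(one)) == 1 - M(one) == Fraction(1, 6)

# Result (6'): even = ~one & ~three & ~five, and each pair of conjuncts has disjunction Omega,
# so the dual measures of the conjuncts add.
conjuncts = [neg(one), neg(three), neg(five)]
assert conjuncts[0] & conjuncts[1] & conjuncts[2] == even
assert all(x | y == FACES for x, y in combinations(conjuncts, 2))
assert M(even) == sum(M(x) for x in conjuncts) == Fraction(1, 2)

# Conditionalization when A => B, as in (9a) and (9b), with A = two and B = even.
print(m_cond(two, even))  # 1/3
print(M_cond(two, even))  # 1
```

Exact fractions are used so that the equalities hold without rounding. The sketch only confirms the arithmetic for this one measure and is not a proof of the general results.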
To see that the duality of the two calculi is preserved, we need only review a case that is dual, in so far as we switch the direction of entailments:6
m(even|two) = m(even & two)/m(two) = m(two)/m(two) = 1
M(even|two) = M(even ∨ two)/M(two) = M(even)/M(two) = (1/2)/(5/6) = 3/5
For the general case, if B⇒A,
m(A|B) = m(A&B)/m(B) = m(B)/m(B) = 1 (10a)
M(A|B) = M(A∨B)/M(B) = M(A)/M(B) ≤ 1 (10b)
To see that the value M(even|two) = 3/5 is reasonable, consider the natural meaning of M(even|two). The dual measure M(⋅|two) presumes unit disbelief on two; that is unit belief on ∼two = one ∨ three ∨ four ∨ five ∨ six. We note that 3 of 5 cases are unfavorable to even, which, assuming uniform distributions of belief over the five cases, yields 3/5.
6 We replace (A⇒B) = Ω by its dual ∼(B⇒A) = ∅, using (1f) and (1a). The latter is equivalent to (B⇒A) = Ω, or more simply, B⇒A.
4.2 Dualities of Individual Conditional Measures
There is an analog for conditional measures of the dual map (7) on unconditional measures. For all propositions A and B such that neither m(∼B) nor M(∼B) is zero
m(A|B) → M(A|B) = m(∼A|∼B)
M(A|B) → m(A|B) = M(∼A|∼B) (7’)
That the transformation does form dual conditional additive measures from conditional additive measures and conversely is asserted as:
Proposition 2. If m(A|B) is a conditional additive measure, then M(A|B) = m(∼A|∼B) produced by (7’) is a dual conditional additive measure; and if M(A|B) is a dual conditional additive measure, then m(A|B) = M(∼A|∼B) produced by (7’) is a conditional additive measure.
To demonstrate the first clause, assume that m(A|B) is a conditional additive measure. That means, by definition (2d), there exists an additive measure m(⋅) such that m(A|B) = m(A&B)/m(B). The dual map (7) forms a dual additive measure M(⋅) = m(∼⋅) from m(⋅).
Using this dual measure we can rewrite m(∼A|∼B) as
m(∼A|∼B) = m(∼A&∼B)/m(∼B) = m(∼(A∨B))/m(∼B) = M(A∨B)/M(B)
By (3d), this last ratio is the definition of the dual conditional additive measure M(A|B), affirming that the quantity defined from m(⋅|⋅) as M(A|B) = m(∼A|∼B) is a dual conditional additive measure. The second clause is shown by a dual demonstration.
As indicated in Figure 1, the formation of dual measures from measures by rules (7) or (7’) commutes with the formation of conditional measures from measures by the definitions (2d) or (3d). That is, assume we commence with an additive measure m(⋅). First we form the conditional additive measure m(A|B) = m(A&B)/m(B) by definition (2d). Second we form the dual conditional additive measure M(A|B) = m(∼A|∼B) = m(∼A&∼B)/m(∼B) by dual map (7’). Then the dual conditional additive measure arrived at is the same one that would be arrived at by switching the order of operations. That is, first we form the dual additive measure M(A) = m(∼A) by dual map (7); and second we form the dual conditional additive measure M(A|B) = M(A∨B)/M(B) by definition (3d). For the latter is just M(A|B) = M(A∨B)/M(B) = M(∼(∼A&∼B))/M(∼(∼B)) = m(∼A&∼B)/m(∼B) as before.
m(⋅)    → (7) →   M(⋅)
↓ (2d)            ↓ (3d)
m(⋅|⋅)  → (7’) →  M(⋅|⋅)
Figure 1. Commuting of rules for forming conditional measures and dual measures
The dual result holds by analogous reasoning for the commuting of the formation of measures from dual measures by rules (7) or (7’) and the formation of conditional measures by the definitions (2d) and (3d).
4.3 Degrees of Confirmation and Disconfirmation and the Flow of Belief and Disbelief
There is an important result for chains of propositions related by entailment
∅ ⇒ B1 ⇒ B2 ⇒ … ⇒ Bm-1 ⇒ Bm ⇒ Ω (11a)
All ordinary measures are non-decreasing as we proceed along the chain from ∅ to Ω:
0 = m(∅) ≤ m(B1) ≤ m(B2) ≤ … ≤ m(Bm-1) ≤ m(Bm) ≤ m(Ω) = 1 (11b)
with at least one of the inequalities strict.
It follows immediately from the dual map (7) that all dual additive measures are non-increasing on the same chain:
1 = M(∅) ≥ M(B1) ≥ M(B2) ≥ … ≥ M(Bm-1) ≥ M(Bm) ≥ M(Ω) = 0 (11c)
with at least one of the inequalities strict.
The two sets of inequalities (11b) and (11c) capture how belief and disbelief flow through the algebra of propositions. Inequalities (11b) show unit belief localized on Ω and flowing through the propositions of the chain (11a) contrary to the direction of entailment. As it flows, belief is diluted, eventually to zero. The extent to which belief passes past each link of the chain is given by the conditional measures m(Bi|Bi+1) = m(Bi)/m(Bi+1), according to (9a). Inequalities (11c) show unit disbelief localized on ∅ and flowing with the direction of entailment. The extent to which disbelief passes past each link of the chain is given by the conditional measures M(Bi+1|Bi) = M(Bi+1)/M(Bi), according to (10b).
These flows have an important meaning if we use our measures not just to represent degrees of belief or disbelief, but degrees of warranted belief or disbelief; that is, if they are degrees of confirmation and disconfirmation. Then, the measure m(A|B) represents the degree to which A is confirmed by the truth of B, since believing B commits us to according belief m(A|B) to A. The measure M(A|B) represents the degree to which A is disconfirmed by the falsity of B, since total disbelief in B commits us to according disbelief M(A|B) to A. The two sets of inequalities (11b) and (11c) now represent the flow of truth from Ω and falsity from ∅ through the algebra of propositions.
4.5 Bayes’ Theorem
Bayes’ theorem describes how belief is redistributed when we learn the truth of some proposition. According to it, our belief in hypothesis H on learning the truth of evidence E is given by
$m(H|E) = \frac{m(E|H)}{m(E)}\, m(H)$ (12a)
It is derived in the familiar way by applying the definition (2d) in two ways to m(H&E) = m(H|E)m(E) = m(E|H)m(H).
The dual form of Bayes’ theorem is deduced by an analogous inference. Using definition (3d) twice we have M(H∨E) = M(H|E)M(E) = M(E|H)M(H), so that
$M(H|E) = \frac{M(E|H)}{M(E)}\, M(H)$ (12b)
Its interpretation is the dual of the original form (12a) of the theorem. If we commence with disbelief M(H) in hypothesis H, learn the falsity of E, then our disbelief in H is adjusted to M(H|E).
The easiest way to see that the original form (12a) and dual form (12b) of Bayes’ theorem can be used to carry out identical inferences is to write the dual form with the substitution of ∼H for H and ∼E for E. We then have
$M(\sim H|\sim E) = \frac{M(\sim E|\sim H)}{M(\sim E)}\, M(\sim H)$ (12c)
This is exactly the same formula that results from substituting for the individual terms of (12a) using the transformation M(∼⋅) = m(⋅) (7) and M(∼⋅|∼⋅) = m(⋅|⋅) (7’). Therefore, for fixed H and E, the formulae (12a) and (12c) will have the same numerical values in corresponding slots.
Each of the familiar inferences Bayes’ theorem supports will have its dual. Other terms equal, we read from (12a) that higher prior belief m(H) in H leads to proportionally higher posterior belief m(H|E). Correspondingly, we read from (12c) that higher prior disbelief M(∼H) in ∼H leads to proportionately higher posterior disbelief M(∼H|∼E). In so far as high disbelief in ∼X corresponds to high belief in X, then this latter conclusion is the same as the former.
In another important case, other factors being equal, we expect that H gets the greatest incremental confirmation from the truth of evidence E when H⇒E. We recover this result from the ordinary form (12a) of Bayes’ theorem, by noting that the likelihood m(E|H) has a maximum value of unity when H⇒E. Correspondingly in this case the dual term M(∼E|∼H) has a maximum value of unity. For when H⇒E, we have (H&E)=H; so that (∼H∨∼E)=∼H; and M(∼E|∼H) = M(∼E∨∼H)/M(∼H) = M(∼H)/M(∼H) = 1.
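The claim that (12a) and (12c) agree slot by slot can be illustrated numerically. The sketch below assumes the uniform die measure of Section 3.3 and models propositions as sets of faces. The choices H = low and E = even, and all function names, are illustrative assumptions rather than anything fixed by the text.

```python
from fractions import Fraction

# Assumption: uniform die measure; propositions are sets of faces.
FACES = frozenset(range(1, 7))

def neg(a):
    return FACES - a

def m(a):
    return Fraction(len(a), 6)

def M(a):
    # Map (7): M(A) = m(~A)
    return m(neg(a))

def m_cond(a, b):
    # Definition (2d)
    return m(a & b) / m(b)

def M_cond(a, b):
    # Definition (3d)
    return M(a | b) / M(b)

def rhs_12a(h, e):
    """Right-hand side of (12a): [m(E|H) / m(E)] m(H)."""
    return m_cond(e, h) / m(e) * m(h)

def rhs_12b(h, e):
    """Right-hand side of (12b): [M(E|H) / M(E)] M(H)."""
    return M_cond(e, h) / M(e) * M(h)

H = frozenset({1, 2})      # low
E = frozenset({2, 4, 6})   # even

# Both forms of the theorem hold for this measure.
assert rhs_12a(H, E) == m_cond(H, E)
assert rhs_12b(H, E) == M_cond(H, E)

# (12c) matches (12a) term by term under (7) and (7').
assert M_cond(neg(H), neg(E)) == m_cond(H, E)
assert M_cond(neg(E), neg(H)) == m_cond(E, H)
assert M(neg(E)) == m(E) and M(neg(H)) == m(H)

# When H => E, the likelihood and its dual term both take their maximum value of unity.
H2 = frozenset({2})        # two, which entails even
assert H2 <= E
assert m_cond(E, H2) == 1 == M_cond(neg(E), neg(H2))

print(m_cond(H, E), M_cond(H, E))  # 1/3 2/3
```

The check covers one pair of propositions under one measure, so it is an illustration of the correspondence and not a derivation of it.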
We can also write down a hybrid form of Bayes’ theorem by substituting ∼E for E in the dual form of Bayes’ theorem (12b).
$M(H|\sim E) = \frac{M(\sim E|H)}{M(\sim E)}\, M(H)$ (12d)
It is a hybrid in so far as it tells us how to adjust our disbelief in hypothesis H given that we have assigned full belief to evidence E; that is, full disbelief to ∼E.
It is common in interpreting Bayes’ theorem to presume the independence of the three terms on the right hand side: the likelihood m(E|H), the expectedness m(E) and the prior m(H). So we hold the likelihood m(E|H) and the prior m(H) fixed and infer that our posterior belief m(H|E) is greater if the expectedness m(E) is lower; that is, informally, the hypothesis H is rewarded more for comporting well with less probable evidence E. This assumption of independence of terms can fail, thereby defeating inferences that presume it. For example, it fails if E⇒H. In that case the likelihood m(E|H) = m(E)/m(H) and the three terms m(E|H), m(E) and m(H) are not independent. Any two entail the third.
A similar failure of independence of terms arises for the hybrid form (12d) of Bayes’ theorem in the important case in which H⇒E. For in this case,7 M(∼E|H) = 1 + (M(∼E)–1)/M(H). Therefore any two of the three terms M(∼E|H), M(∼E) and M(H) will entail the third. If one fails to notice this dependence, one would fallaciously infer that an increase in M(∼E) would be associated with a decrease in M(H|∼E); that is, one would fallaciously infer that a decrease in our prior belief in E would be associated with a decrease in our posterior belief in H.
7 To see this, note that H = (H∨∼E) & (H∨E) and, when H⇒E, (H∨E)=E. Therefore, for a dual additive measure, M(H∨∼E) = M(H)–M(H∨E) = M(H)–M(E) = M(H)+M(∼E)–1; so that M(∼E|H) = M(∼E∨H)/M(H) = 1 + (M(∼E)–1)/M(H).
None of the versions (12a), (12b), (12c) and (12d) above require the additivity of m(⋅) or M(⋅). Augmented versions of Bayes’ theorem that do require additivity are recovered by substituting for each of m(E) and M(E) using the laws of total measure (4) and total dual measure (4’):
m(E) = m(E|H)m(H) + m(E|∼H)m(∼H)
M(E) = M(E|H)M(H) + M(E|∼H)M(∼H)
5. Dual Venn Diagrams
One of the most helpful tools for visualizing an additive measure is a Venn diagram, in which geometric shapes are associated with propositions and the areas of the shapes give their measure. Since dual additive measures are additive, a dual Venn diagram with these same properties is possible. However, the dual diagrams are different in many ways from the ordinary Venn diagram.
The principal differences between Venn and dual Venn diagrams are shown in Figure 2. An additive measure assigns zero measure to the contradiction ∅, which corresponds to the empty set of points of the diagram and thus is nowhere in the diagram;8 and an additive measure assigns the largest possible measure to the universally true Ω, so Ω corresponds to the total space of unit area of the diagram.
8 If we presume that the contradiction ∅ corresponds to an area or point in the diagram, that area or point must satisfy contradictory requirements. Since ∅⇒A and ∅⇒B for any pair of mutually exclusive propositions A and B, the area or point corresponding to ∅ must be contained within both of the disjoint areas corresponding to A and B.
Figure 2. Venn Diagrams and Dual Venn Diagrams
The dual Venn diagram associated with a dual additive measure is constructed analogously. The universally true Ω, zero in the dual measure, is represented by the empty set of points of the diagram and thus is nowhere in the diagram. Since the contradiction ∅ is assigned the greatest measure of unity, it corresponds to the total space of the diagram and has unit area.
In a Venn diagram, entailment is represented by containment, as shown in Figure 3. If A⇒B, then the area corresponding to A in the Venn diagram is contained within that corresponding to B.
Analogously, for A⇒B in a dual Venn diagram, the area associated with B is contained within that associated with A, the reverse of the Venn diagram. The remaining parts of Figure 3 illustrate the differences in binary relations between additive measures and dual additive measures. In the former, conjunction is represented by geometric intersections of areas; and disjunction is represented by unions of geometric areas. For dual additive measures, that relationship is reversed. Thus taking disjunctions tends to reduce the area of the associated shape, so that smaller areas in the dual Venn diagram correspond to disjunctions of more propositions.
Figure 3. Binary relations and operations in Venn and dual Venn diagrams
Figure 4 shows that negation in both ordinary and dual Venn diagrams corresponds to taking the geometric complement. However the universally true disjunction of a proposition and its negation corresponds to the total space of an ordinary Venn diagram; whereas the universally false conjunction of a proposition with its negation corresponds to the total space of a dual Venn diagram.
Figure 4. Negation in Venn and dual Venn diagrams
Figure 5 displays the condition required by axiom (3c) if the dual measure of two propositions is to be added to give the dual measure of their conjunction. In geometric form, the condition is quite unfamiliar in an ordinary Venn diagram. In the dual Venn diagram, it is merely the familiar requirement that the geometric shapes associated with the two propositions be disjoint.
Figure 5. Condition for addition of measures in dual additive measures
Figure 6 illustrates the formation of conditional measures for the important case of propositions A and B, where A⇒B. We can reduce the relevant difference between the two measures to one simple idea. In the two forms of the Venn diagrams, the geometric containment relations that express A⇒B are reversed.
So, for ordinary measures, we have generically9 that m(A&B) = m(A) < m(B); but for the dual measures, generically, M(A∨B) = M(B) < M(A). Hence we have generically that m(A|B) = m(A&B)/m(B) < 1 and m(B|A) = 1; but M(B|A) = M(A∨B)/M(A) < 1 and M(A|B) = 1, as expressed in (9a), (9b), (10a) and (10b).

9 The condition “generically” rules out the special case in which m(A) = m(B) and M(A) = M(B).

Figure 6. Conditionals in additive and dual additive measures.

Many of the above ideas are illustrated in the dual Venn diagrams of Figure 7 that represent a die toss. The total space is hexagonal to reflect the symmetry of the six outcomes. The propositions asserting the outcome of each individual die face, one, two, …, six, are each represented as a 5/6th portion of the total space. The negation of each is the complementary 1/6th wedge, representing ∼one, ∼two, etc. The outcome even is the disjunction of two, four and six, so in the dual diagram it is the geometric intersection of the areas representing outcomes two, four and six.

Figure 7. Dual Venn diagrams for a die toss

The shape representing even is contained within that representing two; therefore, as indicated in Section 4.1 above, we can form M(even|two) and compute it as the ratio of the area 3/6 assigned to even and the area 5/6 assigned to two; that is, M(even|two) = (3/6)/(5/6) = 3/5.

6. The Representation of Complete Ignorance

6.1 Complete Ignorance as Self-Duality

Additive measures and their dual additive measures have been used so far as ways of representing belief and disbelief. The duality explored here, however, can also be used to give a principled remedy to a limitation on the epistemic states that additive measures can represent. That limitation derives directly from their additivity. Such measures cannot directly represent ignorance as opposed to disbelief. Rather, as the values of an additive measure span the range from one to zero, they range from the representation of complete belief to complete disbelief.
I have argued elsewhere (Norton, 2007, §4.1) that this interpretation of the range of values may be established in a principled way from the characteristic reciprocity of belief and disbelief. A high degree of belief in some proposition A forces a correspondingly high degree of disbelief in its negation ∼A. This reciprocity is implemented in the additivity of an additive measure. A high value of m(A), interpreted as representing a high degree of belief in A, forces a correspondingly low value for the measure m(∼A) assigned to the negation ∼A. Therefore the low values of m must be interpreted as representing disbelief.

How then might we represent ignorance as opposed to disbelief? We might invent alternative non-additive calculi, whose small values, we announce, represent ignorance or some mix of ignorance and disbelief. (See, for example, Shafer, 1976.) Since many such devices or alternative calculi can be invented, we should ask if there are any principled ways of identifying how ignorance should properly be represented. Elsewhere (Norton, forthcoming, §§6.2, 6.3) I have identified such a principle for the special case of complete ignorance. Informally, we are in a state of complete ignorance if we have no preference for any contingent proposition A over its negation, ∼A. That means that switching every contingent proposition with its negation leaves our epistemic state unchanged. Equivalently, for every contingent proposition A, a state of complete ignorance assigns the same measure to A and to ∼A. This condition has an obvious formal expression:

Self-Duality of Complete Ignorance: An epistemic state of complete ignorance is invariant in its contingent propositions under the dual map (7); that is, the state is self-dual in its contingent propositions, so that m(A) = M(A) = m(∼A) for all contingent A.

It follows immediately that no additive measure can represent a state of complete ignorance.
For no additive measure can be its own dual, even in its contingent propositions only.10 While the contingent propositions of an additive measure conform to the addition rule (2c), the contingent propositions of a dual additive measure conform to the incompatible addition rule (3c).

10 The trivial exception is the outcome space of one proposition, A. Then the additive measure m(A) = m(∼A) = 1/2 is self-dual in the one contingent proposition A.

6.2 Why Complete Ignorance Cannot be Represented by Sets of Additive Measures

The above principle of self-duality enables us to render a verdict on a popular means of using additive measures to represent ignorance. While no one measure can do it, a long-standard proposal is that we employ sets of additive measures, sometimes convex, sometimes not.11 Let the set of measures {mi}, where i varies over some index set, be a candidate representation of complete ignorance. Under the dual map (7) this set is not mapped back to itself. Instead it is mapped to the corresponding set of dual additive measures, {Mi}. That is, a set of additive measures fails to be self-dual, whether the set is convex or not.

While sets of additive measures are not self-dual, we can readily define sets of measures that are self-dual. The simplest is just the set consisting of some additive measure m and its dual M, that is {m, M}. Under the negation map (7), m → M and M → m, so that {m, M} → {M, m} = {m, M}. More generally, a set of additive and dual additive measures {µi} will be self-dual just in case, for any additive measure m in the set, its dual M is also contained in the set, and conversely. Clearly many such sets are possible.

Should we represent complete ignorance by such self-dual sets? The proposal faces interpretative, formal and pragmatic problems. On the interpretative level, how are we to think of the measures that form the set? Both additive and dual additive measures enter equally into the set of measures.
If we are to interpret each in the same way, then they cannot be measures of belief or disbelief. For additive measures are non-decreasing as we pass from propositions to their logical consequences, a distinctive mark of a measure of belief; but dual additive measures are non-increasing as we pass from propositions to their logical consequences, a distinctive mark of a measure of disbelief. The obvious escape from this difficulty is somehow to keep the two measures distinct in the set. If we do keep them distinct, designating the additive measures as measures of belief and the dual additive measures as measures of disbelief,12 then we violate self-duality. For, under the map (7), an additive measure becomes a dual additive measure and vice versa.

11 A set of measures is convex if, whenever m1 and m2 are in the set, then so is λm1+(1−λ)m2, for all 0<λ<1. See Kyburg and Pittarelli (1996) for discussion and an inventory of the problems raised by the convexity of the sets.

12 This would correspond to replacing the set {m, M} = {M, m} by the ordered pair ⟨m, M⟩, where the first member is reserved for the additive measure and the second for the dual additive measure. The ordered pair is not self-dual, for, under the negation map (7), ⟨m, M⟩ → ⟨M, m⟩. In addition, in this simple case, the information in the ordered pair is equivalent to the additive measure m and thus unable to represent ignorance for reasons given earlier.

On the formal level, the requirement of self-duality does not pick out a unique ignorance state. There are as many self-dual sets as there are additive measures m and sets of them. Since the epistemic state of complete ignorance we seek seems to be unique, at most one of these sets can be the right representation. How are we to choose, in a principled way, which, if any, is the right one? Or if we are to say that each of the sets gets something partially right in characterizing the state, how are we to extract what is gotten right?

Finally, on the pragmatic level, the state of complete ignorance is interesting to us as an extreme case of partial ignorance. How are we to extend these sets to those that can plausibly represent partial ignorance?

My preference is not to seek to represent ignorance through sets of measures, whatever their type. There are two problems facing the general idea of using sets in this way. First, the use of sets renders ignorance as a second order sort of belief. We allow that many different belief-disbelief states are possible. We represent ignorance by presenting them all, in effect saying that we don’t know which is the pertinent one. The sort of ignorance I seek to characterize is first order ignorance; it is just not knowing which is the true outcome, not a second order uncertainty about an uncertainty.

Second, sets of measures do not provide a local representation of ignorance. By a local representation, I mean one that assigns a definite “complete ignorance” value to some outcome, whose meaning is independent of the values assigned elsewhere. For that is the natural way that ignorance arises. Within obvious limits, we can be ignorant of the truth of proposition A1, while having different beliefs about the other propositions A2, A3, … of the outcome space. If we are representing ignorance by sets of measures, the value assigned by one measure to one outcome cannot be interpreted without taking into account the other values assigned by the other measures.

In sum, the self-duality of complete ignorance is an algebraic property that does not obtain for additive or dual additive measures. The attempt to realize the property by means of sets of measures amounts to an attempt to use additive measures to simulate the behavior of something non-additive. We shall see in the following that, if we relax the requirement of additivity to monotonicity of the measures, then the requirement of self-duality will pick out a unique monotonic measure representing complete ignorance.
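
The formal claims of this section about sets of measures can be checked on a small example. The following Python sketch is offered only as an illustration and is not part of the argument. It assumes that the dual map (7) sends a measure m to the measure M with M(A) = m(∼A), as the statement of self-duality above suggests, and it identifies each proposition about a die toss with the set of faces on which it is true. The function names and the weights of the loaded die are hypothetical choices made for the example.

```python
from fractions import Fraction
from itertools import combinations

OUTCOMES = frozenset(range(1, 7))  # die faces; the full set plays the role of Ω


def propositions():
    """All 64 propositions, each given as the set of faces on which it is true."""
    faces = sorted(OUTCOMES)
    return [frozenset(c) for r in range(len(faces) + 1)
            for c in combinations(faces, r)]


def additive_measure(weights):
    """Additive measure m: the value of a proposition is the sum of its face weights."""
    return {p: sum((weights[f] for f in p), Fraction(0)) for p in propositions()}


def dual(measure):
    """Assumed form of the dual map: each proposition gets the value of its negation."""
    return {p: measure[OUTCOMES - p] for p in measure}


def is_contingent(p):
    """Contingent propositions are neither the contradiction nor Ω."""
    return p != frozenset() and p != OUTCOMES


def is_self_dual_measure(measure):
    """True if the measure agrees with its dual on every contingent proposition."""
    d = dual(measure)
    return all(measure[p] == d[p] for p in measure if is_contingent(p))


def is_self_dual_set(measures):
    """True if the dual map sends the set of measures back to the same set."""
    duals = [dual(mu) for mu in measures]
    return (all(d in measures for d in duals)
            and all(mu in duals for mu in measures))


fair = {f: Fraction(1, 6) for f in OUTCOMES}
loaded = {1: Fraction(1, 2), **{f: Fraction(1, 10) for f in range(2, 7)}}
m, m2 = additive_measure(fair), additive_measure(loaded)
M = dual(m)

print(is_self_dual_measure(m))    # False: m(one) = 1/6 but M(one) = 5/6
print(is_self_dual_set([m, m2]))  # False: the duals are not additive measures
print(is_self_dual_set([m, M]))   # True: {m, M} is mapped to {M, m}
```

On these assumptions the sketch reproduces the three observations made in the text: a single additive measure is not self-dual, a set containing only additive measures is not self-dual, and the set {m, M} is. It says nothing about the interpretative and pragmatic problems, which are not formal matters.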
6.3 Relaxing Additivity: Monotonicity

An automatic property of additive measures is that the measure m(A) assigned to proposition A cannot be greater than the measure m(B) assigned to any of its consequences B. The corresponding property for a dual additive measure is that the dual measure M(A) assigned to A cannot be less than the dual measure M(B) assigned to any of its consequences B. That is:

Monotonicity. If A⇒B, then m(A) ≤ m(B); and M(A) ≥ M(B).   (13)

We generalize the notion of a measure if we drop the requirement of additivity (2c), (3c) and merely require monotonicity. The requirement has an intuitive meaning. The belief accorded to a proposition cannot be greater than the belief accorded to its consequences; and the disbelief accorded to a proposition cannot be less than the disbelief accorded to its consequences. In forgoing additivity, we no longer demand that the belief assigned to a proposition is a function of the beliefs assigned to its disjunctive parts (and the corresponding property for disbelief).

6.4 A Unique Self-Dual Monotonic Measure

Among monotonic measures, there is a unique complete ignorance measure mI, with its dual MI, that satisfies the requirement of the Self-Duality of Complete Ignorance:

mI(A) = I for all contingent propositions A;   mI(Ω) = 1;   mI(∅) = 0   (14a)
MI(A) = I for all contingent propositions A;   MI(Ω) = 0;   MI(∅) = 1   (14b)

where the arbitrarily chosen “complete ignorance value” I lies in 0