A pluralist view about information

Sebastian Fortin 1 – Olimpia Lombardi 1 – Leonardo Vanni 2
1 CONICET – Universidad de Buenos Aires
2 Universidad de Buenos Aires

Abstract

Focusing on Shannon information, this article shows that, even on the basis of the same formalism, there may be different interpretations of the concept of information, and that disagreements may be deep enough to lead to very different conclusions about the informational characterization of certain physical situations. On this basis, a pluralist view is argued for, according to which the concept of information is primarily a formal concept that can adopt different interpretations that are not mutually exclusive, but each useful in a different specific context.

1. Introduction

In Book 11 of his Confessions, St. Augustine asks himself: “What, then, is time? If no one asks me, I know what it is. But if I wish to explain it to one that asketh, I know not.” Something similar happens today with information. Both in everyday life and in science, the word ‘information’ is so pervasive that we all believe we know what we mean by it. However, as soon as we are asked for its precise meaning, opinions diverge substantially. As many recognize, information is a polysemantic concept that can be associated with different phenomena (Floridi 2010).

In this conceptual tangle, the first distinction to be introduced is between a semantic and a non-semantic view of information. According to the first view, information is something that carries semantic content (Bar-Hillel and Carnap 1953; Bar-Hillel 1964), and is therefore strongly related to semantic notions such as reference, meaning and representation. In general, semantic information is carried by propositions that intend to represent states of affairs; so, it has “aboutness”, that is, it is directed to other things. And although it is still controversial whether false factual content may qualify as information, semantic information is strongly linked with the notion of truth.

Non-semantic information, also called ‘mathematical’ or ‘statistical’, is concerned with the statistical properties of a given system and/or the correlations between the states of two systems, independently of the meanings of those states. The classical locus of mathematical information is the paper where Shannon (1948) introduces a precise formalism designed to solve certain specific technological problems. Shannon’s theory is purely quantitative: it ignores any issue related to informational content: “[the] semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages” (Shannon 1948, 379).

Although Shannon’s theory is the traditional formalism to quantify information, it is not the only one. For instance, Fisher information measures the dependence of a random variable X on an unknown parameter θ upon which the probability of X depends (Fisher 1925), and algorithmic information measures the length of the shortest program that produces a string on a universal Turing machine (Chaitin 1987). In quantum information theory, von Neumann entropy gives a measure of the quantum resources necessary to faithfully encode the state of the source-system (Schumacher 1995).
It might be supposed that, when confined to a formal framework, the meaning of ‘information’ is clear: given the mathematical theory, information is what this theory describes. However, this is not the case. Even on the basis of the same formalism, there may be different interpretations of the concept of information, and disagreements may be deep enough to lead to different conclusions in certain physical situations. Although disagreements may arise regarding any formalism, we will focus on Shannon’s theory since it is the most widespread formalism, even applicable in the quantum context (Rovelli 1996; Timpson 2003). Finally, we will argue for a pluralist view according to which, once mathematically characterized, the concept of information is a formal concept that can adopt different interpretations that are not mutually exclusive, each useful in a different context.

2. Shannon’s Theory

According to Shannon’s theory (Shannon 1948), the transmission of information requires a source S, a receiver R and a channel CH. If S has a range of possible states $s_1, \ldots, s_n$ (letters), whose respective probabilities of occurrence are $p(s_1), \ldots, p(s_n)$, the amount of information generated at the source by the occurrence of $s_i$ is defined as $I(s_i) = \log(1/p(s_i))$. When ‘log’ is the logarithm to base 2, the resulting unit of measurement is called the ‘bit’ (if the natural logarithm is used, the unit is the nat, and in the case of the logarithm to base 10, the unit is the Hartley). Since S produces long sequences of states (messages), the average amount of information generated at the source is defined as:

$$I(S) = \sum_{i=1}^{n} p(s_i) \log(1/p(s_i))$$

Analogously, if the possible states of R are $r_1, \ldots, r_m$, with respective probabilities $p(r_1), \ldots, p(r_m)$, the amount of information received at the receiver by the occurrence of $r_j$ is $I(r_j) = \log(1/p(r_j))$, and the average amount of information received at the receiver is:

$$I(R) = \sum_{j=1}^{m} p(r_j) \log(1/p(r_j))$$

The relationship between $I(S)$ and $I(R)$ can be expressed in terms of three quantities: the mutual information $I(S;R)$, the equivocation E and the noise N, related by

$$I(S;R) = I(S) - E = I(R) - N$$

where $I(S;R)$ is the information generated at S and received at R, E is the information generated at S but not received at R, and N is the information received at R but not generated at S (always average amounts). E and N are measures of the dependence between S and R and, therefore, are functions not only of S and R, but also of the channel CH, defined by the matrix $[p(r_j|s_i)]$, where $p(r_j|s_i)$ is the conditional probability of the occurrence of $r_j$ given the occurrence of $s_i$, and the elements in any row must sum to 1. Thus, N and E are computed as:

$$N = \sum_{i=1}^{n} p(s_i) \sum_{j=1}^{m} p(r_j|s_i) \log(1/p(r_j|s_i))$$

$$E = \sum_{j=1}^{m} p(r_j) \sum_{i=1}^{n} p(s_i|r_j) \log(1/p(s_i|r_j))$$

where $p(s_i|r_j) = p(r_j|s_i)\,p(s_i)/p(r_j)$.

One of the most relevant results of Shannon’s theory is the noiseless coding theorem, according to which the value of $I(S)$ is equal to the average number of bits necessary to code a letter of the source using an ideal code: $I(S)$ measures the optimal compression of the source messages. In fact, the messages of N letters produced by S fall into two classes: one of approximately $2^{NI(S)}$ typical messages, and the other of atypical messages. When $N \to \infty$, the probability of an atypical message becomes negligible; so, the source can be conceived as producing only $2^{NI(S)}$ possible messages. This suggests a natural strategy for coding: each typical message is coded by a binary sequence of length $NI(S)$, in general shorter than the length N of the original message.
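As a concrete illustration (the source probabilities and channel matrix below are invented for the example, not taken from the paper), the following Python sketch computes $I(S)$, $I(R)$, N and E for a toy binary channel and verifies the identity $I(S;R) = I(S) - E = I(R) - N$:

```python
import numpy as np

def avg_info(p):
    """Average amount of information sum_i p_i log2(1/p_i), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # 0 * log(1/0) is taken as 0
    return float(np.sum(p * np.log2(1.0 / p)))

# Toy numbers: source probabilities p(s_i) and channel matrix
# [p(r_j|s_i)], with rows summing to 1
p_s = np.array([0.5, 0.5])
ch = np.array([[0.9, 0.1],
               [0.2, 0.8]])

p_joint = p_s[:, None] * ch             # p(s_i, r_j) = p(s_i) p(r_j|s_i)
p_r = p_joint.sum(axis=0)               # p(r_j)

I_S, I_R = avg_info(p_s), avg_info(p_r)

# Noise: N = sum_i p(s_i) sum_j p(r_j|s_i) log2(1/p(r_j|s_i))
mask = p_joint > 0
N = float(np.sum(p_joint[mask] * np.log2(1.0 / ch[mask])))

# Equivocation: E = sum_j p(r_j) sum_i p(s_i|r_j) log2(1/p(s_i|r_j)),
# with p(s_i|r_j) = p(r_j|s_i) p(s_i) / p(r_j)
p_s_given_r = p_joint / p_r
E = float(np.sum(p_joint[mask] * np.log2(1.0 / p_s_given_r[mask])))

# The two expressions for the mutual information coincide:
print(I_S - E, I_R - N)                 # I(S;R) = I(S) - E = I(R) - N
```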
Given this formalism, it seems that there is nothing controversial about the concept of Shannon information: it would be whatever Shannon’s theory describes. However, matters are not so simple. In recent years it has become usual to hear in the philosophy of physics community (not in the physics community) that the problem of the interpretation of information is dissolved because the word ‘information’ is an abstract noun. Timpson (2004, 2008) insists that what is produced at the source, and what we desire to transmit, is not a token-sequence but a type-sequence; but types are abstract and, so, they are not part of the spatio-temporal content of the world. Therefore, according to this view, information is not a substance, not even a physical entity, because it is not an entity at all: there is nothing the word ‘information’ refers to.

Despite the diffusion of this position, one may suspect that information is even more abstract than a type. In fact, types are not items to be measured in bits. Moreover, the information of very different types may be the same, since the only relevant aspect of Shannon information is that the actual sequence is one selected from a set of possible sequences. And it is not even the case that we always want to transmit the same type-sequence: the states of the receiver may be completely different, even in the type sense, from the states of the source; the success of information transmission depends on a decision about the expected correlations between the source states and the receiver states, embodied in the fidelity function. In brief, Timpson unwittingly reintroduces semantic issues (analogous to those related to the difference between proposition, sentence and utterance) into the discussion about Shannon information, a field where semantics plays no role at all. Of course, these brief comments are not a full analysis of Timpson’s highly articulated position, which deserves a specific article. Nevertheless, they open the way to focus on the different views about Shannon information that are still present in philosophical and physical discussions.

3. Epistemic and Physical Interpretations of Information

A concept usually connected with the notion of information is that of knowledge: information provides knowledge; it modifies the state of knowledge of those who receive it. Some believe that the link between information and knowledge is a feature of the everyday notion of information, which must be carefully distinguished from Shannon’s technical concept (Timpson 2004). However, the idea of knowledge is also present in the philosophical and the physical discourse about information. In fact, it is common to find authors who even define information in terms of knowledge. For instance, on the basis of Shannon’s theory as the underlying formalism for his proposal, Dretske says: “information is a commodity that, given the right recipient, is capable of yielding knowledge” (1981, 47). According to MacKay, information is linked to an increase in knowledge on the receiver’s side: “Suppose we begin by asking ourselves what we mean by information. Roughly speaking, we say that we have gained information when we know something now that we didn't know before; when ‘what we know’ has changed” (1969, 10).
This presence of the notion of knowledge is not confined to authors who try to supply a semantic content to statistical information. Some philosophers of physics are also persuaded that the core meaning of the concept of information, even in its technical sense, is linked to the concept of knowledge (Myrvold, personal communication). And physicists frequently speak about what we know or may know when dealing with information. For instance, Rovelli (1996) insists that quantum mechanics is a theory about information because it talks about the relations between what different observers “know” about a quantum system. Zeilinger even equates information and knowledge when he says that “We have knowledge, i.e., information, of an object only through observation” (1999, 633) or, with Brukner, “For convenience we will use here not a measure of information or knowledge, but rather its opposite, a measure of uncertainty or entropy” (2009, 681-82). Even in a traditional textbook on Shannon’s theory one can read that information “is measured as a difference between the state of knowledge of the recipient before and after the communication of information” (Bell 1957, 7), and that it must be relativized with respect to the background knowledge available before the transmission: “the datum point of information is then the whole body of knowledge possessed at the receiving end before the communication” (Bell 1957, 7). It is worth stressing that, from the epistemic perspective, the possibility of acquiring knowledge about a source by consulting the state of a receiver is rooted in the nomic character of the regularities underlying the whole situation. In fact, the conditional probabilities that define the channel do not represent merely de facto correlations; they are determined by a network of lawful connections between the states of the source and the states of the receiver.

A different view about information is the one that detaches the concept from the notion of knowledge and considers information as a physical magnitude. This is the position of many physicists and most engineers, for whom the essential feature of information is its capacity to be generated at one point of physical space and transmitted to another point; it can also be accumulated, stored and converted from one form to another, like other physical magnitudes such as energy. On this view, the capability of providing knowledge is not a central issue, since the transmission of information may serve only control purposes, such as controlling a device at the receiver end by modifying the state of the source. According to this view, it is precisely because of the physical nature of information that the dynamics of its flow is constrained by physical laws and facts: “Information handling is limited by the laws of physics and the number of parts available in the universe” (Landauer 1991, 29; see also Bennett and Landauer 1985). In general, the physical interpretation of information comes strongly linked with the idea expressed by the well-known dictum ‘no information without representation’: the transmission of information between two points of physical space necessarily requires an information-bearing signal, that is, a physical process propagating from one point to the other. Landauer is an explicit defender of this position when he claims that “Information is not a disembodied abstract entity; it is always tied to a physical representation.
It is represented by engraving on a stone tablet, a spin, a charge, a hole in a punched card, a mark on a paper, or some other equivalent” (1996, 188). This view is also adopted by some philosophers of science; for instance, Kosso states that “information is transferred between states through interaction” (1989, 37). The need for a carrier signal is natural in light of the generic idea that physical influences can only be transferred through interactions. In the context of this physical interpretation, information tends to be compared with energy, which was born in the specific field of mechanics as a pragmatic notion related to the resources we can draw from a mechanical system, but ended up being conceived as an extremely wide-reaching concept: at present, the word ‘energy’ refers to an item that pervades the whole world of physics. On this basis, information is conceived by many physicists as a physical entity with the same ontological status as energy; it has also been claimed that its essential property is the power to manifest itself as structure when added to matter (Stonier 1990, 1996).

4. Epistemic versus Physical Interpretations of Information

If the difference between the epistemic and the physical interpretations of information is clear from a conceptual viewpoint, it becomes even clearer when the concept of information is applied to particular situations. Let us consider a source S that transmits information to two physically isolated receivers RA and RB via a certain physical link. In this case, the correlations between the states of the two receivers are not accidental, but functions of the physical dependence of RA and RB on S. Nevertheless, there is no physical interaction between the receivers. The informational description of this situation is completely different from the viewpoints of the two interpretations of the concept of information. According to the physical interpretation, it is clear that no information is being transferred between RA and RB, since there is no physical signal traveling between them. However, from the epistemic interpretation, nothing prevents us from admitting the existence of an informational link between the two receivers. In fact, we can define a communication channel between RA and RB because it is possible to learn something about RB by looking at RA and vice versa: “from a theoretical point of view [. . .] the communication channel may be thought of as simply the set of depending relations between [a system] S and [a system] R. If the statistical relations defining equivocation and noise between S and R are appropriate, then there is a channel between these two points, and information passes between them, even if there is no direct physical link joining S with R” (Dretske 1981, 38). The receiver RB may even be farther from the source S than RA, so that the events at RB may occur later than those at RA. Nevertheless, this is irrelevant from the epistemic view of information: although the events at RB occur later, RA carries information about what will happen at RB.
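The point can be made quantitative with a minimal sketch under assumed numbers: a binary source S feeds RA and RB through two independent noisy channels, so that whatever correlation the receivers share is inherited from S alone; the mutual information between them is nonetheless strictly positive, which is all the epistemic interpretation needs in order to define a channel:

```python
import numpy as np

def mutual_info(p_xy):
    """I(X;Y) in bits, from a joint distribution p(x, y)."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (px * py)[mask])))

# Hypothetical setup: a binary source S feeds the two receivers through
# independent noisy channels; RA and RB never interact with each other.
p_s = np.array([0.5, 0.5])
ch_A = np.array([[0.95, 0.05], [0.05, 0.95]])   # p(ra | s)
ch_B = np.array([[0.90, 0.10], [0.10, 0.90]])   # p(rb | s)

# p(ra, rb) = sum_s p(s) p(ra|s) p(rb|s): all correlation comes from S
p_ab = np.einsum('s,sa,sb->ab', p_s, ch_A, ch_B)

print(mutual_info(p_ab))   # > 0: a channel RA -> RB can be defined
```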
Somebody might consider that the difference in the informational characterization of the situation described above is a mere curiosity with no philosophical interest. However, this kind of disagreement also has relevant consequences for the characterization of central notions in the philosophy of science. For instance, there is an important philosophical tradition that explains scientific observation in terms of information. In order to elucidate the notion of observation without resorting to perceptual matters, Shapere proposes that x is directly observed if information is received by an appropriate receptor and that information is transmitted from the entity x to the receptor without interference (Shapere 1982). Brown agrees with Shapere in stressing that observing an item I consists in gaining information about I by examining another item I* (Brown 1987). Kosso (1989) also adheres to this tradition with his “interaction-information” account of scientific observation. In general (with the exception of Kosso, who relies on Shannon’s theory), in the discussions about scientific observation the concept of information is not sufficiently specified in formal terms, so the interpretation of the concept is considered even less. However, the way in which information is conceived leads to very different consequences regarding the view about observation.

This turns out to be particularly clear in the so-called ‘negative experiments’, which were originally devised as a theoretical tool for analyzing the quantum measurement problem (see Jammer 1974). Nevertheless, they can be regarded independently of quantum mechanics: in a negative experiment it is assumed that an event has been observed by noting the absence of some other event. This is the case of neutral weak currents, which are observed by noticing the absence of charged muons. But the conceptual core of negative experiments can be understood by means of a very simple example. Let us consider a tube at whose middle point a classical particle is emitted towards one of the ends of the tube; a detection device is placed at one of the ends, say A, in order to know in which direction the particle was emitted. Since there is a perfect anticorrelation between the two ends of the tube, by looking at the right end A we can know the state at the left end B. Nevertheless, the instantaneous propagation of a signal between A and B is physically impossible. If after an appropriate time (depending on the velocity of the particle and the length of the tube) the device at A indicates no detection, we can conclude that the particle was emitted toward the left end B. But have we observed the direction of the emitted particle? On an informational account of scientific observation, the answer depends on the interpretation of the concept of information adopted. On the basis of an epistemic interpretation, a communication channel between the two ends of the tube can be defined, which allows us to observe the presence of the particle at B, even though there is no signal from B to A. The physical view leads to a concept of observation narrower than the previous one: by looking at the detector we observe the state at A, but we do not observe the state at B; such a state is inferred.
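In Shannon terms, the tube is a noiseless binary channel. A small sketch (assuming equiprobable emission directions) shows that the equivocation and the noise vanish, so one full bit of mutual information links the emission direction to the detector at A, even though no signal travels from B to A:

```python
import numpy as np

# Source: emission direction (left, right), equiprobable.
# "Receiver": the detector at end A, which clicks exactly when the
# particle travels to the right. The channel is deterministic.
p_s = np.array([0.5, 0.5])
ch = np.array([[1.0, 0.0],          # left  -> no click at A
               [0.0, 1.0]])         # right -> click at A

p_joint = p_s[:, None] * ch
p_r = p_joint.sum(axis=0)

avg_info = lambda p: float(np.sum(p[p > 0] * np.log2(1.0 / p[p > 0])))

# With a deterministic, perfectly anticorrelated channel N = E = 0,
# hence I(S;R) = I(S) = I(R) = 1 bit: looking at A fixes the state at B.
print(avg_info(p_s), avg_info(p_r))   # 1.0 1.0
```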
As has been repeatedly noticed, Shannon information is not tied to classical physics: any type of physical system can be used to design the informational situation (Timpson 2003, 2004; Duwell 2003). Therefore, Shannon’s theory can in principle be applied to the quantum domain, in particular to EPR-type experiments, characterized by theoretically well-founded correlations between two spatially separated particles. For many years it was repeated that information cannot be sent between the two particles because the propagation of a superluminal signal from one particle to the other is impossible: there is no information-bearing signal that can be modified at one point of space in order to carry information to the other, spatially separated, point. But the fact that the physical interpretation of information underlies that claim was usually not noticed. By contrast, the epistemic interpretation, which only requires correlations, would face no problem in defining an informational channel between the two EPR particles.

Disagreements increase when quantum information comes into play. Teleportation is one of the paradigmatic phenomena in this field. Broadly speaking, an unknown quantum state is transferred from Alice to Bob with the assistance of a shared pair prepared in an entangled state and of two classical bits sent from Alice to Bob (the description of the protocol can be found in any text on the matter; a minimal simulation is sketched at the end of this section). Although the situation is usually not strictly described in informational terms (neither Shannon’s nor quantum-informational), the idea is that the very large (strictly, infinite) amount of information required to specify the teleported state is transferred from Alice to Bob by sending only two bits. When addressing this problem, many physicists try to find a physical link between Alice and Bob that could play the role of carrier of the information. For instance, Penrose (1998) and Jozsa (1998, 2004) claim that the information may travel backwards in time: “How is it that the continuous ‘information’ of the spin direction of the state that she wishes to transmit […] can be transmitted to Bob when she actually sends him only two bits of discrete information? The only other link between Alice and Bob is the quantum link that the entangled pair provides. In spacetime terms this link extends back into the past from Alice to the event at which the entangled pair was produced, and then it extends forward into the future to the event where Bob performs his [measurement]” (Penrose 1998, 1928). According to Deutsch and Hayden (2000), the information travels hidden in the classical bits. These physicists do not explicitly acknowledge that the problem derives from the physical interpretation of information to which they strongly adhere, and that an epistemic view would not commit them to finding a physical channel between Alice and Bob. Of course, an elucidation of the concept of information does not dissolve all the conundrums involved in teleportation (see Timpson 2006), or in the phenomenon of entanglement that underlies it. Nevertheless, such an elucidation would help us to find a way out of the problems derived from the informational characterization of teleportation. One may wonder how essential the need for a spatio-temporal link is in the physical interpretation of information. Or one may reconstruct the situation in Shannon terms and conclude that the information effectively transmitted (the mutual information) is really not very large, to the extent that the receiver cannot retrieve the whole information generated at the source. Or one may even decide to leave aside the physical interpretation in favor of an epistemic view that recovers the relation between information and knowledge.
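The simulation announced above is sketched here. It is a standard state-vector reconstruction of the protocol (not an analysis due to any of the authors discussed), showing Bob recovering the continuously parametrized state although only two measured bits travel from Alice to him:

```python
import numpy as np

rng = np.random.default_rng(0)

# One-qubit gates and the two-qubit CNOT (control on the first qubit)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]])

# An arbitrary "unknown" state on Alice's qubit 0: two continuous
# parameters, i.e. strictly infinite classical information
theta, phi = 1.1, 0.7
psi = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

# Qubits 1 and 2: the shared pair in the entangled state |Phi+>
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
state = np.kron(psi, bell)                    # 3-qubit state vector

# Alice: CNOT on qubits (0, 1), then Hadamard on qubit 0
state = np.kron(CNOT, I2) @ state
state = np.kron(H, np.kron(I2, I2)) @ state

# Alice measures qubits 0 and 1: the two classical bits of the protocol
amps = state.reshape(2, 2, 2)                 # indices: qubits 0, 1, 2
p01 = (np.abs(amps) ** 2).sum(axis=2)         # p(m0, m1)
m0, m1 = np.unravel_index(rng.choice(4, p=p01.ravel()), (2, 2))

# Bob's qubit collapses accordingly; he applies Z^m0 X^m1 as correction
bob = amps[m0, m1]
bob = bob / np.linalg.norm(bob)
bob = np.linalg.matrix_power(Z, m0) @ np.linalg.matrix_power(X, m1) @ bob

# Up to a global phase, Bob now holds |psi>
print(np.abs(np.vdot(psi, bob)))              # ~1.0
```

Note that the two bits (m0, m1) are the only items that travel from Alice’s side to Bob’s in the sketch; everything else is correlation set up in advance by the shared pair.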
5. A Pluralist Approach to Information

Up to this point, the epistemic and the physical interpretations of Shannon information have been presented as rivals; nevertheless, this is not necessarily the case. Although the physical interpretation has been the most usual in the traditional textbooks used in engineers’ training, this has changed in recent times: in general, present-day textbooks explain information theory in a formal way, with no mention of sources, receivers or signals, and the basic concepts are introduced in terms of random variables and probability distributions over their possible values. Only once the formalism has been presented is the theory applied to the traditional case of communication. For instance, in their extensively used book, Cover and Thomas emphasize that: “Information theory answers two fundamental questions in communication theory […]. For this reason some consider information theory to be a subset of communication theory. We will argue that it is much more. Indeed, it has fundamental contributions to make in statistical physics […], computer sciences […], statistical inference […] and to probability and statistics” (1991, 1).

The idea that the concept of information is completely formal is not new. Already Khinchin (1957) and Reza (1961) conceived information theory as a new chapter of the theory of probability. From this perspective, Shannon information is not only not a physical magnitude; it also loses its nomic ingredient: the mutual information between two random variables can be defined even if there is no lawful relationship between them and their conditional probabilities express only de facto correlations. If the concept of information is purely formal and belongs to a mathematical theory, the word ‘information’ does not belong to the language of the empirical sciences, or to ordinary language: it has no extralinguistic reference in itself; its “meaning” has only a syntactic dimension. On this view, the generality of the concept of Shannon information derives from its exclusively formal nature, and this generality is what makes it a powerful formal tool for empirical science, applicable to a variety of fields.

From this formal perspective, the relationship between the word ‘information’ and the different views of information is the logical relationship between a mathematical object and its interpretations, each one of which endows the term with a specific referential content. The epistemic view, then, is only one of the different possible interpretations; it may be applied in psychology and in the cognitive sciences, by using the concept of information to conceptualize the human abilities of acquiring knowledge (see e.g. Hoel, Albantakis and Tononi 2013). The epistemic interpretation might also serve as a basis for the philosophically motivated attempts to add a semantic dimension to a formal theory of information (MacKay 1969; Nauta 1972; Dretske 1981). In turn, the physical view, which makes information a physical magnitude carried by signals, is clearly the appropriate interpretation in communication theory, where the main problem consists in optimizing the transmission of information by means of physical carriers whose energy and bandwidth are constrained by technological and economic limitations. But this is not the only possible physical interpretation: if S is interpreted not as a source with states but as a macrostate compatible with many equiprobable microstates, $I(S)$ represents the Boltzmann thermodynamic entropy of S. Furthermore, in computer science a computational information may be defined, such that, if S is interpreted as a binary string of finite length, $I(S)$ can be related to the algorithmic complexity of S.
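A quick numerical check of the thermodynamic reading (a toy example with an arbitrary number of microstates): for a macrostate compatible with W equiprobable microstates, $I(S)$ reduces to $\log_2 W$, which is Boltzmann’s $S = k \ln W$ up to the constant k and the base of the logarithm:

```python
import numpy as np

# A macrostate compatible with W equiprobable microstates (W arbitrary):
# the Shannon average equals log2(W), i.e. Boltzmann's S = k ln W up to
# the constant k and the change of logarithm base.
W = 1024
p = np.full(W, 1.0 / W)
print(np.sum(p * np.log2(1.0 / p)), np.log2(W))   # both give 10.0
```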
The understanding of the relationship between the formal concept of information and its interpretations helps in assessing the usually obscure extrapolations from communication theory to thermodynamics or computing. Summing up, this pluralist view about information rejects the question about “the” meaning of information: “The word ‘information’ has been given different meanings by various writers in the general field of information theory. [...] It is hardly to be expected that a single concept of information would satisfactorily account for the numerous possible applications of this general field” (Shannon 1993, 180).

6. References

Bar-Hillel, Yehoshua 1964. Language and Information: Selected Essays on Their Theory and Application. Reading, Mass.: Addison-Wesley.
Bar-Hillel, Yehoshua, and Rudolf Carnap 1953. “Semantic Information.” The British Journal for the Philosophy of Science 4:147-57.
Bell, David 1957. Information Theory and its Engineering Applications. London: Pitman & Sons.
Bennett, Charles, and Rolf Landauer 1985. “The Fundamental Physical Limits of Computation.” Scientific American 253:48-56.
Brown, Harold 1987. Observation and Objectivity. Oxford: Oxford University Press.
Brukner, Časlav, and Anton Zeilinger 2009. “Information Invariance and Quantum Probabilities.” Foundations of Physics 39:677-89.
Chaitin, Gregory 1987. Algorithmic Information Theory. New York: Cambridge University Press.
Cover, Thomas, and Joy Thomas 1991. Elements of Information Theory. New York: John Wiley & Sons.
Deutsch, David, and Patrick Hayden 2000. “Information Flow in Entangled Quantum Systems.” Proceedings of the Royal Society of London A 456:1759-74.
Dretske, Fred 1981. Knowledge and the Flow of Information. Oxford: Basil Blackwell.
Duwell, Armond 2003. “Quantum Information Does Not Exist.” Studies in History and Philosophy of Modern Physics 34:479-99.
Fisher, Ronald 1925. “Theory of Statistical Estimation.” Proceedings of the Cambridge Philosophical Society 22:700-25.
Floridi, Luciano 2010. Information – A Very Short Introduction. Oxford: Oxford University Press.
Hoel, Erik, Larissa Albantakis, and Giulio Tononi 2013. “Quantifying Causal Emergence Shows that Macro Can Beat Micro.” Proceedings of the National Academy of Sciences 110:19790-95.
Jammer, Max 1974. The Philosophy of Quantum Mechanics. New York: John Wiley & Sons.
Jozsa, Richard 1998. “Entanglement and Quantum Computation.” In The Geometric Universe, ed. S. Huggett, L. Mason, K. P. Tod, S. T. Tsou, and N. M. J. Woodhouse, 369-79. Oxford: Oxford University Press.
––– 2004. “Illustrating the Concept of Quantum Information.” IBM Journal of Research and Development 4:79-85.
Khinchin, Aleksandr 1957. Mathematical Foundations of Information Theory. New York: Dover.
Kosso, Peter 1989. Observability and Observation in Physical Science. Dordrecht: Kluwer.
Landauer, Rolf 1991. “Information is Physical.” Physics Today 44:23-29.
––– 1996. “The Physical Nature of Information.” Physics Letters A 217:188-93.
MacKay, Donald 1969. Information, Mechanism and Meaning. Cambridge: MIT Press.
Nauta, Doede 1972. The Meaning of Information. The Hague: Mouton.
Penrose, Roger 1998. “Quantum Computation, Entanglement and State Reduction.” Philosophical Transactions of the Royal Society of London A 356:1927-39.
Reza, Fazlollah 1961. Introduction to Information Theory. New York: McGraw-Hill.
Rovelli, Carlo 1996. “Relational Quantum Mechanics.” International Journal of Theoretical Physics 35:1637-78.
Schumacher, Benjamin 1995.
“Quantum Coding.” Physical Review A 51:2738-47.
Shannon, Claude 1948. “The Mathematical Theory of Communication.” Bell System Technical Journal 27:379-423.
––– 1993. Collected Papers, ed. Neil Sloane and Aaron Wyner. New York: IEEE Press.
Shapere, Dudley 1982. “The Concept of Observation in Science and Philosophy.” Philosophy of Science 49:485-525.
Stonier, Tom 1990. Information and the Internal Structure of the Universe: An Exploration into Information Physics. New York-London: Springer.
––– 1996. “Information as a Basic Property of the Universe.” Biosystems 38:135-40.
Timpson, Christopher 2003. “On a Supposed Conceptual Inadequacy of the Shannon Information in Quantum Mechanics.” Studies in History and Philosophy of Modern Physics 34:441-68.
––– 2004. Quantum Information Theory and the Foundations of Quantum Mechanics. PhD diss., University of Oxford (quant-ph/0412063).
––– 2006. “The Grammar of Teleportation.” The British Journal for the Philosophy of Science 57:587-621.
––– 2008. “Philosophical Aspects of Quantum Information Theory.” In The Ashgate Companion to the New Philosophy of Physics, ed. Dean Rickles, 197-261. Aldershot: Ashgate Publishing.
Zeilinger, Anton 1999. “A Foundational Principle for Quantum Mechanics.” Foundations of Physics 29:631-43.