key: cord-0968843-7sq41zd5
authors: Haig, David
title: A Textual Deconstruction of the RNA World
date: 2021-08-23
journal: Biosemiotics
DOI: 10.1007/s12304-021-09444-w
sha: 2e662e7e652ad4e073d6261f1c14020936ba7ae7
doc_id: 968843
cord_uid: 7sq41zd5

RNAs can do many things. They can store information, act in the world, and respond to the world. Because of these capabilities biologists have proposed a primordial ‘RNA world’ in which RNA, rather than DNA, performed the central role of replicator and repository of adaptive information. Deacon dismisses this hypothesis because replication is not about anything and because the structure of replicating molecules cannot contain information about the environment. I dispute both claims. An RNA and its opposite-sense complement represent each other and, by two rounds of complementation, represent themselves. Although (with some exceptions) nucleic acid sequences do not change in response to their present environment, these sequences embody information about ancestral environments via the selective filtering of alternative sequences in those environments. Nucleic acid sequences are the textual record of what has worked in the past.

Deacon criticizes the idea that RNA replicators preceded the evolution of proteins. He reverses the "currently popular view that replicating molecules intrinsically constitute biological information" in favor of an evolutionary sequence in which "protein-like molecules are present long before nucleic acids." I am a fan of the RNA-world and the central role of replication in the origin of life. I also believe that most heritable information resides in nucleic acids and find the autogen to be a rather complex candidate for 'simplest possible' interpreter. Deacon and I agree that the construction of meaning resides in mechanisms of interpretation not in the inputs being interpreted. What then is the nature of our disagreements over the credentials of the RNA world and the centrality of replication?
Do we merely use different words to describe similar concepts or is something substantive at stake?

From Darwin to Derrida (D2D, Haig, 2020) considers the origins of biological interpretation. An analysis of where Deacon and I differ will require an introduction to D2D's formal vocabulary. D2D (Chap. 12) defines an interpreter as a mechanism that uses information in choice. Such mechanisms are associated with two entropies: an 'uncertainty' with respect to what could be observed from a repertoire of possible inputs (information); and an 'indecision' with respect to what action should be performed from a repertoire of possible outputs (meaning). Uncertainty is resolved by observation, indecision by choice of action. If an observation 'informs' an interpreter's choice of action, then the chosen action is the meaning of the information for the interpreter. It is a 'difference that makes a difference.' The internal mechanisms of interpreters determine how different inputs are related to different outputs. These mechanisms have evolved (or been trained or designed) to make apt connections between inputs and outputs. Complex interpreters are made up of systems of simpler interpreters. D2D defines a 'text' as an output that is intended to be used as an input to another process of interpretation. In D2D's definitions, meaning (output) encompasses all choices among alternatives, including firing or non-firing of a neuron, formation of a memory, construction of a phenomenal experience, and spoken and written statements about what inputs mean.

Consider a watcher in a firetower with her finger on an alarm. Her internal mechanisms couple one bit of uncertainty (smoke or no-smoke) to one bit of indecision (alarm or no-alarm). An observation informs her choice of action: 'smoke' means 'alarm', 'no-smoke' means 'no-alarm'. Now consider another watcher for tell-tale whiffs of smoke outside a papal conclave.
Her internal mechanisms couple one bit of uncertainty about the color of smoke from burning ballots (white or black) to one bit of indecision about the outcome of the most recent vote by the college of cardinals (decisive or indecisive): 'black' means the cardinals remain undecided; 'white' means they have elected a new pope. Smoke is not a text for the first watcher. It is unintended evidence of a fire. Smoke is a text for the second watcher. Its color was intended by the burner of ballots to be interpreted appropriately by watchers whose interpretative mechanisms were already adapted to the arbitrary code. (Our freedom resides in the immense degrees of freedom of our uncertainty and indecision, not merely single bits, and our ability to rewire our internal mechanisms from experience.)

Let us now consider how a coronavirus genome is replicated. The semantics are simple: U means A, A means U, C means G, G means C (U = uracil; A = adenine; C = cytosine; G = guanine). Two successive applications of these semantic rules by an RNA-dependent RNA polymerase are required for replication of the viral genome: first a 'positive-sense' sequence is interpreted as its 'negative-sense' complement, then the 'negative-sense' sequence is interpreted as its 'positive-sense' complement. Positive-sense and negative-sense are reciprocally necessary for each other's production. The positive-sense genome can also be interpreted by a ribosome as instructions for how to produce viral proteins (including RNA-dependent RNA polymerases and capsid proteins).

[Coronaviruses are 'positive-strand RNA viruses' in which infectious particles contain a positive-sense single-stranded RNA from which proteins can be directly translated. Ebola virus, by contrast, is a 'negative-strand RNA virus' in which proteins are translated from positive-sense messenger RNAs transcribed from the negative-sense genome of the infectious particle.]
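The pairing rules above, and the claim that two rounds of complementation regenerate the original sequence, can be sketched in a few lines of Python. This is only an illustration: the function names and the toy sequence are hypothetical, and the reversal step reflects the standard fact that paired strands run in opposite directions, so the complement written 5'→3' is in reverse order.

```python
# Pairing rules as stated in the text: U means A, A means U, C means G, G means C.
PAIRING = {"U": "A", "A": "U", "C": "G", "G": "C"}


def complement_strand(rna: str) -> str:
    """Return the opposite-sense strand, written 5'->3'.

    Each base is replaced by its pairing partner. The result is reversed
    because paired strands run antiparallel along the chemical backbone.
    """
    return "".join(PAIRING[base] for base in reversed(rna))


def replicate(positive_sense: str) -> str:
    """Apply the pairing rules twice: positive -> negative -> positive."""
    negative_sense = complement_strand(positive_sense)
    return complement_strand(negative_sense)


if __name__ == "__main__":
    genome = "AUGCGUAC"  # hypothetical toy sequence, not a real viral genome
    print(complement_strand(genome))     # the negative-sense intermediate
    print(replicate(genome) == genome)   # two rounds give back the original
```

A single application yields a different sequence; only the second application returns the starting text, which is the sense in which the two strands represent each other and, together, represent themselves.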
The positive-sense genome of a coronavirus is intended to be interpreted by a polymerase as negative-sense RNA and by host ribosomes as instructions for the assembly of viral proteins. The same text means different things to different interpreters. Negative-sense RNA is intended to be interpreted as positive-sense RNA. The intentionality of these processes arises from reproductive recursion: genomes that are copied today were copied in the past but genomes that were not copied in the past are not present to be copied today; only genomes of the past that encoded functional polymerases have left living descendants.

Cells can be simultaneously infected by two coronaviruses whose genomes encoded different capsid proteins and different polymerases. In such cases of superinfection, it is possible that an α-strain genome could be replicated by a β-strain polymerase and packaged in β-strain capsid proteins. However, if the resulting chimeric viral particle infects a new cell, the ribosomes of that cell will produce α-strain polymerases and α-strain capsid proteins. The α-strain RNA informs the assembly of the next generation of proteins. Capsid proteins are not a text that informs the next generation of proteins. The heritable transfer of information resides in the text (RNA) not in the interpreters of the text (polymerases and ribosomes), but a text only has meaning through its interpreters.

Here is the rub. A coronavirus replicates by using the protein-synthesizing machinery of a host cell. Its RNA genome can do nothing on its own. The modern coronavirus must trace its ancestry to a much simpler form of replication. Deacon is dismissive of the idea that replication of RNAs could be central to the origins of life: "replication isn't about anything … [it] just is what gets copied or not". Furthermore, "replicating molecules are passive artifacts.
They don't actively adapt to their environments, and so their structure does not contain or acquire information about the environment." A DNA sequence is 'passive' in the same sense that Deacon's essay is passive. It provides information to be interpreted but is not an 'active' participant in the interpretations of its readers. However, I strongly disagree with Deacon that replicating molecules do not contain information about the environment. Differential replication, dismissed by Deacon as "what gets copied or not", results in the accumulation of information in surviving texts about what has worked in past environments. This information comes from the environment unintendedly choosing embodied interpretations of alternative texts. By this process, surviving texts come to provide specifications for the assembly of interpreters that actively interpret information from their environment.

I also disagree that replication is not about anything. A positive-sense RNA informs synthesis of a negative-sense RNA which informs synthesis of a copy of the original positive-sense RNA. Aboutness, for Deacon, is being about something different. Deacon denies that RNA replication possesses aboutness because what is replicated is the same as the starting point, but this ignores the two rounds of application of the semantic rules of RNA complementation that are required to regenerate the original sequence. Positive-sense and negative-sense RNAs represent each other in different chemical forms, with complementary nucleotides in reverse order relative to the chemical backbone.

The Iliad and the Odyssey are now learned from written texts or recited directly from written texts. The passive text is now distinct from active interpretation of the text, but the poems were composed long before they were first written down. During this earlier stage of transmission, bards learned an oral text from hearing recitations of other bards.
Text and interpretation were united in active oral performances in bardic chains of transmission. Recitation was much easier once there was a written text. Information could be more easily retained by writing it down than by committing past performances to memory.

The RNA-world hypothesis posits that RNAs were once active participants in their own replication. They embodied both text and performance. They were not merely the passive record of past choices of nature, copied by simple semantic rules, but also dynamic actors in the world, including active participants in their own replication. The need to record and to act undoubtedly involved compromises for the optimal performance of both functions. An advantage from separating the archival function from active interpretation of the text may have been one reason why relatively passive DNA prevailed over more active RNA as the archival repository of hereditary information. At the same time, many of the metabolic functions of RNAs have been assumed by proteins with their choice of 20 amino acid side-chains rather than the four nucleotides of RNA. Proteins are a more expressive language than RNA. However, this relegation of RNA to the role of a messenger between DNA text and protein performance neglects the central roles that 'non-coding' RNAs continue to play in information-processing within cells (Mainieri & Haig, 2018, 2019; Haig & Mainieri, 2020).

Chapter 13 of D2D describes ancient RNA sequences (riboswitches) that function as simple interpreters by conformational changes that respond to things present in nature. The 'aptamer' of a riboswitch detects something physical, often a close chemical relative of a ribonucleotide, and then the 'expression platform' responds with some action. There is no physicochemical reason why a particular aptamer must be associated with a particular expression platform.
Indeed, a synthetic biologist can exploit the 'arbitrary nature of the sign' to create useful devices that respond to the world in novel ways by recombining aptamers and expression platforms. The response to a particular molecule is physicochemically arbitrary, but the riboswitches that have survived the filter of natural selection combine an aptamer with an expression platform that is functionally apposite. A successful riboswitch responds to information from the world with meaningful action. In the RNA world, a riboswitch was responsible not only for its interpretation of the world but also for its replication in the world. Modern riboswitches do not replicate but are transcribed from a DNA text.

One of Deacon's leitmotifs is to downplay the informational functions of nucleic acids compared to the metabolic functions of proteins. He sees the energetic functions of nucleotides as primary and their informational function as secondary. In his view, the encoding of proteins by nucleic acids evolved as a constraint on possible proteins. (An advocate for nucleic acids might say instead that the code determines the specificity of proteins.) In general, Deacon sees the informational and replicative functions of nucleic acids as relative late-comers. By contrast, the RNA-world hypothesis sees proteins as relative late-comers. It posits that the RNA hexanucleotides 5'-AUGUGG-3' and 3'-UACACC-5' reciprocally represented each other before 5'-AUGUGG-3' came to represent methionine-tryptophan or 5'-CCACAU-3' came to represent proline-histidine. Metabolic RNAs may have been interacting with amino acids as co-factors long before RNAs began to guide the linking together of amino acids in specified sequences as complex polypeptides. If metabolism and replication evolved together, then it would be a moot point which came first.
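The later, codon-level kind of representation mentioned above, in which a hexanucleotide read 5'→3' stands for a dipeptide, can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the `translate` helper is hypothetical, and the table is a deliberately partial excerpt of the standard genetic code, holding only a handful of codons rather than all 64.

```python
from typing import List

# Partial excerpt of the standard genetic code (assumption: only the few
# codons needed for short illustrative sequences are listed here).
CODONS = {
    "AUG": "Met",
    "UGG": "Trp",
    "UAC": "Tyr",
    "ACC": "Thr",
    "CCA": "Pro",
    "CAU": "His",
}


def translate(rna: str) -> List[str]:
    """Read an RNA string 5'->3' in non-overlapping triplets.

    Raises ValueError if the length is not a multiple of three and
    KeyError if a codon is missing from the partial table above.
    """
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODONS[rna[i:i + 3]] for i in range(0, len(rna), 3)]


if __name__ == "__main__":
    print(translate("AUGUGG"))  # ['Met', 'Trp']
```

The lookup table is arbitrary in the same sense as the smoke code of the conclave: nothing in the function constrains which triplet maps to which amino acid, so the mapping has to be supplied by the interpreter.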
A leitmotif of D2D is that arguments about which came first, the proverbial chicken or epigrammatic egg, often misunderstand the nature of reproductive recursion and its implications for concepts of cause and effect. The fundamental confusion comes from not distinguishing particular events from kinds of events. It is a matter of fact whether a particular chicken laid an egg or hatched from that egg (it cannot do both), but chickens considered historically have been both causes and effects of eggs. The relationship of genotype to phenotype resembles that between chickens and eggs. Genotypes inform the construction of phenotypes but the information in genes is derived from past selection of phenotypes that determines which genotypes are replicated. Genotypes considered historically have been both causes and effects of phenotypes. An explanation of causes by their effects is an explanation by final causes. One can truly say that genotypes exist for the sake of phenotypes and phenotypes exist for the sake of genotypes. Similarly, proteins exist for the sake of genes and genes exist for the sake of proteins. The recursive testing and emendation of a reliable text is the fundamental source of the intentionality of living things.

In the final stages of writing this commentary, I read Hoffmeyer and Emmeche (1991). The parallels between what I had just written and what I was now reading were unheimlich. We even used the same metaphors, not to mention our mutual fascination with chickens and eggs. "Self-reference is the fundament on which life evolves, the most basal requirement" (p. 126). The transition from RNA to DNA resembled the transition from a spoken to a written language (p. 142). The informational sphere of RNA is separated from the functional sphere of peptides (p. 151). Despite significant differences in our approaches, the similarities are striking.

Meaning resides in the use of information in choice.
The useful information in genetic sequences has come from past choices of the environment that have caused some sequences to be replicated and others eliminated. The genetic sequence is a text that preserves information about what has worked in the past. The evolving text is the trace of différance, a record of the unintended choices of nature that have selected intentional organisms with purposeful parts and, at the same time, it is the record of alternatives rejected because there can be no choice without a difference. No meaningful information can accumulate without a record of past choices. The origin of meaning was the origin of textual inscription. In the beginning was the word.

References

How molecules became signs
From Darwin to Derrida
The evolution of imprinted microRNAs and their RNA targets
Lost in translation: the 3'-UTR of IGF1R as a long noncoding RNA. Evolution
Retrotransposon gag-like 1 (RTL1) and the molecular evolution of self-targeting imprinted microRNAs