key: cord-0323677-t0hwjw0e authors: U.A.Rozikov, title: Thermodynamics of DNA-RNA renaturation date: 2020-07-01 journal: nan DOI: 10.1142/s0219887821500961 sha: 8d54e9ba777f0b10844e4d14504feba91c4045b2 doc_id: 323677 cord_uid: t0hwjw0e

We consider a new model which consists of a DNA together with an RNA. Here we assume that the DNA is from a mammal or bird, while the RNA comes from a virus. To study the thermodynamic properties of this model we use methods of statistical mechanics, namely the theory of Gibbs measures. We use these measures to describe phases (states) of the DNA-RNA system. Using a Markov chain (corresponding to a Gibbs measure) we give conditions (on temperature) for DNA-RNA renaturation.

Each molecule of DNA is a double helix formed by two complementary strands of nucleotides held together by hydrogen bonds between G+C and A+T base pairs, where C = cytosine, G = guanine, A = adenine, and T = thymine. Duplication of the genetic information occurs by the use of one DNA strand as a template for the formation of a complementary strand. The genetic information stored in an organism's DNA contains the instructions for all the proteins the organism will ever synthesize. It is known (see, for example, [1]) that genetic information is carried in the linear sequence of nucleotides in DNA. Many experimental and theoretical works have brought quantitative insights into DNA base-pairing dynamics, which is reviewed in [7].

RNA is a polymeric molecule essential in various biological roles in the coding, decoding, regulation and expression of genes. RNA is assembled as a chain of nucleotides, but unlike DNA, RNA is found in nature as a single strand folded onto itself, rather than a paired double strand. Cellular organisms use messenger RNA to convey genetic information (using the nitrogenous bases C, G, A and U = uracil) that directs the synthesis of specific proteins. All viruses contain nucleic acid, either DNA or RNA (but not both), and a protein coat, which encases the nucleic acid.
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans, these viruses cause respiratory tract infections that can range from mild to lethal. In this paper we study thermodynamic properties of a model which consists of a DNA (from a mammal or bird) together with an RNA (from a virus).

Studying the thermodynamics of DNA, one wants to know how temperature affects the nucleic acid structure of double-stranded DNA [6]. There are few models of the thermodynamics of DNA ([2], [9], [14]). In the recent papers [11], [12] we gave Ising and Potts models of DNA and studied their thermodynamics. Here we shall use the arguments of these papers to study the thermodynamic behavior of a system consisting of a DNA and an RNA.

The paper is organized as follows. In Section 2 we give the main definitions and define our model of DNA and RNA. Moreover, we give a system of functional equations, each solution of which defines a consistent family of finite-dimensional Gibbs distributions and guarantees the existence of the thermodynamic limit for such distributions. These Gibbs measures are important for describing states of the DNA-RNA system. Section 3 is devoted to translation invariant Gibbs measures (i.e. constant solutions of the system of functional equations). We show uniqueness of the translation invariant Gibbs measure (depending on the parameters of the model). In the last section, using properties of Markov chains (corresponding to Gibbs measures), we give conditions (on temperature) for DNA-RNA renaturation.

2. System of equations describing DNA-RNA renaturation

The structure of DNA, at the microscopic level, can be described using ideas from statistical physics (see [13], [15]), whereby a single DNA strand is modelled as a stochastic system of interacting bases with long-range correlations.
This approach makes an important connection between the structure of a DNA sequence and temperature; e.g., phase transitions in such a system may be interpreted as a conformational (topological) restructuring.

In this section we consider a new model which consists of a DNA together with an RNA. The bases in nucleic acids can interact via hydrogen bonds. Base pairing stabilizes the native three-dimensional structures of DNA and RNA. Our interpretation of this system is that the RNA tries to denature the DNA and renature a new DNA by adding its own nitrogenous bases (as an analogue of a coronavirus's RNA).

DNA denaturation is the breaking, under treatment by heat, of the hydrogen bonds connecting the two strands [3], [15]. The process consists of the splitting of DNA base pairs, or nucleotides, resulting in the separation of two complementary DNA strands. In the past decades DNA denaturation has attracted the interest of various researchers, who introduced and studied statistical and dynamical models of this fundamental biological process (see [5], the recent paper [8] and the references therein).

It is known that in a DNA each A + T pair is connected by two hydrogen bonds, while each C + G pair is connected by three hydrogen bonds. Therefore in this section we model them as (spin value) 2 = A + T, 3 = C + G. A hydrogen bond melted (broken) under treatment by heat is assigned the (spin value) 0. The base pairs A + T (in DNA) and A + U (in RNA) are considered identical in the process of renaturation of DNA from the RNA (of the virus).

Then a DNA can be considered as a ladder shown in Fig. 1. An RNA is a one-dimensional line (also shown in Fig. 1). Thus our (spin) system is a double ladder whose levels are denoted by integers $n \in \mathbb{Z}$. Assume a base pair is either broken or intact. To each base pair of level $i \in \mathbb{Z}$ assign two parameters: $d_i$ (to the base pair of DNA) and $r_i$ (to the base pair of renatured DNA, i.e. between old DNA and RNA).
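These parameters are defined as

$$d_i = \begin{cases} 0, & \text{if the $i$th base pair of DNA is broken} \\ 2, & \text{if the $i$th base pair of DNA is intact and at state } A + T \\ 3, & \text{if the $i$th base pair of DNA is intact and at state } C + G, \end{cases}$$

$$r_i = \begin{cases} 0, & \text{if the $i$th base pair between DNA and RNA is broken} \\ 2, & \text{if the $i$th base pair between DNA and RNA is intact and at state } A + T \\ 3, & \text{if the $i$th base pair between DNA and RNA is intact and at state } C + G. \end{cases}$$

A configuration consisting of the values 0, 2, 3 is the state of DNA-RNA denaturation-renaturation at a given temperature $T$. The value 0 means that the corresponding pair is broken (melted).

Since the RNA (as in a coronavirus) breaks a base pair of the DNA and puts its own pair in its place, we have the condition

$$d_i r_i = 0 \quad \text{for all } i \in \mathbb{Z}. \tag{2.1}$$

Thus the configuration space $\Omega$ of our system consists of the configurations

$$\Omega = \big\{ \sigma = \{(d_i, r_i)\}_{i \in \mathbb{Z}} : d_i, r_i \in \{0, 2, 3\},\ d_i r_i = 0 \big\}.$$

For each $\sigma \in \Omega$ define its energy (Hamiltonian) by formula (2.2), where $J \in \mathbb{R}$ is a coupling constant between base pairs, $\alpha \in \mathbb{R}$ is an external field and $\delta$ is the Kronecker delta:

$$\delta(a, b) = \begin{cases} 1, & \text{if } a = b \\ 0, & \text{if } a \neq b. \end{cases}$$

Denote by $\sigma_n$ the restriction of the configuration $\sigma \in \Omega$ to $Z_n = \{-n, -n+1, \dots, n-1, n\}$ and by $\Omega_n$ the set of all such configurations. In general, for a subset $A \subset \mathbb{Z}$ denote by $\Omega_A$ the set of all configurations restricted to $A$.

Define a finite-dimensional distribution of a probability measure $\mu$ on $\Omega_n$ by (2.3), in which the values $h_{n,i,j}$ of (2.4) are real numbers. We say that the probability distributions (2.3) are compatible if for all $n \geq 1$ and $\sigma_{n-1} \in \Omega_{n-1}$:

$$\sum_{\omega_n \in \Omega_{Z_n \setminus Z_{n-1}}} \mu_n(\sigma_{n-1} \vee \omega_n) = \mu_{n-1}(\sigma_{n-1}).$$

Here $\sigma_{n-1} \vee \omega_n$ is the concatenation of the configurations. In this case there exists a unique measure $\mu$ on $\Omega$ such that, for all $n$ and $\sigma_n \in \Omega_n$,

$$\mu\big(\{\sigma|_{Z_n} = \sigma_n\}\big) = \mu_n(\sigma_n).$$

Such a measure is called a Gibbs measure corresponding to the Hamiltonian (2.2) and the values (2.4).

For simplicity we impose a further condition on these values. Under this condition the following statement describes conditions on $h_{n,i,j}$ guaranteeing compatibility of $\mu_n(\sigma_n)$.

Theorem 2.1. The probability distributions $\mu_n(\sigma_n)$, $n = 1, 2, \dots$, in (2.3) are compatible iff for any $n \geq 1$ the system (2.7) holds. Here $\theta = \exp(J\beta)$ and $\eta = \exp(\alpha\beta)$.

Proof. The proof is similar to the proof of Theorem 2.1 of [10].

It is difficult to find general solutions to (2.7).

Remark 2.1. For $\theta = 1$ (i.e. $J = 0$) the system (2.7) has the unique solution $x_n = y_n = v_n = u_n = 1$. Therefore below we consider the case $\theta \neq 1$.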
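The parameters $\theta = \exp(J\beta)$ and $\eta = \exp(\alpha\beta)$ of Theorem 2.1 depend on temperature only through $\beta$. The following is a minimal Python sketch, assuming $\beta = 1/T$ in units where the Boltzmann constant equals 1; the function name `theta_eta` and the sample values of $J$, $\alpha$, $T$ are illustrative assumptions, not taken from the paper. It evaluates both parameters and shows that $J = 0$ gives $\theta = 1$, the case singled out in Remark 2.1.

```python
import math


def theta_eta(J, alpha, T):
    """Return (theta, eta) = (exp(J/T), exp(alpha/T)) for a temperature T > 0.

    Assumption: beta = 1/T, i.e. the Boltzmann constant is set to 1.
    """
    if T <= 0:
        raise ValueError("temperature must be positive")
    beta = 1.0 / T
    return math.exp(J * beta), math.exp(alpha * beta)


# Illustrative values only (not from the paper).
for T in (0.5, 1.0, 2.0):
    theta, eta = theta_eta(J=1.0, alpha=-0.3, T=T)
    print(f"T = {T}: theta = {theta:.4f}, eta = {eta:.4f}")

# J = 0 gives theta = 1 at every temperature (the case of Remark 2.1).
print(theta_eta(J=0.0, alpha=-0.3, T=1.0)[0])
```

For $J > 0$ the value of $\theta$ decreases towards 1 as $T$ grows, so high temperature approaches the trivial case of Remark 2.1.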
We assume that the unknowns do not depend on $n$, i.e. the value of each unknown is translation invariant. Therefore denote $x = \theta\eta^2 x_n$, $y = \theta\eta^2 y_n$, $u = \theta\eta^3 u_n$, $v = \theta\eta^3 v_n$. Then the system (2.7) reduces to finding the fixed points of the mapping $F$, i.e., to solving the system $(x, y, u, v) = F(x, y, u, v)$.

From the first equation of this system we get the expression (3.4) for $u$ in terms of $x$. Substituting this in the second equation of (3.2) we get a cubic equation in $x$. All of the roots of the cubic equation can be found. We are interested in positive solutions $x_i = x_i(\theta, \eta)$ of the cubic equation. Moreover, the corresponding $u(x_i)$ defined in (3.4) should be positive too. Thus the condition on the parameters $(\theta, \eta) \in \mathbb{R}^2_+$ for the existence of positive solutions can be written explicitly as $x_i(\theta, \eta) > 0$, $u(x_i) > 0$. But the explicit solutions of the cubic equation have bulky formulas, therefore we do not present them here. Instead we consider some concrete cases:

3) In the above-mentioned case 1) we solved the first equation of the system (3.2) with respect to $u$. Making a similar argument starting from the second equation of (3.2) and solving it with respect to $x$, one gets $u = u_1 = \theta/2$ if $\eta = \sqrt[3]{\theta/(1+\theta)}$. Corresponding to $u_1$ one can explicitly find a unique positive value $x = x_1(\theta)$. Thus for any $\theta > 0$ and $\eta = \sqrt[3]{\theta/(1+\theta)}$ we can explicitly give a unique solution $(x_1, u_1)$ of (3.2).

4) Case: $\eta = 1$. In this case the cubic equation reduces to a quadratic equation, which has a unique positive solution $x_2 = x_2(\theta)$. The corresponding $u_2 = u(x_2(\theta))$ is also positive. Note that $x_2(\theta)$ and $u_2(\theta)$ tend to $+\infty$ (resp. $1/2$) as $\theta \to \infty$ (resp. $\theta \to 0$). Thus in the case $\eta = 1$ the system (3.2) has a unique solution $(x_2, u_2)$.

5) Numerical analysis shows that for $\eta \neq 1$ we again have a unique solution (see Fig. 2 and 3).

Subtracting from the first equation of this system the second one (resp. from the third equation the last one) we get (3.7), where

$$L \equiv L(x, y, u, v) = \eta^2 \cdot \frac{\theta}{\theta^2 + x + y + u + v}.$$

Recall that $\theta \neq 1$.

Proof.
Since $L > 0$ and $\theta \neq 1$, if $x = y$ then from the first equation of (3.7) we get $u = v$. If $u = v$ then from the second equation of (3.7) we get $x = y$. Assume $x \neq y$. Then finding $u - v$ from the first equation of (3.7) and substituting it in the second equation, we obtain (3.8).

Theorem 3.1. The system (3.6) does not have any solution in $\mathbb{R}^2_+ \setminus M$.

Proof. From Lemma 3.3 it follows that in $\mathbb{R}^2_+ \setminus M$ there may only exist solutions with $x \neq y$ and $u \neq v$. Therefore we denote $t = \frac{u - v}{x - y}$.

Case 1: $t > 0$. Assuming $t > 0$, from (3.7) we get equation (3.9) for $t$. The last equation has a unique positive root $t_1$. Thus $u - v = t_1 \cdot (x - y)$. Using this, from the first equation of (3.7) we get (3.10). One can see that $L$ satisfies (3.8). By the last formula we get a system of linear equations of the form $Mv = b$, where the constant $B$ entering it is given by (3.12). We are interested in positive solutions of the system. By this system of linear equations we get (3.13). Using the formulas for $A$, $C$ and $t_1$ one can see that (3.13) is satisfied. Moreover, we have (3.14).

In the case $\det(M) \neq 0$ the system $Mv = b$ has a unique solution, with $x = y$ and $u = v$. To have other solutions (with the condition $x \neq y$ and $u \neq v$) we need the condition $\det(M) = 0$, which by (3.13) is satisfied, and $\operatorname{rank}(M) = 3$. Solving the linear system $Mv = b$ under condition (3.13), we explicitly obtain infinitely many solutions, for which $x < 0$ and $u < 0$, i.e., there is no positive solution with $x \neq y$ and $u \neq v$.

Case 2: $t < 0$. In this case from (3.9) we get a unique negative root $t_2$. Thus $u - v = t_2 \cdot (x - y)$. Using this, from the first equation of (3.7) we get an equality in which, since $\theta > 0$ and $L > 0$, it is necessary that $(\theta - 1)(1 + \theta + t_2) > 0$. It is easy to see that $t_2 > -1 - \theta$. Consequently, the system may have a solution only for $\theta > 1$. Therefore we have (3.16). By the last formula we get the system (3.17), with a constant $B_2$. Thus for $\theta > 1$ the system is a linear system of equations of the form $Nv = b$, with $N$ and $b$ given in (3.18). To have a positive solution of this system a necessary equality must hold. But this equality does not hold because $\theta > 1$ and $t_2 < 0$.
Thus in the case $t < 0$ the system (3.17) does not have any positive solution. This completes the proof of the theorem.

Let $\mu$ be the Gibbs measure corresponding to a solution $(x, y, u, v)$ of the system (3.6). The measure $\mu$ defines the joint distribution

$$\mu\big(\sigma(n) = (d_n, r_n),\ \sigma(n+1) = (d_{n+1}, r_{n+1})\big) = \frac{1}{Z} \exp\Big( J\beta \big(\delta(d_n, d_{n+1}) + \delta(r_n, r_{n+1})\big) + \sum_{m \in \{n, n+1\}} \cdots \Big),$$

where $Z$ is a normalizing factor. From this, the relation between the solutions $(x, y, u, v)$ and the transition matrix of the associated Markov chain is obtained from the formula for conditional probability. The transition matrix $P$ of this Markov chain is defined in (4.1), where $(x, y, u, v)$ is a solution of system (3.6) (which depends on both parameters $\theta$ and $\eta$).

Since $P$ is a positive stochastic matrix, there exists a unique probability vector $\pi = (\pi_1, \dots, \pi_5)$ which satisfies the system of linear equations $\pi P = \pi$ (i.e. $\pi$ is the stationary distribution). Note that this linear system can be solved explicitly. Its solution $\pi$ depends on both parameters $\theta$ and $\eta$, subject to the constraint that its elements sum to 1. But the coordinates of the vector $\pi$ have a bulky form, therefore we do not present them here.

The following is a consequence of the known (see p. 55 of [4]) ergodic theorem for positive stochastic matrices.

Theorem 4.1. For the matrix $P$ defined in (4.1) and its stationary distribution $\pi$ the following holds:

$$\lim_{n \to \infty} X P^n = \pi$$

for every initial probability vector $X$.

Recall that $r_i = 0$ means that the RNA does not renature the DNA at level $i \in \mathbb{Z}$. Thus for a given DNA and RNA we say that the RNA (virus) does not destroy the DNA if for any $\epsilon > 0$ there exists $N \geq 1$ such that for any $n \geq N$ the inequality $\mu(\Omega^0_n) > 1 - \epsilon$ holds, where $\mu$ is a Gibbs measure corresponding to a solution of system (2.7) and

$$\Omega^0_n = \big\{ \sigma_n = \{(d_i, r_i)\} \in \Omega_n : r_i = 0,\ \forall i \in Z_n \big\}.$$

Note that each element of $\Omega^0_n$ defines a DNA, which has thermodynamic behavior.
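The convergence stated in Theorem 4.1 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: since the entries of $P$ in (4.1) depend on the solution $(x, y, u, v)$, a randomly generated positive $5 \times 5$ stochastic matrix is used as a hypothetical stand-in for $P$, and the helper names (`random_positive_stochastic`, `step`, `iterate`) are introduced here for illustration. Starting from two different initial probability vectors $X$, repeated multiplication by the matrix leads to the same vector, which also satisfies $\pi P = \pi$.

```python
import random


def random_positive_stochastic(k, seed=0):
    """Hypothetical stand-in for P: a k x k matrix with positive entries, rows summing to 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(k):
        row = [rng.uniform(0.1, 1.0) for _ in range(k)]
        total = sum(row)
        rows.append([a / total for a in row])
    return rows


def step(x, P):
    """One step of the chain: the row vector x multiplied by P."""
    k = len(P)
    return [sum(x[i] * P[i][j] for i in range(k)) for j in range(k)]


def iterate(x, P, n):
    """Compute x P^n by n successive multiplications."""
    for _ in range(n):
        x = step(x, P)
    return x


P = random_positive_stochastic(5)

# Two different initial probability vectors X, each concentrated on one state.
x_first = iterate([1.0, 0.0, 0.0, 0.0, 0.0], P, 500)
x_last = iterate([0.0, 0.0, 0.0, 0.0, 1.0], P, 500)

# Approximation of the stationary distribution, started from the uniform vector.
pi = iterate([0.2] * 5, P, 500)

print("pi              :", [round(p, 6) for p in pi])
print("X = e_1, X P^500:", [round(p, 6) for p in x_first])
print("X = e_5, X P^500:", [round(p, 6) for p in x_last])
```

Plain power iteration is used here instead of solving $\pi P = \pi$ directly because it mirrors the limit in Theorem 4.1; for the actual matrix (4.1) one would substitute its entries for the random ones.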
For the measure corresponding to a solution $(x, y, u, v)$ of system (3.6) we have (Markov measure):

$$\mu(\Omega^0_n) = \sum_{\sigma_n \in \Omega^0_n} \pi_{\sigma(-n)} \prod_{i=-n}^{n-1} P_{\sigma(i), \sigma(i+1)},$$

where $\pi_{\sigma(-n)} \in \{\pi_1, \dots, \pi_5\}$ (a coordinate of the stationary distribution) and $P_{\sigma(i), \sigma(i+1)} \in \{P_{\hat{i}, \hat{j}} : \text{with odd } i, j\}$, see (4.2).

For a given solution $(x, y, u, v)$ the corresponding value $\mu(\Omega^0_n)$ depends only on the parameters $\theta$, $\eta$ and $n$, i.e. $\mu_n(\theta, \eta) := \mu(\Omega^0_n)$. Using the explicit formula for the solution $(x, y, u, v)$ one can explicitly calculate $\mu_n(\theta, \eta)$, but it has a bulky form.

For fixed parameters $J$ and $\alpha$ of the model (2.2), the parameters $\theta$ and $\eta$ are functions of the temperature $T = 1/\beta$ (see (2.8)). Therefore for fixed parameters of the model, the quantity $\mu_n(T) := \mu_n(\theta, \eta)$ is a function of the temperature $T$ and $n$ only. By Theorem 4.1 and the above formulas for the matrices we have the following:

Corollary 4.1. For given parameters $J$ and $\alpha$ of the model (2.2), the RNA of the virus does not destroy the DNA (with respect to the measure $\mu$) if the temperature $T$ satisfies the following condition: for every $\epsilon > 0$ there exists $N \geq 1$ such that for every $n \geq N$ the inequality $\mu_n(T) > 1 - \epsilon$ holds.

Since $\mu_n(T)$ has a bulky form, one can use numerical analysis to check this condition.

Thermodynamics of DNA microarrays.
Stochastic models in biological sciences.
A probabilistic study of DNA denaturation.
Gibbs Measures and Phase Transitions.
Ab initio bubble-driven denaturation of double-stranded DNA: self-mechanical theory.
Use of Ultraviolet Absorbance-Temperature Profile for Determining the Guanine plus Cytosine Content of DNA.
Physics of base-pairing dynamics in DNA.
Thermodynamics of DNA denaturation in a model of bacterial intergenic sequences.
Mathematics of genome analysis.
Gibbs measures on Cayley trees.
Tree-hierarchy of DNA and distribution of Holliday junctions.
Holliday junctions for the Potts model of DNA. Complex Analysis, and Pluripotential Theory.
The Mathematics of DNA Structure, Mechanics, and Dynamics, IMA Volumes in Mathematics and Its Applications.
Nearest-neighbor thermodynamics of DNA sequences with single bulge loop.
Mathematical statistical mechanics.