key: cord-0047327-kqepagyd authors: Kristiansen, Lars title: Reversible Programming Languages Capturing Complexity Classes date: 2020-06-17 journal: Reversible Computation DOI: 10.1007/978-3-030-52482-1_6 sha: d10bd161e943c8635e7f17c4cb1818f872796c4c doc_id: 47327 cord_uid: kqepagyd

We argue that there is a link between implicit computational complexity theory and the theory of reversible computation. We show that the complexity classes ETIME and P can be captured by inherently reversible programming languages.

The title above is inspired by the title of a paper I co-authored with Paul Voda more than 15 years ago: Programming languages capturing complexity classes [10]. In that paper we related the computational power of fragments of programming languages to complexity classes defined by imposing time and space constraints on Turing machines. Around that time, I authored and co-authored a number of related papers, e.g. [8, 9, 11], all of which were clearly inspired by work in implicit computational complexity theory from the 1990s, e.g., Bellantoni and Cook [2], Leivant [12, 13] and, particularly, Jones [5, 6]. Complexity classes like P, FP, NP, LOGSPACE, EXPTIME, and so on, are defined by imposing explicit resource bounds on a particular machine model, namely the Turing machine. E.g., FP is defined as the class of functions computable in polynomial time on a deterministic Turing machine. The definition puts constraints on the resources available to the Turing machines, but no constraints on the algorithms available to them. A Turing machine may compute a function in the class by any imaginable algorithm as long as it works in polynomial time. Implicit computational complexity theory studies classes of functions (problems, languages) that are defined without imposing explicit resource bounds on machine models, but rather by imposing linguistic constraints on the way algorithms can be formulated.
When we explicitly restrict our language for formulating algorithms, that is, our programming language, then we may implicitly restrict the computational resources needed to execute algorithms. If we manage to find a restricted programming language that captures a complexity class, then we will have a so-called implicit characterization. A seminal example is Bellantoni and Cook's [2] characterization of FP. They give a functional programming language (which they call a function algebra). This language consists of a few initial functions and two definition schemes (safe composition and safe primitive recursion) which allow us to define new functions. These schemes put rather severe syntactical restrictions on how we can define functions, but they do not refer to polynomially bounded Turing machines or any other kind of resource-bounded computing machinery. It is not easy to write programs when we have to stick to these schemes (even experienced programmers might find it hard to multiply two numbers), but, be that as it may, this is a programming language that yields an implicit characterization of a complexity class. It turns out that a function can be computed by a program written in Bellantoni and Cook's language if and only if it belongs to the complexity class FP.

There is an obvious link between implicit computational complexity and reversible computing. A programming language based on natural reversible operations will impose restrictions on the way algorithms can be formulated, and thus, also restrictions on the computational resources needed to execute algorithms. Hence, the following question knocks at the door: Will it be possible to find reversible programming languages that capture some of the standard complexity classes? The answer turns out to be YES. We will present a reversible language that captures, or if you like, gives an implicit characterization of, the (maybe not very well-known) complexity class ETIME.
A few small modifications of this language yield a reversible language that captures the very well-known complexity class P. Our languages are based on a couple of naturally reversible operations. To increase, or decrease, a natural number by 1 modulo a base b is such an operation: . . . 0, 1, 2, . . . , b − 2, b − 1, 0, 1, 2, . . .. The successor of b − 1 becomes 0, and then b − 1 becomes the predecessor of 0. Thus, "increase" and "decrease" are the reverse of each other. To move an element from the top of one stack to the top of another stack is another such operation, as we can simply move the element back to the stack it came from.

This paper addresses students and researchers interested in programming languages, reversible computation and computer science in general; such readers will not necessarily be experts in computability or complexity theory. We will give priority to readability over technical accuracy, but still this is a fairly technical paper, and we will assume that the reader is faintly acquainted with Turing machines and basic complexity theory (standard textbooks are Arora and Barak [1], Jones [7] and Sipser [16]). Implicit computational complexity theory is definitely a broader and richer research area than our short discussion above may indicate. More on the subject can be found in Dal Lago [3].

An infinite sequence of natural numbers s1, s2, s3, . . . is a bottomless stack if there exists k such that si = 0 for all i > k. We use [x1, . . . , xn, 0*] to denote the bottomless stack s1, s2, s3, . . . where si = xi when i ≤ n, and si = 0 when i > n. We say that x1 is the top element of [x1, . . . , xn, 0*]. Observe that 0 is the top element of the stack [0*]. Furthermore, observe that [0, 0*] is the same stack as [0*] (since [0, 0*] and [0*] denote the same sequence of natural numbers). We will refer to [0*] as the zero stack.

Fig. 1. The syntax of the language RBS.
The variable X in the loop command is not allowed to occur in the loop's body.

The syntax of the imperative programming language RBS is given in Fig. 1. Any element in the syntactic category Command will be called a program, and we will use the word command and the word program interchangeably throughout the paper. We will now explain the semantics of RBS. An RBS program manipulates bottomless stacks, and each program variable holds such a stack. The input to a program is a single natural number m. When the execution of the program starts, the input m will be stored at the top of the stack held by X1, that is, we have X1 = [m, 0*]. All other variables occurring in the program hold the zero stack when the execution starts. A program is executed in a base b which is determined by the input: we have b = max(m + 1, 2) if X1 = [m, 0*] when the execution starts. The execution base b is kept fixed during the entire execution.

Let X and Y be program variables. We will now explain how the primitive commands work. The command (X to Y) pops off the top element of the stack held by X and pushes it onto the stack held by Y, that is, it transforms X = [x1, x2, . . . , 0*] and Y = [y1, y2, . . . , 0*] into X = [x2, . . . , 0*] and Y = [x1, y1, y2, . . . , 0*]. The command X+ increases the top element of the stack held by X by 1 (mod b), that is, it transforms X = [x1, x2, . . . , 0*] into X = [x1 + 1 (mod b), x2, . . . , 0*]. The command X− decreases the top element of the stack held by X by 1 (mod b), that is, it transforms X = [x1, x2, . . . , 0*] into X = [x1 − 1 (mod b), x2, . . . , 0*]. Observe that both X+; X− and X−; X+ leave every stack unchanged. The semantics of the command C1; C2 is as expected. This is the standard composition of the commands C1 and C2, that is, first C1 is executed, then C2 is executed. The command loop X { C } executes the command C repeatedly k times in a row where k is the top element of the stack held by X. Note that the variable X is not allowed to occur in C and, moreover, the command loop X { C } will not modify the stack held by X.

Let C2 be the program loop X1 { X2+ }; X2+; (X2 to X1), and let the input be 17. Then C2 transforms the initial stacks X1 = [17, 0*] and X2 = [0*] into X1 = [0, 17, 0*] and X2 = [0*] since the execution base is 18: the loop increases the top element of X2 from 0 to 17, the subsequent X2+ turns 17 into 0 (mod 18), and (X2 to X1) moves this 0 onto the stack held by X1.
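The semantics of the primitive commands can be sketched in a few lines of Python. This is an illustrative sketch, not part of RBS itself: stacks are Python lists with the top element last, pop0 supplies the bottomless zeros below the written part of a stack, and the worked example C2 above is replayed on input 17 (execution base 18).

```python
# An illustrative Python sketch of the RBS primitives (not part of RBS
# itself).  Stacks are lists with the top element last; pop0 supplies
# the bottomless zeros below the written part of a stack.

def pop0(stack):
    return stack.pop() if stack else 0

def top(stack):
    return stack[-1] if stack else 0

def move(store, x, y):                       # the command (X to Y)
    store[y].append(pop0(store[x]))

def inc(store, x, b):                        # the command X+
    store[x].append((pop0(store[x]) + 1) % b)

def dec(store, x, b):                        # the command X-
    store[x].append((pop0(store[x]) - 1) % b)

# Replay the example C2 = loop X1 { X2+ }; X2+; (X2 to X1) on input 17.
m = 17
b = max(m + 1, 2)                            # execution base 18
store = {"X1": [m], "X2": []}
for _ in range(top(store["X1"])):            # loop X1 { X2+ }
    inc(store, "X2", b)
inc(store, "X2", b)                          # X2+ : 17 + 1 = 0 (mod 18)
move(store, "X2", "X1")                      # (X2 to X1)
print(store["X1"])                           # -> [17, 0], i.e. the stack [0, 17, 0*]
```

Note that the base is applied on every increment and decrement, so no number strictly greater than b − 1 can ever appear on a stack.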
All numbers stored on stacks during an execution will be strictly less than the execution base, and thus, less than or equal to max(m, 1) where m is the input. Intuitively, it should be clear that RBS programs are reversible in a very strong sense. RBS is an inherently reversible programming language in the terminology of Matos [14]. If we like, we can of course state this insight more formally. The next definition and the following theorem will be a step in that direction.

Definition 2. We define the reverse command of C, written C^R, inductively over the structure of C: (X to Y)^R ≡ (Y to X), (X+)^R ≡ X−, (X−)^R ≡ X+, (C1; C2)^R ≡ C2^R; C1^R, and (loop X { C })^R ≡ loop X { C^R }.

Theorem 3. Let C be a program, and let X1, . . . , Xn be the variables occurring in C. Furthermore, let m be any natural number. We have: if C, executed on input m, transforms the initial stacks X1 = [m, 0*], X2 = . . . = Xn = [0*] into the stacks α1, . . . , αn, then C^R, executed in the same base, transforms the stacks α1, . . . , αn back into the stacks [m, 0*], [0*], . . . , [0*].

It is a nice, and maybe even challenging, exercise to write up a decent proof of Theorem 3, even if it should be pretty clear that the theorem holds. We will offer a proof in the next section. The reader not interested in the details of the proof may skip that section.

We will now define the set of problems that can be decided by an RBS program. To that end, we need to determine how an RBS program should accept, and how an RBS program should reject, its input. Any reasonable convention will do, and we will just pick a simple and convenient one: a program accepts its input if it terminates with 0 at the top of the stack held by X1, and otherwise it rejects its input.

Fig. 2. An RBS program that decides the set of even numbers.

A problem is a set of natural numbers. An RBS program C decides the problem A if C accepts all m that belong to A and rejects all m that do not belong to A. Let S denote the class of problems decidable by an RBS program. Let A be the set of even numbers. Then A is a problem. Figure 2 shows an RBS program that decides A. Now, any RBS program decides a problem, and S is obviously a well-defined class of computable (decidable) problems. We have defined S by a reversible programming language. We have not defined S by imposing resource bounds on Turing machines or any other machine models. What can we say about the computational complexity of the problems we find in S?
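Definition 2 and Theorem 3 can be illustrated by a small Python sketch. The tuple encoding of commands is chosen here purely for illustration: reverse computes C^R, and running C followed by C^R restores the initial store, up to the convention that [0, 0*] and [0*] denote the same bottomless stack.

```python
# An illustrative sketch of Definition 2 and Theorem 3.  Commands are
# nested tuples (an encoding chosen for this sketch); stacks are lists
# with the top element last.

def reverse(c):
    """Definition 2: the reverse command C^R."""
    tag = c[0]
    if tag == "to":
        return ("to", c[2], c[1])                      # (X to Y)^R = (Y to X)
    if tag == "inc":
        return ("dec", c[1])                           # (X+)^R = X-
    if tag == "dec":
        return ("inc", c[1])                           # (X-)^R = X+
    if tag == "seq":
        return ("seq", reverse(c[2]), reverse(c[1]))   # (C1;C2)^R = C2^R; C1^R
    if tag == "loop":
        return ("loop", c[1], reverse(c[2]))           # loop X { C^R }

def run(c, store, b):
    pop0 = lambda s: s.pop() if s else 0
    tag = c[0]
    if tag == "to":
        store[c[2]].append(pop0(store[c[1]]))
    elif tag == "inc":
        store[c[1]].append((pop0(store[c[1]]) + 1) % b)
    elif tag == "dec":
        store[c[1]].append((pop0(store[c[1]]) - 1) % b)
    elif tag == "seq":
        run(c[1], store, b)
        run(c[2], store, b)
    elif tag == "loop":
        n = store[c[1]][-1] if store[c[1]] else 0
        for _ in range(n):
            run(c[2], store, b)

def norm(stack):
    """Strip zeros at the bottom: [0, 0*] and [0*] are the same stack."""
    i = 0
    while i < len(stack) and stack[i] == 0:
        i += 1
    return stack[i:]

# C2 = loop X1 { X2+ }; X2+; (X2 to X1), input 17, base 18.
c2 = ("seq",
      ("seq", ("loop", "X1", ("inc", "X2")), ("inc", "X2")),
      ("to", "X2", "X1"))
store = {"X1": [17], "X2": []}
run(("seq", c2, reverse(c2)), store, 18)       # run C2; C2^R
print(norm(store["X1"]), norm(store["X2"]))    # -> [17] []
```

Running C2 alone leaves X1 = [0, 17, 0*]; composing it with its reverse restores the initial configuration, as Theorem 3 predicts.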
May it be the case that S equals a complexity class? This section is dedicated to a detailed proof of Theorem 3 (readers not interested may jump ahead to Sect. 4).

First, we need some terminology and notation: We will say that a (bottomless) stack is a b-stack if every number stored on the stack is strictly smaller than b. Furthermore, we will use V(C) to denote the set of program variables occurring in the command C, and for any positive integer m and any command C, we define the command C^m by C^1 ≡ C and C^{m+1} ≡ C^m; C. Now, assume that C is an RBS command with V(C) ⊆ {X1, . . . , Xn}. Furthermore, assume that C is executed in base b and that α1, . . . , αn, β1, . . . , βn are b-stacks. With these assumptions in mind, we make the following claim: C transforms the stacks α1, . . . , αn into the stacks β1, . . . , βn if and only if C^R transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn.

Theorem 3 follows straightforwardly from this claim. So all we need to do is to prove the claim. We will of course carry out induction on the structure of the command C, and our proof will split into the three base cases (i) (X to Y), (ii) X+ and (iii) X−, and the two inductive cases (iv) C1; C2 and (v) loop X { C0 } (see Fig. 1).

Case (i): Assume C ≡ (X to Y). By Definition 2, we have C^R ≡ (Y to X), and it follows directly from the semantics of the primitive commands that C transforms the stacks α1, . . . , αn into the stacks β1, . . . , βn if and only if (Y to X) transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn. This concludes the proof of case (i). The proofs of the cases (ii) and (iii) are very similar to the proof of case (i). We leave the details to the reader and proceed with the inductive cases.

Case (iv): Assume C ≡ C1; C2 and that C transforms the stacks α1, . . . , αn into the stacks β1, . . . , βn. Then there exist b-stacks γ1, . . . , γn such that C1 transforms α1, . . . , αn into γ1, . . . , γn and C2 transforms γ1, . . . , γn into β1, . . . , βn. We apply our induction hypothesis both to C1 and to C2 and conclude that C2^R transforms β1, . . . , βn into γ1, . . . , γn and that C1^R transforms γ1, . . . , γn into α1, . . . , αn. It follows that C2^R; C1^R transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn. Finally, as Definition 2 states that (C1; C2)^R ≡ C2^R; C1^R, the claim holds (the converse direction is symmetric). This completes the proof of case (iv).

Case (v): Assume C ≡ loop Xi { C0 }, and let m be the top element of the stack αi. First assume m = 0. Then C transforms the stacks α1, . . . , αn into themselves, as the command C0 will not be executed at all. Thus, we also have that C^R transforms the stacks α1, . . . , αn into themselves, since Definition 2 yields C^R ≡ loop Xi { C0^R } and the body C0^R will not be executed at all. This proves that the claim holds when m = 0. We are left to prove that the claim holds when m > 0. Thus, in the remainder of this proof we assume that m > 0. We prove

( †) C0^m transforms the stacks α1, . . . , αn into the stacks β1, . . . , βn if and only if (C0^R)^m transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn

by a secondary induction on m. Let m = 1. Then we have C0^m ≡ C0, and an application of our main induction hypothesis to C0 yields ( †). Let m > 1. Then we have C0^m ≡ C0^{m−1}; C0, and ( †) holds by our induction hypothesis on m and case (iv) above. This concludes the proof of ( †).
We are now ready to complete our proof of the claim. Assume that loop Xi { C0 } transforms the stacks α1, . . . , αn into the stacks β1, . . . , βn. By the semantics of the loop command, C0^m transforms α1, . . . , αn into β1, . . . , βn, and thus, by ( †), (C0^R)^m transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn. Since Xi ∉ V(C0), we have βi = αi, and thus, the top element of βi is the same as the top element of αi, namely m. It follows that loop Xi { C0^R } transforms the stacks β1, . . . , βn into the stacks α1, . . . , αn. Finally, as Definition 2 states that (loop Xi { C0 })^R ≡ loop Xi { C0^R }, the claim holds. This completes the proof of case (v).

Let us first see how we can simulate a Turing machine in a standard way in a standard high-level language. Thereafter we will discuss how we can simulate a Turing machine in our rudimentary reversible language. In the standard language we will of course be able to simulate any Turing machine, no matter how much time and space resources the machine requires. In the reversible language we will only be able to simulate those Turing machines that run in time O(2^{kn}) (where k is a constant and n is the length of the input).

We assume some familiarity with Turing machines. The reader is expected to know that a Turing machine computes by writing symbols from a finite alphabet a1, . . . , aA on an infinite tape which is divided into cells; know that one of the cells is scanned by the machine's head; know that there is a finite number of states q1, . . . , qQ; and so on. The input w will be available on the tape when a Turing machine M starts, and the actions taken by M will be governed by a finite transition table. Each entry of the table is a 5-tuple (ai, qk, aj, D, qℓ) where ai, aj are alphabet symbols; qk, qℓ are states; and D is either "left" or "right". Such a tuple is called a transition and tells M what to do when it scans the symbol ai in state qk: in that case M should write the symbol aj, move its head one position in the direction given by D, and then proceed in state qℓ. We restrict our attention to deterministic Turing machines, and for each alphabet symbol ai and each non-halting state qk, there will be one, and only one, transition that starts with ai, qk.
So a Turing machine knows exactly what to do until it reaches one of its halting states, and then it simply halts (if it halts in a dedicated state q_accept, it accepts its input; if it halts in a dedicated state q_reject, it rejects its input). In a standard high-level language we can thus simulate M by a single while-loop that executes transitions until a halting state is reached. At least one transition will be executed each time the loop's body is executed, and the running time of M (on input w) will more or less be the number of times the body is executed. (It might happen that more than one transition is executed when the loop's body is executed once, but that will not cause any trouble.)

In order to simulate the actions taken by the transitions, we need a representation of the computing machinery. We need to keep track of the current state, we need to keep track of the symbols on the tape, and we need to identify the scanned cell. The current state can simply be stored in a register STATE, but how should we deal with the tape? The tape is divided into an infinite sequence of cells where one of the cells C_s is scanned by the head. Only finitely many of these cells will contain anything other than the blank symbol. Let us say that C_i contains blank when i > B_0. In order to simulate the machine it will obviously be sufficient to store the symbols in the cells C_1, C_2, . . . , C_B where B = max(B_0, s) + 1. In addition we need to keep track of the scanned cell C_s. A convenient way to deal with the situation will be to use a stack STACK_L, a register SCAN, another stack STACK_R, and store the tape content in the following way: the cells to the left of the head, that is C_1, . . . , C_{s−1}, are stored on STACK_L with the symbol in C_{s−1} at the top; the symbol in the scanned cell C_s is stored in SCAN; and the cells to the right of the head, that is C_{s+1}, . . . , C_B, are stored on STACK_R with the symbol in C_{s+1} at the top.

The input to an RBS program is a natural number, and we will thus discuss to what extent an RBS program can simulate a Turing machine that takes a single natural number as input. We have seen that a program with only one while-loop can simulate a Turing machine (and we will for sure need at least one while-loop in order to simulate an arbitrary Turing machine).
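The two-stack representation of the tape described above can be sketched in Python. Here 0 plays the role of the blank symbol, so both stacks behave like bottomless stacks, and moving the head left and right are mutually inverse operations.

```python
# A sketch of the tape representation: STACK_L holds the cells to the
# left of the head (nearest cell on top), SCAN the scanned symbol, and
# STACK_R the cells to the right (nearest cell on top).  The blank is 0,
# so pop0 makes both stacks behave like bottomless stacks.

def pop0(stack):
    return stack.pop() if stack else 0   # blank beyond the written part

def move_left(left, scan, right):
    right.append(scan)                   # scanned symbol is now right of the head
    return left, pop0(left), right

def move_right(left, scan, right):
    left.append(scan)                    # scanned symbol is now left of the head
    return left, pop0(right), right

# Tape 1 2 [3] 4 with the head on 3 (lists hold the nearest cell last):
left, scan, right = [1, 2], 3, [4]
left, scan, right = move_left(left, scan, right)    # head now on 2
left, scan, right = move_right(left, scan, right)   # and back on 3
print(left, scan, right)                            # -> [1, 2] 3 [4]
```

Because each move undoes the other, the head movements are exactly the kind of naturally reversible operations that RBS is built on.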
Now, while-loops are not available in RBS, and the best we can do in order to simulate a Turing machine is to use a fixed number of nested for-loops of the form loop Y1 { loop Y2 { . . . loop Yk { . . . } . . . } }. Since an RBS program cannot increase the numerical value of its input, the body of each of these loops will be executed at most max(m, 1) times where m is the input to the RBS program (and to the Turing machine the program simulates). Thus it is pretty clear that we cannot simulate a Turing machine if its running time is not bounded by m^k for some constant k. This corresponds to a bound 2^{k|m|} where k is a constant and |m| is the length of the input m, that is, |m| equals the number of symbols needed to represent the natural number m in binary notation. In the following we will see that any Turing machine that uses such an amount of computation time can be simulated by an RBS program.

It turns out that an RBS program can simulate the transitions of a Turing machine M in essentially the same way as the high-level program sketched above, given that the input to M is sufficiently large (on small inputs the simulation might fail). Stacks are directly available in RBS, and thus an RBS program can easily represent the tape and mimic the movements of the head. On the other hand, assignment statements and if-then statements are not directly available. This makes things a bit tricky. Let us first see how RBS programs to a certain extent can simulate programs written in a non-reversible programming language called LOOP−. The syntax of LOOP− is given in Fig. 3. Any element in the syntactic category Command will be called a program. A LOOP− program manipulates natural numbers, and each program variable holds a single natural number. The command X := k assigns the fixed number k to the variable X. The command X := Y assigns the number held by the variable Y to the variable X.

com ∈ Command ::= X := k | X := Y | pred(X) | loop X { com } | com ; com

Fig. 3. The syntax of the language LOOP−.
The variable X in the loop command is not allowed to occur in the loop's body.

The command pred(X) decreases the value held by the variable X by 1 if the value is strictly greater than 0, and leaves the value held by X unchanged if the value is 0. Furthermore, the command C1; C2 is the standard composition of the commands C1 and C2, and the command loop X { C } executes the command C repeatedly k times in a row where k is the number held by X. Note that the variable X is not allowed to occur in C and that the command loop X { C } does not modify the value held by X.

An RBS program can represent a LOOP− variable X holding the natural number k by a variable X (we use the same variable name) holding the stack [k, 0*]. The command X := k can then be simulated by the program

(X to Z); X+; X+; . . . ; X+    (with k occurrences of X+)

where Z is an auxiliary variable (Z works as a trash bin). Now, observe that this will only work if the base of execution is strictly greater than k, but that will be good enough for us. The command X := Y can be simulated by the program

(X to Z); loop Y { X+ }

where Z is an auxiliary variable (Z works as a trash bin). Furthermore, the command pred(X) can be simulated by a program that uses auxiliary variables Y and Z (which represent natural numbers) and the simulations of the assignment statements given above: first set Y := X, Z := 0 and X := 0, and then execute loop Y { X := Z; Z+ }, which leaves X holding the predecessor of its original value (X stays one step behind the counter Z throughout the loop). This shows how RBS programs can simulate all the primitive LOOP− commands. It is easy to see that compositions and loops can be simulated directly (loop X { C } is simulated by loop X { C′ } where C′ simulates C). Hence, any LOOP− program can be simulated by an RBS program given that the input is sufficiently large. On small inputs simulations might fail since the simulation of the assignment X := k only works if the execution base is strictly greater than k.

The LOOP− language turns out to be more expressive than one might expect at a first glance, and all sorts of conditional statements and if-then constructions are available in the language. As an example, let us see how we can implement the construction if X = Y then C1 else C2.
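The shadow-counter simulation of pred(X) described above can be checked with a direct Python transcription, in which plain integers stand in for the simulated registers:

```python
# A transcription of the pred(X) simulation: X is kept one step behind
# a shadow counter Z while a loop runs top-of-Y times.  Plain integers
# stand in for the simulated registers.

def pred(x):
    y = x                  # Y := X
    z = 0                  # Z := 0
    x = 0                  # X := 0
    for _ in range(y):     # loop Y { X := Z ; Z+ }
        x = z
        z = z + 1
    return x               # y - 1 if y > 0, and 0 if y = 0

print(pred(0), pred(1), pred(7))   # -> 0 0 6
```

Since all values involved stay strictly below the execution base, the mod-b increment used for Z+ never wraps around, and the transcription faithfully reflects the RBS behaviour.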
We will need some auxiliary variables X′, Y′, Z, U which do not occur in any of the commands C1 and C2. First we execute the program

X′ := X; Y′ := Y; loop Y { pred(X′) }; loop X { pred(Y′) }

This program sets both X′ and Y′ to 0 if X and Y hold the same number. If X and Y hold different numbers, one of the two variables X′, Y′ will be set to a number strictly greater than 0. Then we execute the program

U := 0; loop X′ { U := 1 }; loop Y′ { U := 1 }; Z := 1; loop U { C2; Z := 0 }; loop Z { C1 }

The composition of these two programs executes the program C1 exactly once (and C2 will not be executed at all) if X and Y hold the same number. If X and Y hold different numbers, C2 will be executed exactly once (and C1 will not be executed at all). The reader should note that this implementation of the if-then-else construction does not contain any assignments of the form X := k where k > 1.

It is proved in Kristiansen [8] that LOOP− captures the complexity class LINSPACE, that is, the set of problems decidable in space O(n) on a deterministic Turing machine (n is the length of the input). Hence, the considerations above indicate that LINSPACE ⊆ S. However, we are on our way to proving a stronger result, namely that LINSPACE ⊆ S = ETIME. The equality LINSPACE = ETIME is one of the many notorious open problems of complexity theory. The general opinion is that the equality does not hold.

We have seen that RBS programs can (nearly) simulate LOOP− programs. LOOP− can assign constants to registers and perform if-then-else constructions. This helps us to see how an RBS program can simulate an arbitrary 2^{k|m|} time Turing machine M. Such a program may be of the form

initiate the tape with the input m; Y1 := the input m; Y2 := the input m; . . . ; Yk := the input m; loop Y1 { loop Y2 { . . . loop Yk { T1; T2; . . . ; Tt } . . . } }

We represent the symbols in M's alphabet a1, . . . , aA by the numbers 1, . . . , A and M's states q1, . . . , qQ by the numbers 1, . . . , Q. We use two stacks to hold the content of the tape, and we use registers STATE and SCAN to hold respectively the current state and the content of the scanned cell.
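The if-then-else construction above can be mirrored in Python. The helper pred and the flag variables follow the text; the exact program is an illustrative reconstruction, but the mechanism (truncated subtraction in both directions, then branch selection with 0/1 flags and counting loops) is the one just described.

```python
# An illustrative mirror of: if X = Y then C1 else C2, built from
# truncated subtraction and counting loops only.

def pred(n):                       # pred(X): truncated predecessor
    return n - 1 if n > 0 else 0

def if_eq_then_else(x, y, c1, c2):
    xp, yp = x, y                  # X' := X ; Y' := Y
    for _ in range(y):             # loop Y { pred(X') }
        xp = pred(xp)
    for _ in range(x):             # loop X { pred(Y') }
        yp = pred(yp)
    # now xp = yp = 0 iff x = y; otherwise exactly one of them is > 0
    u = 0                          # U := 0
    for _ in range(xp):            # loop X' { U := 1 }
        u = 1
    for _ in range(yp):            # loop Y' { U := 1 }
        u = 1
    z = 1                          # Z := 1
    for _ in range(u):             # loop U { C2 ; Z := 0 }
        c2()
        z = 0
    for _ in range(z):             # loop Z { C1 }
        c1()

log = []
if_eq_then_else(3, 3, lambda: log.append("eq"), lambda: log.append("ne"))
if_eq_then_else(3, 5, lambda: log.append("eq"), lambda: log.append("ne"))
print(log)                         # -> ['eq', 'ne']
```

Only the constants 0 and 1 are ever assigned, which is exactly the property exploited later to make the construction work on small inputs.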
Each Ts will take care of a transition (ai, qk, aj, D, qℓ) and be of the form

if SCAN = i and STATE = k then { SCAN := j; . . . push and pop . . . ; STATE := ℓ }

We are left with a minor problem: this will not work for small inputs. It will only work if the base of execution b = max(m + 1, 2) is strictly greater than max(A, Q). Only then will the simulating program be able to perform the necessary assignments of constants to variables. In some sense we cannot deal with this problem. An RBS program will not be able to simulate (in any reasonable sense of the word) an arbitrary 2^{k|m|} time Turing machine M on small inputs, but still there will be an RBS program that decides the same problem as M. We have seen that it suffices to assign the constants 0 and 1 to variables in order to implement the if-then-else construction in LOOP−. This entails that the if-then-else construction will work on small inputs as the base of execution always will be strictly greater than 1. Hence, if the problem A is decided by a 2^{k|m|} time Turing machine M, there will also be an RBS program that decides A: the program gives the correct answer directly on the finitely many problematic small inputs, and simulates M on all other inputs.

Let |m| denote the number of digits required to write the natural number m in binary notation. For any natural number k, let ETIME_k be the class of problems decidable in time O(2^{k|m|}) on a deterministic Turing machine. Let ETIME = ∪_{k∈N} ETIME_k.

Theorem 6. S = ETIME.

Proof. The proof of the inclusion S ⊆ ETIME should be straightforward to anyone experienced with Turing machines. Assume A ∈ S (we will argue that A ∈ ETIME). Then there is an RBS program C that decides A. Let m be the input to C. Each loop in C will be executed at most m + 1 times since the base of execution will be max(m + 1, 2). Thus, there exist constants k0, k1 (not depending on m) such that k0(m + 1)^{k1} bounds the number of primitive commands executed by C on input m. A Turing machine can simulate the execution of C on input m with polynomial overhead.
Thus there exist constants k2, k3 such that k2(m + 1)^{k3} bounds the number of steps a Turing machine needs to decide if m is in A. There exists k such that k2(m + 1)^{k3} < 2^{k|m|}. Hence, A ∈ ETIME. This proves the inclusion S ⊆ ETIME. We turn to the proof of the inclusion ETIME ⊆ S. Assume A ∈ ETIME (we will argue that A ∈ S). Then there is an O(2^{k|m|}) time Turing machine M that decides A. Now, M will run in time 2^{k0|m|} when k0 is sufficiently large. In the previous section we saw that there will be an RBS program that decides the same problem as M. Hence, A ∈ S. This proves the inclusion ETIME ⊆ S.

Would it not be nice if we could find a reversible language that captures a complexity class that is a bit more attractive than ETIME? Now, P is for a number of reasons, which the reader might be aware of, one of the most popular and important complexity classes. Luckily, it turns out that a few modifications of RBS yield a characterization of P. First we modify the way RBS programs receive input. The input will now be a string over some alphabet. Any alphabet that contains at least two symbols will do and, for convenience, we will stick to the alphabet {a, b}. The base of execution will at program start be set to the length of the input. Otherwise, nearly everything is kept as before: Every variable will still hold a bottomless stack storing natural numbers. All commands available in the original version of RBS will be available in the new version. In addition, there are two new commands, of the form case inp[X]=a { C } and case inp[X]=b { C }: the command C is executed if the symbol in the position of the input string given by the top element of the stack held by X is the stated symbol; otherwise C is skipped. A program will still accept its input by terminating with 0 at the top of the stack held by X1; otherwise, the program rejects its input. Moreover, all variables, including X1, the variable that used to import the input, hold the zero stack when the execution of a program starts.

Fig. 4.
The program accepts any string that starts with a nonempty sequence of a's and ends with a single b (the input to a program should contain at least two symbols). The program rejects any string that is not of this form. The program accepts by terminating with X1 = [0*] and rejects by terminating with X1 = [1, 0*].

We still have a reversible language. The two new commands are reversible. The variable X is not allowed to occur in C and will consequently not be modified by C. Thus, for x ∈ {a, b}, we may extend Definition 2 by (case inp[X]=x { C })^R ≡ case inp[X]=x { C^R }, and Theorem 3 will still hold. To avoid confusion we will use RBS′ to denote our new version of RBS. We require that the input to an RBS′ program is of length at least 2 (so we exclude the empty string and the one-symbol strings a and b). This is of course a bit artificial, but it seems to be the most convenient way to deal with a few annoying problems of a technical nature. Accordingly, we also require that every string in a language (see the definition below) is of length at least 2.

A language L is a set of strings over the alphabet {a, b}; moreover, every string in L is of length at least 2. An RBS′ program C decides the language L if C accepts every string that belongs to L and rejects every string that does not belong to L. Let S′ denote the class of languages decidable by an RBS′ program. Let |w| denote the length of the string w. For any natural number k, let P_k be the class of languages decidable in time O(|w|^k) on a deterministic Turing machine. Let P = ∪_{k∈N} P_k. Figure 4 shows an RBS′ program which decides the language given by the regular expression a*ab. The proof of the next theorem, stating that S′ = P, is very similar to the proof of Theorem 6, and the reader should be able to provide the details. Just recall that the execution base of an RBS′ program is set to the length of the input.
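For comparison, the language decided by the program in Fig. 4 can be stated directly in Python; the regular expression a+b below is equivalent to a*ab.

```python
import re

# The language decided by the program in Fig. 4: a nonempty sequence of
# a's followed by a single b.  Strings of length < 2 are excluded by
# the convention on RBS' inputs.

def in_language(w):
    return len(w) >= 2 and re.fullmatch(r"a+b", w) is not None

print([w for w in ("ab", "aaab", "ba", "abb", "aa") if in_language(w)])
# -> ['ab', 'aaab']
```

Note that re.fullmatch is used rather than re.match, so the whole string must match: a trailing symbol after the single b causes rejection.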
Hence, the number of primitive instructions executed by an RBS′ program will be bounded by |w|^k where |w| is the length of the input w and k is a sufficiently large constant; moreover, an RBS′ program can simulate a Turing machine that runs in polynomial time in the same way as an RBS program simulates a Turing machine that runs in time O(2^{k|m|}).

We have argued that there is a link between implicit computational complexity theory and the theory of reversible computation, and we have shown that both ETIME and P can be captured by inherently reversible programming languages. In general, implicit characterizations are meant to shed light on the nature of complexity classes and the many notoriously hard open problems involving such classes. Implicit characterizations by reversible formalisms might yield some new insights in this respect. It is beyond the scope of this paper to discuss or interpret the theorems proved above any further, but one might start to wonder how different aspects of reversibility relate to time complexity, space complexity and nondeterminism.

The author is not aware of any work in reversible computing that is closely related to the work presented above, but some work of Matos [14] is at least faintly related. Matos characterizes the primitive recursive functions by an inherently reversible loop-language. Paolini et al. [15] do also characterize the primitive recursive functions by a reversible formalism. Their work is of a recursion-theoretic nature and has a different flavor than ours, but it is possible that such studies might lead to interesting characterizations of complexity classes.

We finish off this paper by suggesting a small research project. It should be possible to extend RBS to an inherently reversible higher-order language. First-order programs will be like the ones defined and explained above. Second-order programs will manipulate stacks of stacks, third-order programs will manipulate stacks of stacks of stacks, and so on.
This will induce a hierarchy: the class of problems decidable by a first-order RBS program, the class of problems decidable by a second-order RBS program, the class of problems decidable by a third-order RBS program, and so on. By the same token, RBS′ will induce a hierarchy: the class of languages decidable by a first-order RBS′ program, the class of languages decidable by a second-order RBS′ program, and so on. These two hierarchies should be compared to the alternating time-space hierarchies studied in Goerdt [4], Jones [6], Kristiansen and Voda [10] and many other papers.

References

[1] Computational Complexity: A Modern Approach
[2] A new recursion-theoretic characterization of the polytime functions
[3] A short introduction to implicit computational complexity
[4] Characterizing complexity classes by higher type primitive recursive definitions
[5] LOGSPACE and PTIME characterized by programming languages
[6] The expressive power of higher-order types or, life without CONS
[7] Computability and Complexity from a Programming Perspective
[8] Neat function algebraic characterizations of LOGSPACE and LINSPACE
[9] On the computational complexity of imperative programming languages
[10] Programming languages capturing complexity classes
[11] Complexity classes and fragments of C
[12] A foundational delineation of computational feasibility
[13] Stratified functional programs and computational complexity
[14] Linear programs in a simple reversible language
[15] On a class of reversible primitive recursive functions and its Turing-complete extensions
[16] Introduction to the Theory of Computation