key: cord-0054945-xxife43h
authors: Exibard, Léo; Filiot, Emmanuel; Reynier, Pierre-Alain
title: On Computability of Data Word Functions Defined by Transducers
date: 2020-04-17
journal: Foundations of Software Science and Computation Structures
DOI: 10.1007/978-3-030-45231-5_12
sha: 30a5e065d69324ff461cc4e5e1ddd25d1416daf9
doc_id: 54945
cord_uid: xxife43h

In this paper, we investigate the problem of synthesizing computable functions of infinite words over an infinite alphabet (data [Formula: see text] -words). The notion of computability is defined through Turing machines with infinite inputs which can produce the corresponding infinite outputs in the limit. We use non-deterministic transducers equipped with registers, an extension of register automata with outputs, to specify functions. Such transducers may not define functions but more generally relations of data [Formula: see text] -words, and we show that it is PSpace-complete to test whether a given transducer defines a function. Then, given a function defined by some register transducer, we show that it is decidable (and again, PSpace-c) whether such function is computable. As for the known finite alphabet case, we show that computability and continuity coincide for functions defined by register transducers, and show how to decide continuity. We also define a subclass for which those problems are PTime.

Context Program synthesis aims at deriving, in an automatic way, a program that fulfils a given specification. Such setting is very appealing when for instance the specification describes, in some abstract formalism (an automaton or ideally a logic), important properties that the program must satisfy. The synthesised program is then correct-by-construction with regards to those properties. It is particularly important and desirable for the design of safety-critical systems with hard dependability constraints, which are notoriously hard to design correctly.

Program synthesis is hard to realise for general-purpose programming languages but important progress has been made recently in the automatic synthesis of reactive systems. In this context, the system continuously receives input signals to which it must react by producing output signals. Such systems are not assumed to terminate and their executions are usually modelled as infinite words over the alphabets of input and output signals. A specification is thus a set of pairs (in,out), where in and out are infinite words, such that out is a legitimate output for in. Most methods for reactive system synthesis only work for synchronous systems over finite sets of input and output signals Σ and Γ . In this synchronous setting, input and output signals alternate, and thus implementations of such a specification are defined by means of synchronous transducers, which are Büchi automata with transitions of the form (q, σ, γ, q ), expressing that in state q, when getting input σ ∈ Σ, output γ ∈ Γ is produced and the machine moves to state q . We aim at building deterministic implementations, in the sense that the output γ and state q uniquely depend on q and σ. The realisability problem of specifications given as synchronous non-deterministic transducers, by implementations defined by synchronous deterministic transducers is known to be decidable [14, 20] . In this paper, we are interested in the asynchronous setting, in which transducers can produce none or several outputs at once every time some input is read, i.e., transitions are of the form (q, σ, w, q ) where w ∈ Γ * . However, such generalisation makes the realisability problem undecidable [2, 9] .

In the setting we just described, the set of signals is considered to be finite. This assumption is not realistic in general, as signals may come with unbounded information (e.g. process ids) that we call here data. To address this limitation, recent works have considered the synthesis of reactive systems processing data words [17, 6, 16, 7] . Data words are infinite words over an alphabet Σ × D, where Σ is a finite set and D is a possibly infinite countable set. To handle data words, just as automata have been extended to register automata, transducers have been extended to register transducers. Such transducers are equipped with a finite set of registers in which they can store data and with which they can compare data for equality or inequality. While the realisability problem of specifications given as synchronous non-deterministic register transducers (NRT syn ) by implementation defined by synchronous deterministic register transducers (DRT syn ) is undecidable, decidability is recovered for specifications defined by universal register transducers and by giving as input the number of registers the implementation must have [7, 17] .

Computable Implementations In the previously mentioned works, both for finite or infinite alphabets, implementations are considered to be deterministic transducers. Such an implementation is guaranteed to use only a constant amount of memory (assuming data have size O(1)). While it makes sense with regards to memoryefficiency, some problems turn out to be undecidable, as already mentioned: realisability of NRT syn specifications by DRT syn , or, in the finite alphabet setting, when both the specification and implementation are asynchronous. In this paper, we propose to study computable implementations, in the sense of (partial) functions f of data ω-words computable by some Turing machine M that has an infinite input x ∈ dom(f ), and produces longer and longer prefixes of the output f (x) as it reads longer and longer prefixes of the input x. Therefore, such a machine produces the output f (x) in the limit. We denote by TM the class of Turing machines computing functions in this sense. As an example, consider the function f that takes as input any data ω-word u = (σ 1 , d 1 )(σ 2 , d 2 ) . . . and outputs (σ 1 , d 1 ) ω if d 1 occurs at least twice in u, and otherwise outputs u. This function is not computable, as an hypothetic machine could not output anything as long as d 1 is not met a second time. However, the following function g is computable. It is defined only on words (σ 1 , d 1 )(σ 2 , d 2 ) . . . such that σ 1 σ 2 · · · ∈ ((a + b)c * ) ω , and transforms any (σ i , d i ) by (σ i , d 1 ) if the next symbol in {a, b} is an a, otherwise it keeps (σ i , d i ) unchanged. To compute it, a TM would need to store d 1 , and then wait until the next symbol in {a, b} is met before outputting something. Since the finite input labels are necessarily in ((a + b)c * ) ω , this machine will produce the whole output in the limit. Note that g cannot be defined by any deterministic register transducer, as it needs unbounded memory to be implemented.

However, already in the finite alphabet setting, the problem of deciding if a specification given as some non-deterministic synchronous transducer is realisable by some computable function is open. The particular case of realisability by computable functions of universal domain (the set of all ω-words) is known to be decidable [12] . In the asynchronous setting, the undecidability proof of [2] can be easily adapted to show the undecidability of realisability of specifications given by non-deterministic (asynchronous) transducers by computable functions.

Functional Specifications As said before, a specification is in general a relation from inputs to outputs. If this relation is a function, we call it functional. Due to the negative results just mentioned about the synthesis of computable functions from non-functional specifications, we instead here focus on the case of functional specifications and address the following general question: given the specification of a function of data ω-words, is this function "implementable", where we define "implementable" as "being computable by some Turing machine". Moreover, if it is implementable, then we want a procedure to automatically generate an algorithm that computes it. This raises another important question: how to decide whether a specification is functional ? We investigate these questions for asynchronous register transducers, here called register transducers. This asynchrony allows for much more expressive power, but is a source of technical challenge.

Contributions In this paper, we solve the questions mentioned before for the class of (asynchronous) non-deterministic register transducers (NRT). We also give fundamental results on this class. In particular, we prove that:

1. deciding whether an NRT defines a function is PSpace-complete, 2. deciding whether two functions defined by NRT are equal on the intersection of their domains is PSpace-complete, 3. the class of functions defined by NRT is effectively closed under composition, 4. computability and continuity are equivalent notions for functions defined by NRT, where continuity is defined using the classical Cantor distance, 5. deciding whether a function given as an NRT is computable is PSpace-c, 6. those problems are in PTime for a subclass of NRT, called test-free NRT.

Finally, we also mention that considering the class of deterministic register transducers (DRT for short) instead of computable functions as a yardstick for the notion of being "implementable" for a function would yield undecidability. Indeed, given a function defined by some NRT, it is in general undecidable to check whether this function is realisable by some DRT, by a simple reduction from the universality problem of non-deterministic register automata [19] .

The notion of continuity with regards to Cantor distance is not new, and for rational functions over finite alphabets, it was already known to be decidable [21] . Its connection with computability for functions of ω-words over a finite alphabet has recently been investigated in [3] for one-way and two-way transducers. Our results lift some of theirs to the setting of data words. The model of test-free NRT can be seen as a one-way non-deterministic version of a model of two-way transducers considered in [5] .

For a (possibly infinite) set S, we denote by S * (resp. S ω ) the set of finite (resp. infinite) words over this alphabet, and we let S ∞ = S * ∪ S ω . For a word u = u 1 . . . u n , we denote u = n its length, and, by convention, for u ∈ S ω , u = ∞. The empty word is denoted ε.

In this case, we define u −1 v = w. For u, v ∈ S ∞ , we say that u and v mismatch, written mismatch(u, v), when there exists a position i such that 1 ≤ i ≤ u ,

Finally, for u, v ∈ S ∞ , we denote by u ∧ v their longest common prefix, i.e. the longest word w ∈ S ∞ such that w u and w v.

Data Words In this paper, Σ and Γ are two finite alphabets and D is a countably infinite set of data. We use letter σ (resp. γ, d) to denote elements of Σ (resp. Γ , D). We also distinguish an arbitrary data value d 0 ∈ D. Given a set R, let τ R 0 be the constant function defined by τ R 0 (r) = d 0 for all r ∈ R. Given a finite alphabet A, a labelled data is a pair x = (a, d) ∈ A × D, where a is the label and d the data. We define the projections lab(x) = a and dt(x) = d. A data word over A and D is an infinite sequence of labelled data, i.e. a word w ∈ (A × D) ω . We extend the projections lab and dt to data words naturally, i.e. lab(w) ∈ A ω and dt(w) ∈ D ω . A data word language is a subset L ⊆ (A × D) ω . Note that here, data words are infinite, otherwise they are called finite data words.

Register transducers are transducers recognising data word relations. They are an extension of finite transducers to data word relations, in the same way register automata [15] are an extension of finite automata to data word languages. Here, we define them over infinite data words with a Büchi acceptance condition, and allow multiple registers to contain the same data, with a syntax close to [18] . The current data can be compared for equality with the register contents via tests, which are symbolic and defined via Boolean formulas of the following form. Given R a set of registers, a test is a formula φ satisfying the following syntax:

where r ∈ R. Given a valuation τ : R → D, a test φ and a data d, we denote by τ, d |= φ the satisfiability of φ by d in valuation τ , defined as τ, d |= r = if τ (r) = d and τ, d |= r = if τ (r) = d. The Boolean combinators behave as usual.

We denote by Tst R the set of (symbolic) tests over R.

The semantics of a register transducer is given by a labelled transition system:

* is the set of labels, and we have, for all (q, τ ), (q , τ ) ∈ C and for all (l, w) ∈ Λ, that (q, τ ) -(Matching labels) σ = σ -(Compatibility) d satisfies the test φ ∈ Tst R , i.e. τ, d |= φ.

-(Update) τ is the successor register configuration of τ with regards to d and asgn: τ (r) = d if r ∈ asgn, and τ (r) = τ (r) otherwise -(Output) By writing o = (γ 1 , r 1 ) . . . (γ m , r m ), we have that m = n and for all 1 ≤ i ≤ n, γ i = γ i and d i = τ (r i ).

Then, a run of T is an infinite sequence of configurations and transitions

. . . We also define its sequence of states st(ρ) = q 0 q 1 . . . , and its trace tr

Finally, it is accepting if it is both initial and final. We

to express that there is a final run ρ of T starting from (q 0 , τ 0 ) such that in(ρ) = u and out(ρ) = v. In the whole paper, and unless stated otherwise, we always assume that the output of an accepting run is infinite (v ∈ (Γ × D) ω ), which can be ensured by a Büchi condition.

A partial run is a finite prefix of a run. The notions of input, output and states are extended by taking the corresponding prefixes. We then write (q 0 , τ 0 ) u|v − − → T (q n , τ n ) to express that there is a partial run ρ of T starting from configuration (q 0 , τ 0 ) and ending in configuration (q n , τ n ) such that in(ρ) = u and out(ρ) = v. Finally, the relation represented by a transducer T is:

As an example, consider the register transducer T rename depicted in Figure 1 . It realises the following transformation: consider a setting in which we deal with logs of communications between a set of clients. Such a log is an infinite sequence of pairs consisting of a tag, chosen in some finite alphabet Σ, and the identifier of the client delivering this tag, chosen in some infinite set of data values. The transformation should modify the log as follows: for a given client that needs to be modified, each of its messages should now be associated with some new identifier. The transformation has to verify that this new identifier is indeed free, i.e. never used in the log. Before treating the log, the transformation receives as input the id of the client that needs to be modified (associated with the tag del), and then a sequence of identifiers (associated with the tag ch), ending with #. The transducer is non-deterministic as it has to guess which of these identifiers it can choose to replace the one of the client. In particular, observe that it may associate multiple output words to a same input if two such free identifiers exist.

Trename. It has three registers r1, r2 and r0 and four states. σ denotes any letter in Σ, r1 stores the id of del and r2 the chosen id of ch, while r0 is used to output the last data value read as input. As we only assign data to single registers, we write ri for the singleton assignment set {ri}.

Finite Transducers Since we reduce the decision of continuity and functionality of NRT to the one of finite transducers, let us introduce them: a finite transducer (NFT for short) is an NRT with 0 registers (i.e. R = ∅). Thus, its transition relation can be represented as 

Let T be an NRT with k registers, and let X ⊂ f D be a finite subset of data. Then, T ∩ (Σ × X) ω × (Γ × X) ω is recognised by an NFT of exponential size, more precisely with O(|Q| × |X| |R| ) states.

Although automata are simpler machines than transducers, we only use them as tools in our proofs, which is why we define them from transducers, and not the other way around. A non-deterministic register automaton, denoted NRA, is a transducer without outputs: its transition relation is

The semantics are the same, except that now we lift the condition that the output v is infinite since there is no output. For A an NRA, we denote L(A) = {u ∈ (Σ × D) ω | there exists an accepting run ρ of A over u}. Necessarily the output of an accepting run is ε. In this section, we establish technical properties about NRA. Proposition 4, the so-called "indistinguishability property", was shown in the seminal paper by Kaminski and Francez [15, Proposition 1] . Their model differs in that they do not allow distinct registers to contain the same data, and in the corresponding test syntax, but their result easily carries to our setting. It states that if an NRA accepts a data word, then such data word can be relabelled with data from any set containing d 0 and with at least k + 1 elements. Indeed, at any point of time, the automaton can only store at most k data in its registers, so its notion of "freshness" is a local one, and forgotten data can thus be reused as fresh ones. Moreover, as the automaton only tests data for equality, their actual value does not matter, except for d 0 which is initially contained in the registers.

Such "small-witness" property is fundamental to NRA, and will be paramount in establishing decidability of functionality (Section 3) and computability (Section 4). We use it jointly with Lemma 5, which states that the interleaving of the traces of runs of an NRT can be recognised with an NRA, and Lemma 6, which expresses that an NRA can check whether interleaved words coincide on some bounded prefix, and/or mismatch before some given position.

The runs of a register transducer T can be flattened to their traces, so as to be recognised by an NRA. Those traces can then be interleaved, in order to be compared. The proofs of the following properties are straightforward.

two runs of a transducer T . Then, we define their interleaving ρ 1 ⊗ρ 2 = u 1 ·u 1 ·v 1 · v 1 · u 2 · u 2 · v 2 · v 2 . . . and L ⊗ (T ) = {ρ 1 ⊗ ρ 2 | ρ 1 and ρ 2 are accepting runs of T }.

Lemma 5. If T has k registers, then L ⊗ (T ) is recognised by an NRA with 2k registers.

Then, M i j is recognisable by an NRA with 2 registers and with 1 register if i = ∞.

In general, since they are non-deterministic, NRT may not define functions but relations, as illustrated by Example 2. In this section, we first show that deciding whether a given NRT defines a function is PSpace-complete, in which case we call it functional. We show, as a consequence, that testing whether two functional NRT define two functions which coincide on their common domain is PSpace-complete. Finally, we show that functions defined by NRT are closed under composition. This is an appealing property in transducer theory, as it allows to define complex functions by composing simple ones.

Example 7. As explained before, the transducer T rename described in Example 2 is not functional. To gain functionality, one can reinforce the specification by considering that one gets at the beginning a list of k possible identifiers, and that one has to select the first one which is free, for some fixed k. This transformation is realised by the register transducer T rename2 depicted in Figure 2 (for k = 2).

, with four registers r1, r2, r3 and r0 (the latter being used, as in Figure 1 , to output the last read data). After reading the # symbol, it guesses whether the value of register r2 appears in the suffix of the input word. If not, it goes to state 5, and replaces occurrences of r1 by r2. Otherwise, it moves to state 6, waiting for an occurrence of r2, and replaces occurrences of r1 by r3.

Let us start with the functionality problem in the data-free case. It is already known that checking whether an NFT over ω-words is functional is decidable [13, 11] . By relying on the pattern logic of [10] designed for transducers of finite words, it can be shown that it is decidable in NLogSpace.

Deciding whether an NFT is functional is in NLogSpace.

The following theorem shows that a relation between data-words defined by an NRT with k registers is a function iff its restriction to a set of data with at most 2k + 3 data is a function. As a consequence, functionality is decidable as it reduces to the functionality problem of transducers over a finite alphabet.

Theorem 9. Let T be an NRT with k registers. Then, for all X ⊆ D of size |X| ≥ 2k + 3 such that d 0 ∈ X, we have that T is functional if and only if

Proof. The left-to-right direction is trivial. Now, assume T is not functional. Let x ∈ (Σ × D) ω be such that there exists y, z ∈ (Γ × D) ω such that y = z and (x, y), (x, z) ∈ T . Let i = y ∧ z . Then, consider the language L = {ρ 1 ⊗ ρ 2 | ρ 1 and ρ 2 are accepting runs of T, in(ρ 1 ) = in(ρ 2 ) and out(ρ 1 )∧out(ρ 2 ) ≤ i}. Since, by Lemma 5, L ⊗ (T ) is recognised by an NRA with 2k registers and, by Lemma 6, M i ∞ is recognised by an NRA with 2 registers, we get that L = L ⊗ (T ) ∩ M i ∞ is recognised by an NRA with 2k + 2 registers. Now, L = ∅, since, by letting ρ 1 and ρ 2 be the runs of T both with input x and with respective outputs y and z, we have that w = ρ 1 ⊗ ρ 2 ∈ L. Let X ⊆ D such that |X| ≥ 2k + 3 and d 0 ∈ X. By Proposition 4, we get that L ∩ (Σ × X) ω = ∅. By letting w = ρ 1 ⊗ ρ 2 ∈ L ∩ (Σ × X) ω , and x = in(ρ 1 ) = in(ρ 2 ), y = out(ρ 1 ) and z = out(ρ 2 ), we have that (x , y ), (x , z ) ∈ T ∩ ((Σ × X) ω × (Γ × X) ω ) and y ∧ z ≤ i, so, in particular, y = z (since both are infinite words). Thus,

As a consequence of Proposition 8 and Theorem 9, we obtain the following result. The lower bound is obtained by encoding non-emptiness of register automata, which is PSpace-complete [4] .

Deciding whether an NRT T is functional is PSpace-complete.

Hence, the following problem on the equivalence of NRT is decidable:

Theorem 11. The problem of deciding, given two functions f, g defined by NRT, whether for all

} is a function. The latter can be decided by testing whether the disjoint union of the transducers defining f and g defines a function, which is in PSpace by Corollary 10. To show the hardness, we similarly reduce the emptiness problem of NRA A over finite words, just as in the proof of Corollary 10. In particular, the functions f 1 and f 2 defined in this proof (which have the same domain) are equal iff L(A) = ∅.

Note that under the promise that f and g have the same domain, the latter theorem implies that it is decidable to check whether the two functions are equal. However, checking dom(f ) = dom(g) is undecidable, as the languageequivalence problem for non-deterministic register automata is undecidable, since, in particular, universality is undecidable [19] .

Closure under composition is a desirable property for transducers, which holds in the data-free setting [1] . We show that it also holds for functional NRT.

Theorem 12. Let f, g be two functions defined by NRT. Then, their composition f • g is (effectively) definable by some NRT.

Proof (Sketch). By f • g we mean f • g : x → f (g(x) ). Assume f and g are defined by T f = (Q f , R f , q 0 , F f , ∆ f ) and T g = (Q g , R g , p 0 , F g , ∆ g ) respectively. Wlog we assume that the input and output finite alphabets of T f and T g are all equal to Σ, and that R f and R g are disjoint. We construct T such that T = f • g. The proof is similar to the data-free case where the composition is shown via a product construction which simulates both transducers in parallel, executing the second on the output of the first. Assume T g has some transition

Then T has to be able to execute transitions of T f while processing o, even though o does not contain any concrete data values (it is here the main important difference with the data-free setting). However, if T knows the equality types between R f and R g , then it is able to trigger the transitions of T f . For example, assume that o = (a, r g ) and assume that the content of r g is equal to the content of r f , r f being a register of T f , then if T f has some transition of the form p , q ) where the operation r f := r g is a syntactic sugar on top of NRT that intuitively means "put the content of r g into r f ".

Remark 13. The proof of Theorem 12 does not use the hypothesis that f and g are functions, and actually shows a stronger result, namely that relations defined by NRT are closed under composition.

We equip the set of (finite or infinite) data words with the usual distance: for u, v ∈ (Σ × D) ω , d(u, v) = 0 if u = v and d(u, v) = 2 − u∧v otherwise. A sequence of (finite or infinite) data words (x n ) n∈N converges to some infinite data word x if for all > 0, there exists N ≥ 0 such that for all n ≥ N , d(x n , x) ≤ .

In order to reason with computability, we assume in the sequel that the infinite set of data values D we are dealing with has an effective representation. For instance, this is the case when D = N.

We now define how a Turing machine can compute a function of data words. We consider deterministic Turing machines, which three tapes: a read-only oneway input tape (containing the infinite input data word), a two-way working tape, and a write-only one-way output tape (on which it writes the infinite output data word). Consider some input data word x ∈ (Σ × D) ω . For any integer k ∈ N, we let M (x, k) denote the output written by M on its output tape after having read the k first cells of the input tape. Observe that as the output tape is write-only, the sequence of data words (M (x, k)) k≥0 is non-decreasing.

x ∈ dom(f ), the sequence (M (x, k)) k≥0 converges to f (x).

(a) for all sequences of data words (x n ) n∈N converging towards x, where for all i ∈ N, x i ∈ dom(f ), we have that (f (x n )) n∈N converges to f (x).

Then, f is continuous if and only if it is continuous at each x ∈ dom(f ). Finally, a functional NRT T is continuous when T is continuous. Example 16. We give an example of a non-continuous function f . The finite input and output alphabets are unary, and are therefore ignored in the description of f . Such function associates with every sequence s = d 1 d 2 · · · ∈ D ω the word f (s) = d ω 1 if d 1 occurs infinitely many times in s, otherwise f (s) = s itself. The function f is not continuous. Indeed, by taking d = d , the sequence of data words d

Moreover, f is realisable by some NRT which non-deterministically guesses whether d 1 repeats infinitely many times or not. It needs only one register r in which to store d 1 . In the first case, it checks whether the current data d is equal the content r infinitely often, and in the second case, it checks that this test succeeds finitely many times, using Büchi conditions.

One can show that the register transducer T rename2 considered in Example 7 also realises a function which is not continuous, as the value stored in register r 2 may appear arbitrarily far in the input word. One could modify the specification to obtain a continuous function as follows. Instead of considering an infinite log, one considers now an infinite sequence of finite logs, separated by $ symbols. The register transducer T rename3 , depicted in Figure 3 , defines such a function. del, | r1, We now prove the equivalence between continuity and computability for functions defined by NRT. One direction, namely the fact that computability implies continuity, is easy, almost by definition. For the other direction, we rely on the following lemma which states that it is decidable whether a word v can be safely output, only knowing a prefix u of the input. In particular, given a function f , we letf be the function defined over all finite prefixes u of words in dom(f ) byf (u) = (f (uy) | uy ∈ dom(f )), the longest common prefix of all outputs of continuations of u by f . Then, we have the following decidability result:

Lemma 17. The following problem is decidable. Given an NRT T defining a function f , two finite data words u ∈ (Σ × D) * and v ∈ (Γ × D) * , decide whether v f (u).

Theorem 18. Let f be a function defined by some NRT T . Then f is continuous iff f is computable.

Proof. ⇐ Assuming f = T is computable by some Turing machine M , we show that f is continuous. Indeed, consider some x ∈ dom(f ), and some i ≥ 0. As the sequence of finite words (M (x, k)) k∈N converges to f (x) and these words have non-decreasing lengths, there exists j ≥ 0 such that |M (x, j)| ≥ i. Hence, for any data word y ∈ dom(f ) such that |x ∧ y| ≥ j, the behaviour of M on y is the same during the first j steps, as M is deterministic, and thus |f (

showing that f is continuous at x.

⇒ Assume that f is continuous. We describe a Turing machine computing f ; the corresponding algorithm is formalised as Algorithm 1. When reading a finite prefix Now that we have shown that computability is equivalent with continuity for functions defined by NRT, we exhibit a pattern which allows to decide continuity. Such pattern generalises the one of [3] to the setting of data words, the difficulty lying in showing that our pattern can be restricted to a finite number of data.

Theorem 19. Let T be a functional NRT with k registers. Then, for all X ⊆ D such that |X| ≥ 2k + 3 and d 0 ∈ X, T is not continuous at some x ∈ (Σ × D) ω if and only if T is not continuous at some z ∈ (Σ × X) ω .

Proof. The right-to-left direction is trivial. Now, let T be a functional NRT with k registers which is not continuous at some

Now, let X ⊆ D be such that |X| ≥ 2k + 3 and d 0 ∈ X. We need to build two words u and v labelled over X which coincide on a sufficiently long prefix to allow for pumping, hence yielding a converging sequence of input data words whose images do not converge, witnessing non-continuity. To that end, we use a similar proof technique as for Theorem 9: we show that the language of interleaved runs whose inputs coincide on a sufficiently long prefix while their respective outputs mismatch before a given position is recognisable by an NRA, allowing us to use the indistinguishability property. We also ask that one run presents sufficiently many occurrences of a final state q f , so that we can ensure that there exists a pair of configurations containing q f which repeats in both runs.

On reading such u and v, the automaton behaves as a finite automaton, since the number of data is finite ([15, Proposition 1]). By analysing the respective runs, we can, using pumping arguments, bound the position on which the mismatch appears in u, then show the existence of a synchronised loop over u and v after such position, allowing us to build the sought witness for non-continuity.

Relabel over X Thus, assume T is not continuous at some point x ∈ (Σ × D) ω . Let ρ be an accepting run of T over x, and let q f ∈ inf(st(ρ)) ∩ F be an accepting state repeating infinitely often in ρ. Then, let i ≥ 0 be such that for all j ≥ 0, there exists y ∈ dom(f ) such that x ∧ y ≥ j but f (x) ∧ f (y) ≤ i. Now, define K = |Q| × (2k + 3) 2k and let m = (2i + 3) × (K + 1). Finally, pick j such that ρ[1:j] contains at least m occurrences of q f . Consider the language: Choose y ∈ dom(f ) such that x ∧ y ≥ j but f (x) ∧ f (y) ≤ i. By letting ρ 1 (resp. ρ 2 ) be an accepting run of T over x (resp. y) we have ρ 1 ⊗ ρ 2 ∈ L, so L = ∅. By Proposition 4, Figure 4 , where we decompose u as u = u 1 . . . u m ·s and v as v = u 1 . . . u m ·t; their corresponding images being respectively u = u 1 . . . u m · s and u = u 1 . . . u m t . We also let l = (i + 1)(K + 1) and l = 2(i + 1)(K + 1). Since the data of u, v and w belong to X, we know that Repeating configurations First, let us observe that in a partial run of ρ 1 containing more than |Q| × |X| k occurrences of q f , there is at least one productive transition, i.e. a transition whose output is o = ε. Otherwise, by the pigeonhole principle, there exists a configuration µ : R → X such that (q f , µ) occurs at least twice in the partial run. Since all transitions are improductive, it would mean that, by writing w the corresponding part of input, we have (q f , µ)

. This partial run is part of ρ 1 , so, in particular, (q f , µ) is accessible, hence by taking

we have that f (w 0 w ω ) = w 0 , which is a finite word, contradicting our assumption that all accepting runs produce an infinite output. This implies that, for any n ≥ |Q| × |X| k (in particular for n = l), u 1 . . . u n ≥ i + 1.

Locate the mismatch Again, upon reading u l+1 . . . u l , there are (i + 1)(K + 1) occurrences of q f . There are two cases:

(a) There are at least i + 1 productive transitions in ρ 2 . Then, we obtain that u 1 . . . u l > i, so mismatch(u 1 . . . u l , u 1 . . . u l ), since we know f (u) ∧ f (v) ≤ i and they are respectively prefixes of f (u) and f (v), both of length at least i+1. Afterwards, upon reading u l +1 . . . u m , there are K+1 > |Q|×|X| 2k occurrences of q f , so, by the pigeonhole principle, there is a repeating pair: there exist indices p and p such that l ≤ p < p ≤ m and (q f , µ p ) = (q f , µ p ), (q p , τ p ) = (q p , τ p ). Thus, let z P = u 1 . . . u p , z R = u p+1 . . . u p and z C = u p +1 . . . u m · t (P stands for prefix, R for repeat and C for continuation; we use capital letters to avoid confusion with indices). By denoting z P = u 1 . . . u p , z R = u p+1 . . . u p , z P = u 1 . . . u p , z R = u p+1 . . . u p and z C = u p +1 . . . u m · t the corresponding images, z = z P · z R ω is a point of discontinuity. Indeed, define (z n ) n∈N as, for all n ∈ N, z n = z P · z n R · z C . Then, (z n ) n∈N converges towards z, but, since for all n ∈ N, f (z n ) = z P · z L n · z C , we have that f (z n ) − − → n∞ f (z) = z P · z L ω , since mismatch(z P , z P ).

(b) Otherwise, by the same reasoning as above, it means there exists a repeating pair with only improductive transitions in between: there exist indices p and p such that l ≤ p < p ≤ l , (q f , µ p ) = (q f , µ p ), (q p , τ p ) = (q p , τ p ),

Then, by taking z P = u 1 . . . u p , z R = u p+1 . . . u p and z C = u p +1 . . . u m · t, we have, by letting z P = u 1 . . . u p , z R = u p+1 . . . u p , z P = u 1 . . . u p , z R = ε and z C = u n +1 . . . u m · t , that z = z P · z R ω is a point of discontinuity. Indeed, define (z n ) n∈N as, for all n ∈ N, z n = z P · z n R · z C . Then, (z n ) n∈N indeed converges towards z, but, since for all n ∈ N, f (z n ) = z P · z C , we have that f (z n ) − − → n∞ f (z) = z P · z R ω , since mismatch(z P , z P · z C ) (the mismatch necessarily lies in z P , since z P ≥ i + 1).

PSpace-complete.

Proof. Let X ⊆ D be a set of size 2k + 3 containing d 0 . By Theorem 19, T is not continuous iff it is not continuous at some z ∈ (Σ × X) ω , iff T ∩ (Σ × X) ω × (Γ × X) ω is not continuous. By Proposition 3, such relation is recognisable by a finite transducer T X with O(|Q| × |X| |R| ) states, which can be built on-the-fly. By [3] , the continuity of functions defined by NFT is decidable in NLogSpace, which yields a PSpace procedure.

For the hardness, we reduce again from the emptiness problem of register automata, which is PSpace-complete [4] . Let A be a register automaton over some alphabet Σ × D. We construct a transducer T which defines a continuous function iff L(A) = ∅ iff the domain of T is empty. Let f be a non-continous function realised by some NRT H (it exists by Example 16). Then, let # ∈ Σ be a fresh symbol, and define the function g as the function mapping any data word of the form w(#, d)w to w(#, d)f (w ) if w ∈ L(A). The function g is realised by an NRT which simulates A and copies its inputs on the output to implement the identity, until it sees #. If it was in some accepting state of A before seeing #, it branches to some initial state of H and proceeds executing H. If there is some w 0 ∈ L(A), then the subfunction g w0 mapping words of the form w 0 (#, d)w to w 0 (#, d)f (w ) is not continuous, since f is not. Hence g is not continuous. Conversely, if L(A) = ∅, then dom(g) = ∅, so g is continuous.

In [3] , non-continuity is characterised by a specific pattern (Lemma 21, Figure 1 ), i.e. the existence of some particular sequence of transitions. By applying this characterisation to the finite transducer recognising T ∩ ((Σ × X) ω × (Γ × X) ω ), as constructed in Proposition 3, we can characterise non-continuity by a similar pattern, which will prove useful to decide (non-)continuity of test-free NRT in NLogSpace (cf Section 5):

Corollary 21 ( [3] ). Let T be an NRT with k registers. Then, for all X ⊆ D such that |X| ≥ 2k + 3 and d 0 ∈ X, T is not continuous at some x ∈ (Σ × D) ω if and only if it has the pattern of Figure 5 . , where q f is accepting, as well as finite input data words u, v, finite output data words u , v , u , v , and an infinite input data word w admitting an accepting run from configuration (q, τ ) producing output w , such that mismatch(u , u ) ∨ (v = ε ∧ mismatch(u , u w )).

In [7] , we introduced a restriction which allows to recover decidability of the bounded synthesis problem for specifications expressed as non-deterministic register automata. Applied to transducers, such restriction also yields polynomial complexities when considering the functionality and computability problems. An NRT T is test-free when its transition function does not depend on the tests conducted over the input data. Formally, we say that T is test-free if for all

Thus, we can omit the tests altogether and its transition relation can be represented as

Example 22. Consider the function f : (Σ × D) ω → (Γ × D) ω associating, to x = (σ 1 , d 1 )(σ 2 , d 2 ) . . . , the value (σ 1 , d 1 )(σ 2 , d 1 )(σ 3 , d 1 ) . . . if there are infinitely many a in x, and (σ 1 , d 2 )(σ 2 , d 2 )(σ 3 , d 2 ) . . . otherwise.

f can be implemented using a test-free NRT with one register: it initially guesses whether there are infinitely many a in x, if it is the case, it stores d 1 in the single register r, otherwise it waits for the next input to get d 2 and stores it in r. Then, it outputs the content of r along with each σ i . f is not continuous, as even outputting the first data requires reading an infinite prefix when d 1 = d 2 .

Note that when a transducer is test-free, the existence of an accepting run over a given input x only depends on its finite labels. Hence, the existence of two outputs y and z which mismatch over data can be characterised by a simple pattern (Figure 6 ), which allows to decide functionality in polynomial time:

Theorem 23. Deciding whether a test-free NRT is functional is in PTime.

Proof. Let T be a test-free NRT such that T is not functional. Then, there exists x ∈ (Σ × D) ω , y, z ∈ (Γ × D) ω such that (x, y), (x, z) ∈ T and y = z. Then, let i be such that y[i] = z[i]. There are two cases. Either lab(y[i]) = lab(z[i]), which means that the finite transducer T obtained by ignoring the registers of T is not functional. By Proposition 8, such property can be decided in NLogSpace, so let us focus on the second case: ). Finally, for readability, we did not write that r should not be reassigned between j and l . Note that the position of i with regards to j, j , l and l does not matter; nor does the position of l w.r.t. l .

We here give a sketch of the proof: observe that an input x admits two outputs which mismatch over data if and only if it admits two runs which respectively store x[j] and x[j ] such that x[j] = x[j ] and output them later at the same output position i; the outputs y and z are then such that dt(y[i]) = dt(z[i]). Since T is test-free, the existence of two runs over the same input x only depends on its finite labels. Then, the registers containing respectively x[j] and x[j ] should not be reassigned before being output, and should indeed output their content at the same position i (cf Figure 6) . Besides, again because of test-freeness, we can always assume that x is such that x[j] = x[j ]. Overall, such pattern can be checked by a 2-counter Parikh automaton, whose emptiness is decidable in PTime [8] (under conditions that are satisfied here). Now, let us move to the case of continuity. Here again, the fact that test-free NRT conduct no test over the input data allows to focus on the only two registers that are responsible for the mismatch, the existence of an accepting run being only determined by finite labels.

Theorem 24. Deciding whether a test-free NRT defines a continuous function is in PTime.

Proof. Let T be a test-free NRT. First, it can be shown that T is continuous if and only if T has the pattern of Figure 7 , where r is coaccessible (since acceptance only depends on finite labels, T can be trimmed 3 in polynomial time). Fig. 7 . A pattern characterising non-continuity of functions defined by NRT, where we ask that there exist some states q f , q and r, where q f is accepting, as well as finite input data words u, v, z and finite output data words u , v , u , v , z such that mismatch(u , u )∨(v = ε ∧ mismatch(u , u z )). Register assignments are not depicted, as there are no conditions on them. We unrolled the loops to highlight the fact that they do not necessarily loop back to the same configuration.

Now, it remains to show that such simpler pattern can be checked in PTime. We treat each part of the disjunction separately: q, where q f ∈ F and mismatch(u , u ). Then, as shown in the proof of Theorem 23, there exists a mismatch between some u and u produced by the same input u if and only if there exists two runs and two registers r and r assigned at two distinct positions, and later on output at the same position. Such pattern can similarly be checked by a 2-counter Parikh automaton; the only difference is that here, instead of checking that the two end states are coaccessible with a common ω-word, we only need to check that q f ∈ F and that there is a synchronised loop over q f and q, which are regular properties that can be checked by the Parikh automaton with only a polynomial increase. , where q f ∈ F and mismatch(u , u z ). By examining again the proof of Theorem 23, it can be shown that to obtain a mismatch, it suffices that the input is the same for both runs only up to position max(j, j ). More precisely, there is a mismatch between u and u z if and only if there exists two registers r and r and two positions j, j ∈ {1, . . . , u } such that j = j , r is stored at position j, r is stored at position j , r and r are respectively output at input positions l ∈ {1, . . . , u } and l ∈ {1, . . . , uz } and they are not reassigned in the meantime. Again, such property, along with the fact that q f ∈ F and the existence of a synchronised loop can be checked by a 2-counter Parikh automaton of polynomial size.

Overall, deciding whether a test-free NRT is continuous is in PTime.

Transductions and Context-free Languages

Uniformization in Automata Theory

Deciding the computability of regular functions over infinite words

LTL with the freeze quantifier and register automata

Regular transformations of data words through origin information

Synthesis with identifiers

Synthesis of data word transducers

Path logics for querying graphs: Combining expressiveness and efficiency

On equivalence and uniformisation problems for finite transducers

A pattern logic for automata with outputs

Two decidability problems for infinite words

Degrees of lookahead in regular infinite games

Equivalence problems for mappings on infinite strings

Solving sequential conditions finite-state strategies

Finite-memory automata

Register-bounded synthesis

Bounded synthesis of register transducers

Regular expressions for data words

Finite state machines for strings over infinite alphabets

On the synthesis of a reactive module

How to decide continuity of rational functions on infinite words

), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use