key: cord-0043817-rac91hmt authors: Hospodár, Michal; Mlynárčik, Peter title: Operations on Permutation Automata date: 2020-05-26 journal: Developments in Language Theory DOI: 10.1007/978-3-030-48516-0_10 sha: 21a52a9e9829c2ef0229f12b0347a426d0ea0abf doc_id: 43817 cord_uid: rac91hmt We investigate the class of languages recognized by permutation deterministic finite automata. Using automata constructions and some properties of permutation automata, we show that this class is closed under Boolean operations, reversal, and quotients, and it is not closed under concatenation, power, Kleene closure, positive closure, cut, shuffle, cyclic shift, and permutation. We prove that the state complexity of Boolean operations, Kleene closure, positive closure, and right quotient on permutation languages is the same as in the general case of regular languages. Next, we get the tight upper bounds on the state complexity of concatenation ([Formula: see text]), square ([Formula: see text]), reversal ([Formula: see text]), and left quotient ([Formula: see text]; tight if [Formula: see text]). All our witnesses are unary or binary, and the binary alphabet is always optimal, except for Boolean operations in the case of [Formula: see text]. In the unary case, the state complexity of all considered operations is the same as for regular languages, except for quotients and cut. In case of quotients, it is [Formula: see text], and in case of cut, it is either [Formula: see text] or [Formula: see text], depending on whether there exists an integer [Formula: see text] with [Formula: see text] such that [Formula: see text]. A deterministic finite automaton (DFA) is said to be a permutation DFA if every input symbol induces a permutation on the state set. A language recognized by a permutation DFA is called a permutation language. The class of permutation languages has been introduced by Thierrin [22] who also proved some closure properties of this class using cancellative congruences. The aim of this paper is to study this class in detail, including the operational state complexity. The state complexity of a regular operation is the number of states sufficient and necessary in the worst case for a DFA to accept the language resulting from the operation, considered as a function of the numbers of states in DFAs for operands. The state complexity of basic regular operations was given by Maslov [15] and Yu et al. [23] . If the operands of an operation belong to some subclass of regular languages, then the complexity of this operation may be significantly smaller than in the general case. The operational state complexity in several subclasses of regular languages has been studied in the literature. Câmpeanu et al. [6] considered finite languages, Han and Salomaa [8, 9] investigated prefix-free and suffix-free languages, and Brzozowski et al. [3, 4] examined ideal and closed languages. The classes of co-finite, star-free, union-free, and unary languages have been studied as well [1, 5, 13, 18] . In this paper, we first study closure properties of the class of permutation languages. We provide alternative proofs to those in [22] ; our proofs use automata constructions and properties of permutation DFAs. Moreover, we prove that permutation languages are not closed under power, positive closure, cut, shuffle, cyclic shift, and permutation. Then we study the operational state complexity in the class of permutation languages. For each considered operation, we obtain tight upper bounds on its state complexity. Our witnesses are defined over unary or binary alphabets. In the case of left and right quotients, we need m ≤ n to prove tightness. We assume that the reader is familiar with the backgrounds in formal languages and automata theory. For details, the reader may refer to [11, 21] . For a rational number r, we denote the set {i ∈ Z | 0 ≤ i < r} by r. For example, we have 3 = {0, 1, 2}, and for an integer n we have n = {0, 1, . . . , n−1}. Next, we have 7 2 = {0, 1, 2, 3} and n 2 = {0, 1, . . . , n 2 − 1}. For a finite set S, its size is denoted by |S|, and its power-set by 2 S . Let Σ be a non-empty alphabet of symbols. Then Σ * denotes the set of all strings over Σ including the empty string ε. A language is any subset of Σ * . Given two languages K and L over Σ, the complement of L is the language L c = Σ * \L, and the operations of intersection, union, difference, and symmetric difference are defined as standard set operations. The concatenation of K and L is KL = {uv | u ∈ K and v ∈ L}. The k-th power of L is defined as A nondeterministic finite automaton with multiple initial states (MNFA) is a quintuple M = (Q, Σ, ·, I, F ) where Q is a finite non-empty set of states, Σ is a non-empty alphabet of input symbols, I ⊆ Q is the set of initial states, F ⊆ Q is the set of final states, and · : Q × Σ → 2 Q is the transition function which is naturally extended to the domain 2 Q × Σ * . For a state q, a set S, and a string w, we write qw and Sw instead of q · w and S · w if it does not cause any confusion. The language accepted by M is the set For states p and q and a symbol a, we write that M has a transition (p, a, q) if q ∈ pa. We also say that p has an out-transition on a and q has an in-transition on a. A state q is called a sink state if it has the transition (q, a, q) for each input symbol a and no other out-transitions. To omit a state means to remove it from the set of states and to remove all its in-transitions and out-transitions from the transition function. The reverse of an MNFA M is the MNFA M R obtained from M by reverting all transitions and swapping the role of initial and final states. If |I| = 1, we say that M is a nondeterministic finite automaton (NFA) and write (Q, Σ, ·, s, F ) instead of (Q, Σ, ·, {s}, F ). An NFA (Q, Σ, ·, s, F ) is called deterministic (DFA) if |qa| = 1 for each state q and each symbol a. Similarly, an MNFA with this property is said to be an MDFA. We write pa = q instead of pa = {q} and use the notation p a − → q. A DFA is minimal if its language cannot be accepted by any smaller DFA (with respect to number of states). It is well known that a DFA is minimal if and only if all its states are reachable and pairwise distinguishable. The state complexity of a language L, sc(L), is the number of states in a minimal DFA accepting L. Every MNFA M = (Q, Σ, ·, I, F ) can be converted to an equivalent deterministic finite automaton D(M ) = (2 Q , Σ, ·, I, {S | S ∩ F = ∅}) [19] . This DFA is called the subset automaton of M . The DFA D(M ) may not be minimal since some of its states may be unreachable or equivalent to other states. To prove distinguishability of states in subset automata, we use the following observation. To prove minimality of unary DFAs, we use the following lemma. A DFA A = (Q, Σ, ·, s, F ) is said to be a permutation DFA if for each p, q ∈ Q and each a ∈ Σ, p · a = q · a implies that p = q. A language is said to be a permutation language if it is recognized by a permutation DFA. In a permutation DFA, each input symbol a induces a permutation on Q, namely q → q · a. To describe transitions in permutation automata, we use the following notation. The formula a : (p 1 , p 2 , . . . , p k ) denotes that p i · a = p i+1 for i = 1, 2, . . . , k − 1, p k · a = p 1 , and q · a = q if q = p i for i = 1, 2, . . . , k. For example, we denote the maximal cyclic permutation on n by (0, 1, . . . , n−1), the transposition of 0 and 1 by (0, 1), and the identity by (0). The next observation shows that a language is a permutation language if and only if its minimal DFA is a permutation DFA. Proposition 3. Let A be a permutation DFA. Then the minimal DFA equivalent to A is a permutation DFA. Then there exist two distinct states p and q in B and an input symbol a in Σ such that p • a = q • a. Since B is minimal, the states p and q are reachable from s B by some strings u and v, respectively. Let p = s A · u and q = s A · v be the corresponding states in A. Since B is minimal, the states p and q are not equivalent, and therefore also p and q are not equivalent. However, p ·a and q ·a are equivalent since p • a = q • a. Moreover, for every positive i, the states p · a i and q · a i are equivalent as well. Since A is a permutation DFA, there exist positive integers k and such that p · a k = p and q · a = q . Then p · a k = p and q · a k = q . It follows that p and q are equivalent, a contradiction. Notice that the converse does not hold; for example, the minimal unary DFA for a * is a permutation DFA, however, this language is accepted by the two-state DFA (2, {a}, ·, 0, 2) with 0 · a = 1 and 1 · a = 1 which is not a permutation DFA. We use the following corollary of Proposition 3 without citing it. Proof. Let A have a reachable sink state. Let B be the minimal DFA for L(A). Then B must have a reachable sink state d. Since L(A) / ∈ {∅, Σ * }, the state d is reached from some other reachable state q by a symbol a. Then q · a = d · a, hence B is not a permutation automaton, and the lemma follows. A MDFA (Q, Σ, ·, I, F ) is said to be a permutation MDFA if each input symbol a in Σ induces a permutation on Q. Proof. Let M = (Q, Σ, ·, I, F ) be a permutation MDFA. Let us show that the subset automaton D(M ) is a permutation DFA. Let S and T be subsets of Q and assume that Sa = T a. Since M is a permutation automaton, we have |S| = |Sa| = |T a| = |T |. If Sa is empty, then S and T are empty as well. Otherwise, let |Sa| = k and let Sa = {p 1 , p 2 , . . . , p k }. For each p i in Sa there is a state s i in S and a state t i in T such that s i · a = p i = t i · a. This means that s i = t i . Moreover, if i = j, then p i = p j , and therefore s i = s j and t i = t j since M is a permutation MDFA. It follows that S = {s 1 , s 2 , . . . , s k } = {t 1 , t 2 , . . . , t k } = T , so the subset automaton D(M ) is a permutation DFA. In this section, we study the closure properties of the class of permutation languages. The results for Boolean operations, concatenation, Kleene closure, reversal, and quotients have been already obtained by Thierrin [22] using cancellative congruences; here we provide alternative proofs using automata constructions and properties of permutation automata. Moreover, we consider the operations of power, positive closure, cut, shuffle, cyclic shift, and permutation. Proof. (a) Let K and L be accepted by permutation DFAs A and B, respectively. The language L c is accepted by a DFA with the same transitions as in B, hence it is a permutation language. Next, intersection, union, difference, and symmetric difference are accepted by a product automaton with appropriately defined final states. The resulting product automaton is a permutation DFA. The language L R is accepted by a permutation MDFA obtained from B by reversing all transitions and by swapping the roles of initial and final states, and it is a permutation language by Lemma 6. If A = (Q, Σ, ·, s, F ), then the right quotient KL −1 is accepted by a permutation DFA obtained from A by making all states in the set {q | q · w ∈ F for some w in L} final. The left quotient L −1 K is accepted by a permutation MDFA obtained from A by making all states in {s · w | w ∈ L} initial, so it is a permutation language by Lemma 6. (b) Let k ≥ 2 and K = L = a(aa) * . Then K and L are unary permutation languages and we have KL = K ! L = K L = aa(aa) * and L k = a k (aa) * , which are not permutation languages. (c) Let L = aa(aaa) * . Then L is a unary permutation language and we have L * = {ε, aa} ∪ a 4 a * and L + = {aa} ∪ a 4 a * , which are co-finite and different from a * . By Lemma 5, they are not permutation languages. (d) Let A = (3, {a, b}, ·, 0, {0, 2}) be a permutation DFA with a : (0, 1, 2) and b : (0, 1). Construct an MNFA with ε-transitions for shift(L(A)) as described in [14, p. 340 ]. In the corresponding subset automaton, the initial subset is I = {q 00 , q 11 In this section, we study the state complexity of operations on permutation languages. Let us start with Boolean operations. We continue with concatenation, power, Kleene and positive closure, and reversal. We first consider the unary case, then we deal with a general alphabet. We construct an n-state NFA for L + by adding the transitions (q, a, s) whenever q · a ∈ F . In the corresponding subset automaton, the initial subset is {s} and there exists an integer t such that t ≤ n − 1 and |{s} · a t | ≥ 2. Since A is a cyclic unary DFA, we cannot reach the singleton set {s} from any other state of the subset automaton. It follows that to get a DFA for L * , we only need to mark the initial state {s} as final. It is shown in [23, Proof of Theorem 5.3] that the subset automaton for L + has at most (n − 1) 2 + 1 reachable states, and that this upper bound is met by the permutation language (aa) * if n = 2 and a n−1 (a n ) * if n ≥ 3. (d) The upper bound n is met by the permutation language (a n ) * . of · A and · B and moreover the transitions (q, a, s B ) whenever q · A a ∈ F A . In the corresponding subset automaton, only the states of the form {q}∪S with q ∈ Q A and S ⊆ Q B are reachable; denote such a state by (q, S). Moreover, if q ∈ F A , then s B ∈ S. Next, since B is a permutation automaton, we have Q B · w = Q B for every string w in Σ * . It follows that every string is accepted from (p, Q B ) for each p in Q A , and all these m states are equivalent to each other. This gives the upper bound (m − k)2 n + k2 n−1 − m + 1 ≤ m2 n − 2 n−1 − m + 1. (b) If the DFA for L has a unique final state which is initial, then L 2 = L. Otherwise, let L be accepted by a permutation DFA A with the initial state s and at least one final state f with f = s. Construct the NFA N for L 2 = LL as described in case (a). If the initial state s of A is final, then the initial state of the subset automaton D (N ) is (s, {s}) . It follows that for each reachable state (q, S) of D(N ), we have q ∈ S. Moreover, if q is final, then s ∈ S. In total, we get at most 2 n−1 + 2 n−2 + (n − 2)2 n−1 = n2 n−1 − 2 n−2 reachable states. Now assume that s is not final. Let us show that no state (q, S) with q ∈ S is reachable in D(N ). Since s is not final, the initial state of D(N ) is (s, ∅) with s / ∈ ∅. Now let (q, S) be any reachable state with q / ∈ S. Consider the state (p, T ) = (q, S)a for any input symbol a. Assume for a contradiction that p ∈ T . It follows that qa = p and either there is a state r in S such that ra = p, or p = s and p is final. In the former case, we get a contradiction with the fact that A is a permutation DFA. The latter case cannot occur since s is not final. It follows by induction that for every reachable state (q, S) we have q / ∈ S. In total, taking into account that s ∈ S whenever q is final, we get at most (n − 1)2 n−1 + 2 n−2 reachable states. This gives the upper bound. (1) i = n − 1. Then 0 ∈ S and n − 1 / ∈ S. Take S = (S \ {0}) · a n−1 . Then n − 2 / ∈ S , so (n − 2, S ) is reachable by induction and it is sent to (n − 1, S) by a. (2) i = 0 and 1 ∈ S. Take S = S · a n−1 . Since 0 / ∈ S, we have n − 1 / ∈ S , so (n − 1, S ) is reachable as shown in case (1) and it is sent to (0, S) by a. (3) i = 0 and min S > 1. Take S = S · a n−min S+1 . Then 0 / ∈ S and 1 ∈ S , so (0, S ) is reachable as shown in case (2) and it is sent to (0, S) by (ab) min S−1 . (4) 1 ≤ i ≤ n − 2. Take S = S · a n−i . Since i / ∈ S, we have 0 / ∈ S , so the set (0, S ) is reachable as shown in case (1) or (2) and it is sent to (i, S) by a i . Now we prove distinguishability. Let (i, S) and (j, T ) be two distinct states of the subset automaton with i / ∈ S and j / ∈ T . First let S = T , and without loss of generality, assume that s ∈ S \ T . Then the string a n−1−s is accepted from S and rejected from T . Now let S = T , and without loss of generality, let i < j. We have i, j / ∈ S. Consider the string a n−j (ab) j−i−1 . Let S = S · a n−j . Since j / ∈ S, we have 0 / ∈ S . Next, Since 0 / ∈ S , we have 0 / ∈ (S ∪ {1}) · (ab) j−i−1 ; notice that the only state which is set to 0 by a string in (ab) * is 0. This means that the resulting states differ in the second component, therefore (i, S) and (j, S) are distinguishable. This completes our proof for square. (c) The upper bound for Kleene closure is the same as for regular languages. For tightness, consider the n-state DFA A = (n, {a, b}, ·, 0, {n − 1}) with a : (0, 1, . . . , n − 1) and b : (1, 2, . . . , n − 2) . Notice that this DFA, shown in Fig. 2 , differs from the DFA in [17, Figure 4 ] just by having the unique final state n − 1 and by having the transition (n − 2, b, 1) instead of (n − 2, b, 0) . The language L(A) * is accepted by MNFA ({q 0 } ∪ n, {a, b}, •, {q 0 , 0}, {q 0 , n − 1}) where • has the same transitions as · and moreover we have 0 ∈ (n − 2) • a and 0 ∈ (n − 1) We can show by induction on the size of sets that each set in R is reachable in the corresponding subset automaton. We also can prove that the sets in R are pairwise distinguishable. To get an NFA for L + , we only remove the state q 0 from the MNFA for L * described above. The corresponding subset automaton has at most (3/4)2 n − 1 reachable states, and this upper bound is met by our witness for Kleene closure. (d) If L is accepted by an n-state permutation DFA A, then L R is accepted by a permutation MDFA A R obtained from A by reversing all transitions and by swapping the roles of initial and final states. We have sc(L R ) ≤ n n/2 and this upper bound is met by the language accepted by the permutation DFA (n, {a, b}, ·, 0, n 2 ) with a : (n − 1, n − 2, . . . , 0) and b : (0, 1) by Lemma 7. Notice that for concatenation, the upper bound is almost the same as for regular languages, while for square, the bound is exactly one half of the upper bound in the general case. In the next theorem, we consider quotients. In the unary case, we have L −1 K = KL −1 . Therefore sc(L −1 K) ≤ m. Let us show that sc(KL −1 ) ≤ n. The languages K and L are periodic; let their periods be k and , respectively. By construction, the quotient KL −1 has period k. If a i+ ∈ KL −1 , then a i ∈ KL −1 since L has period . Let a i ∈ KL −1 . Then a i w ∈ K for some w ∈ L. It follows that a i a a (k−1) w ∈ K since K has period k. Next, a (k−1) w ∈ L since L has period . It follows that a i+ ∈ KL −1 . Thus KL −1 has period , so sc(KL −1 ) ≤ ≤ n. This gives the upper bound min{m, n} which is met by K = L = (a min{m,n} ) * . Finally, we consider the cut operation on permutation languages. We briefly recall the construction of a DFA for the cut operation, cf. [2, p. 74 ], [7, p. 91] or [10, p. 193 ]. For two languages accepted by DFAs A and B with m and n states, respectively, we can construct the cut automaton A ! B as follows. The cut automaton has states in a grid with a row for every state of A and a column for every state of B, and one additional column which we denote by ⊥. The column ⊥ corresponds to the situation that we have not read a string in L(A) yet. When we reach a final state in A for the first time, we leave the column ⊥ and enter the product part of A ! B in the corresponding state. Unless we reach a final state of A again, the transitions in A ! B are the same as in the product automaton A × B. When we reach a final state in A again, we reset to a state in the column corresponding to the initial state of B. The final states of A ! B are all states in columns corresponding to the final states of B. ∈ L(A) and s = (s A , s B ) otherwise, and for each state (p, q) in Q and each input symbol a in Σ, we have The state complexity of cut in the general case is already solved in [7] where binary permutation languages were used as witnesses. The unary case is more interesting, as shown in the next theorem. Let m ≥ 1 and n ≥ 3. Let K and L be languages accepted by permutation DFAs with m and n states, respectively. Then sc(K ! L) ≤ (m − 1)n + m, and this upper bound is met by binary languages. If K and L are unary, then sc(K ! L) ≤ 2m−1. This bound is tight if m mod = 0 for some with 2 ≤ ≤ n, otherwise the tight upper bound is 2m − 2. Proof. If K = ∅ or L = ∅, then K ! L = ∅, so sc(K ! L) = 1. Let K and L be nonempty. In the general case, the witness languages from [7, Proof of Theorem 3.1] are accepted by permutation DFAs and they meet the upper bound (m−1)n+m. In the unary case, since each non-empty permutation language is infinite, the upper bound is 2m − 1 by [7, Proof of Theorem 3.2]. First assume that there exists with 2 ≤ ≤ n such that m mod = 0. Let K = a m−1 (a m ) * and L = a (m−1) mod (a ) * be unary languages recognized by permutation DFAs A and B of m and Since the loop in B is of length , we have (i, j) a − → (i + , j) for each state (i, j) in the loop; here i + is modulo m. The states (i, j) and (i + , j) have the same finality since all states with j in the second component have the same We examined the class of permutation languages, that is, the languages that are recognized by deterministic finite automata in which every input symbol induces a permutation on the state set. We used automata constructions and properties of permutation automata to show that the class of permutation languages is closed under Boolean operations, reversal, and left and right quotient, and it is not closed under concatenation, power, positive closure, Kleene closure, cut, shuffle, cyclic shift, and permutation. We also studied the state complexity of operations on permutation languages. Our results are summarized in Table 1 . The table also displays the size of alphabet used to describe witnesses. All our witnesses are described over a unary or binary alphabet, and the binary alphabet is always optimal except for the Boolean operations in case of gcd(m, n) = 1. We did not consider the state complexity of shuffle and cyclic shift and leave them for the future work. Although the class of regular languages is not closed under the permutation operation, we conjecture that the permutation of a permutation language is always regular. If this is the case, then the state complexity of this operation is of great interest to us as well. Complexity of operations on cofinite languages Cuts in regular expressions Quotient complexity of ideal languages Quotient complexity of closed languages Quotient complexity of star-free languages State complexity of basic operations on finite languages Tight bounds for cutoperations on deterministic finite automata State complexity of basic operations on suffix-free regular languages State complexity of prefix-free regular languages The range of state complexities of languages resulting from the cut operation Introduction to Automata Theory, Languages, and Computation NFA-to-DFA trade-off for regular operations Complexity in union-free regular languages State complexity of cyclic shift Estimates of the number of states of finite automata Average state complexity of operations on unary automata Kleene closure and state complexity Unary language operations, state complexity and Jacobsthal's function Finite automata and their decision problems The state complexity of L 2 and L k Introduction to the Theory of Computation Permutation automata The state complexities of some basic operations on regular languages