key: cord-0043273-v4bjjpku
authors: Grosshans, Nathan
title: The Power of Programs over Monoids in [Image: see text]
date: 2020-01-07
journal: Language and Automata Theory and Applications
DOI: 10.1007/978-3-030-40608-0_22
sha: 197b37054be5795cd35144207fa0b380c66ea37d
doc_id: 43273
cord_uid: v4bjjpku

The model of programs over (finite) monoids, introduced by Barrington and Thérien, gives an interesting way to characterise the circuit complexity class [Image: see text] and its subclasses and showcases deep connections with algebraic automata theory. In this article, we investigate the computational power of programs over monoids in [Image: see text], a small variety of finite aperiodic monoids. First, we give a fine hierarchy within the class of languages recognised by programs over monoids from [Image: see text], based on the length of programs but also some parametrisation of [Image: see text]. Second, and most importantly, we make progress in understanding what regular languages can be recognised by programs over monoids in [Image: see text]. We show that those programs actually can recognise all languages from a class of restricted dot-depth one languages, using a non-trivial trick, and conjecture that this class suffices to characterise the regular languages recognised by programs over monoids in [Image: see text].

In computational complexity theory, many hard still open questions concern relationships between complexity classes that are expected to be quite small in comparison to the mainstream complexity class P of tractable languages. One of the smallest such classes is NC 1 , the class of languages decided by Boolean circuits of polynomial length, logarithmic depth and bounded fan-in, a relevant and meaningful class, that has many characterisations but whose internal structure still mostly is a mystery. Indeed, among its most important subclasses, we count AC 0 , CC 0 and ACC 0 : all of them are conjectured to be different from each other and strictly within NC 1 , but despite many efforts for several decades, this could only be proved for the first of those classes.

In the late eighties, Barrington and Thérien [3] , building on Barrington's celebrated theorem [2] , gave an interesting viewpoint on those conjectures, relying on algebraic automata theory. They defined the notion of a program over a monoid M : a sequence of instructions (i, f ), associating through function f some element of M to the letter at position i in the input of fixed length. In that way, the program outputs an element of M for every input word, by multiplying out the elements given by the instructions for that word; acceptance or rejection then depends on that outputted element. A language of words of arbitrary length is consequently recognised in a non-uniform fashion, by a sequence of programs over some fixed monoid, one for each possible input length; when that sequence is of polynomial length, it is said that the monoid p-recognises that language. Barrington and Thérien's discovery is that NC 1 and almost all of its significant subclasses can each be exactly characterised by p-recognition over monoids taken from some suitably chosen variety of finite monoids (a class of finite monoids closed under basic operations on monoids). For instance, NC 1 , AC 0 , CC 0 and ACC 0 correspond exactly to p-recognition by, respectively, finite monoids, finite aperiodic monoids, finite solvable groups and finite solvable monoids. Understanding the internal structure of NC 1 thus becomes a matter of understanding what finite monoids from some particular variety are able to p-recognise.

It soon became clear that regular languages play a central role in understanding p-recognition: McKenzie, Péladeau and Thérien indeed observed [12] that finite monoids from a variety V and a variety W p-recognise the same languages if and only if they p-recognise the same regular languages. Otherwise stated, most conjectures about the internal structure of NC 1 can be reformulated as a statement about where one or several regular languages lie within that structure. This is why a line of previous works got interested into various notions of tameness, capturing the fact that for a given variety of finite monoids, p-recognition does not offer much more power than classical morphism-recognition when it comes to regular languages (see [8, 10, 11, 13, 14, [20] [21] [22] ). This paper is a contribution to an ongoing study of what regular languages can be p-recognised by monoids taken from "small" varieties, started with the author's Ph.D. thesis [7] . In a previous paper by the author with McKenzie and Segoufin [8] , a novel notion of tameness was introduced and shown for the "small" variety of finite aperiodic monoids DA. This allowed them to characterise the class of regular languages p-recognised by monoids from DA as those recognised by so called quasi-DA morphisms and represented a first small step towards a new proof that the variety A of finite aperiodic monoids is tame. This is a statement equivalent to Furst's, Saxe's, Sipser's [6] and Ajtai's [1] well-known lower bound result about AC 0 . In [8] , the authors also observed that, while DA "behaves well" with respect to p-recognition of regular languages, the variety J, a subclass of DA, does, in contrast, "behave badly" in the sense that monoids from J do p-recognise regular languages that are not recognised by quasi-J morphisms. Now, J is a well-studied and fundamental variety in algebraic automata theory (see, e.g., [15, 16] ), corresponding through classical morphism-recognition to the class of regular languages in which membership depends on the presence or absence of a finite set of words as subwords. This paper is a contribution to the understanding of the power of programs over monoids in J, a knowledge that certainly does not bring us closer to a new proof of the tameness of A (as we are dealing with a strict subvariety of DA), but that is motivated by the importance of J in algebraic automata theory and the unexpected power of programs over monoids in J. The results we present in this article are twofold: first, we exhibit a fine hierarchy within the class of languages p-recognised by monoids from J, depending on the length of those programs and on a parametrisation of J; second, we show that a whole class of regular languages, that form a subclass of dot-depth one languages [15] , are p-recognised by monoids from J while, in general, they are not recognised by any quasi-J morphism. This class roughly corresponds to dot-depth one languages where detection of a given factor does work only when it does not appear too often as a subword. We actually even conjecture that this class of languages with additional positional modular counting (that is, letters can be differentiated according to their position modulo some fixed number) corresponds exactly to all those p-recognised by monoids in J, a statement that is interesting in itself for algebraic automata theory.

Organisation of the Paper. Following the present introduction, Sect. 2 is dedicated to the necessary preliminaries. In Sect. 3, we present the results about the fine hierarchy and in Sect. 4 we expose the results concerning the regular languages p-recognised by monoids from J. Section 5 gives a short conclusion.

Note. This article is based on unpublished parts of the author's Ph.D. thesis [7] .

We assume the reader is familiar with the basics of formal language theory, semigroup theory and recognition by morphisms, that we might designate by classical recognition; for those, we only specify some things and refer the reader to the two classical references of the domain by Eilenberg [4, 5] and Pin [16] .

General Notations and Conventions. Let i, j ∈ N. We shall denote by [[i, j] ] the set of all n ∈ N verifying i ≤ n ≤ j. We shall also denote by [i] the set [ [1, i] ]. Given some set E, we shall denote by P(E) the powerset of E. All our alphabets and words will always be finite; the empty word will be denoted by ε.

Varieties and Languages. A variety of monoids is a class of finite monoids closed under submonoids, Cartesian product and morphic images. A variety of semigroups is defined similarly. When dealing with varieties, we consider only finite monoids and semigroups, each having an idempotent power, a smallest ω ∈ N >0 such that x ω = x 2ω for any element x. To give an example, the variety of finite aperiodic monoids, denoted by A, contains all finite monoids M such that, given ω its idempotent power,

To each variety V of monoids or semigroups we associate the class L(V) of languages such that, respectively, their syntactic monoid or semigroup belongs to V. For instance, L(A) is well-known to be the class of star-free languages.

Quasi V Languages. If S is a semigroup we denote by S 1 the monoid S if S is already a monoid and S ∪ {1} otherwise.

The following definitions are taken from [17] . Let ϕ be a surjective morphism from Σ * to a finite monoid M . For all k consider the subset ϕ(Σ k ) of M (where Σ k is the set of words over Σ of length k). As M is finite there is a k such that ϕ(Σ 2k ) = ϕ(Σ k ). This implies that ϕ(Σ k ) is a semigroup. The semigroup given by the smallest such k is called the stable semigroup of ϕ. If S is the stable semigroup of ϕ, S 1 is called the stable monoid of ϕ. If V is a variety of monoids or semigroups, then we shall denote by QV the class of such surjective morphisms whose stable monoid or semigroup, respectively, is in V and by L(QV) the class of languages whose syntactic morphism is in QV.

Programs over Monoids. Programs over monoids form a non-uniform model of computation, first defined by Barrington and Thérien [3] , extending Barrington's permutation branching program model [2] . Let M be a finite monoid and Σ an alphabet. A program P over M on Σ n is a finite sequence of instructions of the

The length of P , denoted by |P |, is the number of its instructions. The program P defines a function from Σ n to M as follows. On input w ∈ Σ n , each instruction (i, f ) outputs the monoid element f (w i ). A sequence of instructions then yields a sequence of elements of M and their product is the output P (w) of the program. A language L ⊆ Σ n is consequently recognised by P whenever there exists

A language L over Σ is recognised by a sequence of programs (P n ) n∈N over some finite monoid M if for each n, the program P n is on Σ n and recognises

For s : N → N and V a variety of monoids, we denote by P(V, s(n)) the class of languages recognised by sequences of programs over monoids in V of length at most s(n). The class P(V) = k∈N P V, n k is then the class of languages p-recognised by a monoid in V, i.e. recognised by sequences of programs over monoids in V of polynomial length.

The following is an important property of P(V).

Given two alphabets Σ and Γ , a Γ -program on Σ n for n ∈ N is defined just like a program over some finite monoid M on Σ n , except that instructions output letters from Γ and thus that the program outputs words over Γ . Let now L ⊆ Σ * and K ⊆ Γ * . We say that L program-reduces to K if and only if there exists a sequence (Ψ n ) n∈N of Γ -programs (the program-reduction) such that Ψ n is on Σ n and L =n = Ψ −1 n (K =|Ψn| ) for each n ∈ N. The following proposition shows closure of P(V) also under program-reductions.

Proposition 2 ([7, Proposition 3.3.12 and Corollary 3.4.3]). Let Σ and Γ be two alphabets. Let V be a variety of monoids. Given K ⊆ Γ * in P(V, s(n)) for s : N → N and L ⊆ Σ * from which there exists a program-reduction to K of length t(n), for t : N → N, we have that L ∈ P(V, s(t(n))). In particular, when K is recognised (classically) by a monoid in V, we have that L ∈ P(V, t(n)).

We won't introduce any of the proposed notions of tameness but will only state that the main consequence for a variety of monoids V to be tame in the sense of [8] is that P(V) ∩ Reg ⊆ L(QV). This consequence has far-reaching implications from a computational-complexity-theoretic standpoint when P(V) happens to be equal to a circuit complexity class. For instance, tameness for A implies that P(A) ∩ Reg ⊆ L(QA), which is equivalent to the fact that AC 0 does not contain the language MOD m of words over {0, 1} containing a number of 1s not divisible by m for any m ∈ N, m≥ 2 (a central result in complexity theory [1, 6] ).

Let us now define the variety of monoids J. A finite monoid M of idempotent power ω belongs to J if and only if (xy) ω = (xy) ω x = y(xy) ω for all x, y ∈ M . It is a strict subvariety of the variety DA, containing all finite monoids M of idempotent power ω such that (xy) ω = (xy) ω x(xy) ω for all x, y ∈ M , itself a strict subvariety of A. The variety J is a "small" one, well within A.

We now give some specific definitions and results about J that we will use, based essentially on [9] , but also on [16, Chapter 4, Section 1].

For some alphabet Σ and each k ∈ N, let us define the equivalence relation ∼ k on Σ * by u ∼ k v if and only if u and v have the same set of k-subwords (subwords of length at most k), for all u, v ∈ Σ * . The relation ∼ k is a congruence of finite index on Σ * . For an alphabet Σ and a word u ∈ Σ * , we shall write u Σ * for the language of all words over Σ having u as a subword. In the following, we consider that has precedence over ∪ and ∩ (but of course not over concatenation). We define the class of piecewise testable languages PT as the class of regular languages such that for every alphabet Σ, we associate to Σ * the set PT (Σ * ) of all languages over Σ that are Boolean combinations of languages of the form u Σ * where u ∈ Σ * . In fact, PT (Σ * ) is the set of languages over Σ equal to a union of ∼ k -classes for some k ∈ N (see [18] ). Simon showed [18] that a language is piecewise testable if and only if its syntactic monoid is in J, i.e. PT = L(J).

We can define a hierarchy of piecewise testable languages in a natural way. For k ∈ N, let the class of k-piecewise testable languages PT k be the class of regular languages such that for every alphabet Σ, we associate to Σ * the set PT k (Σ * ) of all languages over Σ that are Boolean combinations of languages of the form u Σ * where u ∈ Σ * with |u| ≤ k. We then have that PT k (Σ * ) is the set of languages over Σ equal to a union of ∼ k -classes. Let us define J k the inclusion-wise smallest variety of monoids containing the quotients of Σ * by ∼ k for any alphabet Σ: we have that a language is k-piecewise testable if and only if its syntactic monoid belongs to J k , i.e. PT k = L(J k ). (See [9, Section 3].)

The first part of our investigation of the computational power of programs over monoids in J concerns the influence of the length of programs on their computational capabilities.

We say two programs over a same monoid on the same set of input words are equivalent if and only if they recognise the same languages. Tesson and Thérien proved in [23] that for any monoid M in DA, there exists some k ∈ N such that for any alphabet Σ there is a constant c ∈ N >0 verifying that any program over M on Σ n for n ∈ N is equivalent to a program over M on Σ n of length at most c · n k . Since J ⊂ DA, any monoid in J does also have this property. However, this does not imply that there exists some k ∈ N working for all monoids in J, i.e. that P(J) collapses to P J, n k .

In this section, we show on the one hand that, as for DA, while P(J, s(n)) collapses to P(J) for any super-polynomial function s : N → N, there does not exist any k ∈ N such that P(J) collapses to P J, n k ; and on the other hand that P(J k ) does optimally collapse to P J k , n k/2 for each k ∈ N.

Given k, n ∈ N, we say that σ is a k-selector over n if σ is a function of P([n]) [n] k that associates a subset of [n] to each vector in [n] k . For any sequence Δ = (σ n ) n∈N such that σ n is a k-selector over n for each n ∈ N-a sequence we will call a sequence of k-selectors-, we set L Δ = n∈N K n,σn , where for each n ∈ N, the language K n,σn is the set of words over {0, 1} of length (k + 1) · n that can be decomposed into k + 1 consecutive blocks u (1) , u (2) , . . . , u (k) , v of n letters where the first k blocks each contain 1 exactly once and uniquely define a vector ρ in [n] k , where for all i ∈ [k], ρ i is given by the position of the only 1 in u (i) (i.e. u (i) ρi = 1) and v is such that there exists j ∈ σ n (ρ) verifying that v j is 1. Observe that for any k-selector σ 0 over 0, we have K 0,σ0 = ∅.

We now proceed similarly to what has been done in Subsection 5.1 in [8] to show, on one hand, that for all k ∈ N, there is a monoid M k in J 2k+1 such that for any sequence of k-selectors Δ, the language L Δ is recognised by a sequence of programs over M k of length at most n k+1 ; and, on the other hand, that for all k ∈ N there is a sequence of k-selectors Δ such that for any finite monoid M and any sequence of programs (P n ) n∈N over M of length at most n k , the language L Δ is not recognised by (P n ) n∈N .

We obtain the following proposition.

Proposition 3. For all k ∈ N, we have P J, n k ⊂ P J, n k+1 . More precisely, for all k ∈ N and d ∈ N, d ≤ k 2 − 1, we have P J k , n d ⊂ P J k , n d+1 .

Looking at Proposition 3, it looks at first glance rather strange that, for each k ∈ N, we can only prove strictness of the hierarchy inside P(J k ) up to exponent k 2 . We now show, in a way similar to Subsection 5.2 in [8] , that in fact P(J k ) does collapse to P J k , n k/2 for all k ∈ N, showing Proposition 3 to be optimal in some sense. Proposition 4. Let k ∈ N. Let M ∈ J k and Σ be an alphabet. Then there exists a constant c ∈ N >0 such that any program over M on Σ n for n ∈ N is equivalent to a program over M on Σ n of length at most c · n k/2 .

In particular, P(J k ) = P J k , n k/2 for all k ∈ N.

The second part of our investigation of the computational power of programs over monoids in J is dedicated to understanding exactly what regular languages can be p-recognised by monoids in J.

It is shown in [8] that P(J) ∩ Reg L(QJ), thus giving an example of a wellknown subvariety of A for which p-recognition allows to do unexpected things when recognising a regular language. How far does this unexpected power go?

The first thing to notice is that, though none of them is in L(QJ), all languages of the form Σ * u and uΣ * for Σ an alphabet and u ∈ Σ + are in P(J). Indeed, each of them can be recognised by a sequence of constant-length programs over the syntactic monoid of u Σ * : for every input length, just output the image, through the syntactic morphism of u Σ * , of the word made of the |u| first or last letters. So, informally stated, programs over monoids in J can check for some constant-length beginning or ending of their input words.

But they can do much more. Indeed, the language (a+b) * ac + does not belong to L(QJ) (compute the stable monoid), yet it is in P(J). The crucial insight is that it can be program-reduced in linear length to the piecewise testable language of all words over {a, b, c} having ca as a subword but not the subwords cca, caa and cb by using the following trick (that we shall call "feedback-sweeping") for input length n ∈ N: read the input letters in the order 2, 1, 3, 2, 4, 3, 5, 4, . . . , n, n− 1, output the letters read. This has already been observed in [8, Proposition 5] .

Using variants of the "feedback-sweeping" reading technique, we can prove that the phenomenon just described is not an isolated case. Lemma 2. The languages (a + b) * ac + , (a + b) * ac + a(a + b) * , c + a(a + b) * ac + , (a + b) * bac + and (a + b) * ac + (a + b) * ac + do all belong to P(J) \ L(QJ).

Hence, we are tempted to say that there are "much more" regular languages in P(J) than just those in L(QJ), even though it is not clear to us whether L(QJ) ⊆ P(J) or not. But can we show any upper bound on P(J) ∩ Reg? It turns out that we can, relying on two known results.

First, since J ⊆ DA, we have P(J) ⊆ P(DA), so Theorem 6 in [8] , that states P(DA) ∩ Reg = L(QDA), implies that P(J) ∩ Reg ⊆ L(QDA).

Second, let us define an important superclass of the class of piecewise testable languages. Let Σ be an alphabet and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ); we define [u 1 , . . . , u k ] = Σ * u 1 Σ * · · · Σ * u k Σ * . The class of dot-depth one languages is the class of Boolean combinations of languages of the form Σ * u, uΣ * and [u 1 , . . . , u k ] for Σ an alphabet, k ∈ N >0 and u, u 1 , . . . , u k ∈ Σ + . The inclusion-wise smallest variety of semigroups containing all syntactic semigroups of dot-depth one languages is denoted by J * D and verifies that L(J * D) is exactly the class of dot-depth one languages. (See [11, 15, 19] .) It has been shown in [11, Corollary 8] that P(J * D) ∩ Reg = L(Q(J * D)) (if we extend the program-over-monoid formalism in the obvious way to finite semigroups). Now, we have J ⊆ J * D, so that P(J) ⊆ P(J * D) and hence P(J) ∩ Reg ⊆ L(Q (J  *  D) ).

To summarise, we have the following. *  D) ).

In fact, we conjecture that the inverse inclusion does also hold. *  D) ).

Why do we think this should be true? Though, for a given alphabet Σ, we cannot decide whether some word u ∈ Σ + of length at least 2 appears as a factor of any given word w in Σ * with programs over monoids in J (because Σ * uΣ * / ∈ L(QDA)), Lemma 2 and the possibilities offered by the "feedback-sweeping" technique give the impression that we can do it when we are guaranteed that u appears at most a fixed number of times in w, which seems somehow to be what dot-depth one languages become when restricted to belong to L(QDA). This intuition motivates the definition of threshold dot-depth one languages.

The idea behind the definition of threshold dot-depth one languages is that we take the basic building blocks of dot-depth one languages, of the form [u 1 , . . . , u k ] for an alphabet Σ, for k ∈ N >0 and u 1 , . . . , u k ∈ Σ + , and restrict them so that, given l ∈ N >0 , membership of a word does really depend on the presence of a given word u i as a factor if and only if it appears less than l times as a subword. Definition 1. Let Σ be an alphabet. For all u ∈ Σ + and l ∈ N >0 , we define [u] l to be the language of words over Σ containing u l as a subword or u as a factor, i.e. [u] l = Σ * uΣ * ∪ u l Σ * . Then, for all u 1 , . . . , u k ∈ Σ + (k ∈ N, k ≥ 2) and l ∈ N >0 , we define [u 1 , . . . ,

Obviously, for each Σ an alphabet, k ∈ N >0 and u 1 , . . . , u k ∈ Σ + , the language [u 1 , . . . , u k ] 1 equals u 1 · · · u k Σ * . Over {a, b, c}, the language [ab, c] 3 contains all words containing a letter c verifying that in the prefix up to that letter, ababab appears as a subword or ab appears as a factor. Finally, the lan-

We then define a threshold dot-depth one language as any Boolean combination of languages of the form Σ * u, uΣ * and [u 1 , . . . , u k ] l for Σ an alphabet, for k, l ∈ N >0 and u, u 1 , . . . , u k ∈ Σ + .

Confirming the intuition briefly given above, the technique of "feedbacksweeping" can indeed be pushed further to prove that the whole class of threshold dot-depth one languages is contained in P(J), and we dedicate the remainder of this section to prove it. Concerning Conjecture 1, our intuition leads us to believe that, in fact, the class of threshold dot-depth one languages with additional positional modular counting is exactly L(QDA) ∩ L(Q(J * D)). We simply refer the interested reader to Section 5.4 of the author's Ph.D. thesis [7] , that contains a partial result supporting this belief, too technical and long to be presented here.

Let us now move on to the proof of the following theorem.

As P(J) is closed under Boolean operations (Proposition 1), our goal is to prove, given an alphabet Σ, given l ∈ N >0 and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ), that [u 1 , . . . , u k ] l is in P(J); the case of Σ * u and uΣ * for u ∈ Σ + is easily handled (see the discussion at the beginning of Subsect. 4.1). To do this, we need to put [u 1 , . . . , u k ] l in some normal form. It is readily seen that [u 1 , . . . ,

Definition 2. Let Σ be an alphabet.

Building directly a sequence of programs over a monoid in J that decides L (l) (u1,q1) · · · L (l) (u k ,q k ) for some alphabet Σ and q 1 , . . . , q k ∈ {1, l} seems however tricky. We need to split things further by controlling precisely how many times each u i for i ∈ [k] appears in the right place when it does less than l times. To do this, we consider, for each α ∈ [l] k , the language R α l (u 1 , . . . , u k ) defined below.

Now, for a given α ∈ [l] k , we are interested in the words of R α l (u 1 , . . . , u k ) such that for each i ∈ [k] verifying α i < l, the word u i indeed appears as a factor in the right place. We thus introduce a last language S α l (u 1 , . . . , u k ) defined as follows.

For all u 1 , . . . ,

We now have the normal form we were looking for to prove Theorem 1: [u 1 , . . . , u k ] l is equal to the union, over all α ∈ [l] k , of the intersection of R α l (u 1 , . . . , u k ) and S α l (u 1 , . . . , u k ). Though rather intuitive, the correctness of this decomposition is not so straightforward to prove and, actually, we can only prove it when for each i ∈ [k], the letters in u i are all distinct. Lemma 3. Let Σ be an alphabet, l ∈ N >0 and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ) such that for each i ∈ [k], the letters in u i are all distinct. Then, q1,...,q k ∈{1,l}

Our goal now is to prove, given an alphabet Σ, given l ∈ N >0 and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ) such that for each i ∈ [k], the letters in u i are all distinct, that for any α ∈ [l] k , the language R α l (u 1 , . . . , u k ) ∩ S α l (u 1 , . . . , u k ) is in P(J); closure of P(J) under union (Proposition 1) consequently entails that [u 1 , . . . , u k ] l ∈ P(J). The way R α l (u 1 , . . . , u k ) and S α l (u 1 , . . . , u k ) are defined allows us to reason as follows. For each i ∈ [k] verifying α i < l, let L i be the language of words w over Σ containing x i,1 u i αi x i,2 as a subword but not x i,1 u i αi+1 x i,2 and such that w = y 1 u i y 2 with y 1 ∈ x i,1 Σ * and

,αi<l L i does belong to P(J) by closure of P(J) under intersection, Proposition 1. The lemma that follows, the main lemma in the proof of Theorem 1, exactly shows this. The proof crucially uses the "feedback sweeping" technique, but note that we actually don't know how to prove it when we do not enforce that for each i ∈ [k], the letters in u i are all distinct.

Lemma 4. Let Σ be an alphabet and u ∈ Σ + such that its letters are all distinct. For all α ∈ N >0 and x 1 , x 2 ∈ Σ * , we have

Proof (Sketch). Let Σ be an alphabet and u ∈ Σ + such that its letters are all distinct. Let α ∈ N >0 and x 1 , x 2 ∈ Σ * . We let

If |u| = 1, the lemma follows trivially because L is piecewise testable and hence belongs to L(J), so we assume |u| > 1.

For each letter a ∈ Σ, we shall use 2 |u| − 1 distinct decorated letters of the form a (i) for some i ∈ [[0, 2 |u| − 2]], using the convention that a (0) = a; of course, for two distinct letters a, b ∈ Σ, we have that a (i) and b (j) are distinct for all i, j ∈ [[0, 2 |u| − 2]]. We denote by A the alphabet of these decorated letters. The main idea of the proof is, for a given input length n ∈ N, to build an A-program Ψ n over Σ n such that, given an input word w ∈ Σ n , it first ouputs the |u| − 1 first letters of w and then, for each i going from |u| to n, outputs w i , followed by w (1) i−1 · · · w (|u|−1) i−|u|+1 (a "sweep" of |u| − 1 letters backwards down to position i − |u| + 1, decorating the letters incrementally) and finally by w (|u|) i−|u|+2 · · · w (2|u|−2) i (a "sweep" forwards up to position i, continuing the incremental decoration of the letters). The idea behind this way of rearranging and decorating letters is that, given an input word w ∈ Σ n , as long as we make sure that w and thus Ψ n (w) do contain x 1 u α x 2 as a subword but not x 1 u α+1 x 2 , then Ψ n (w) can be decomposed as Ψ n (w) = y 1 zy 2 where y 1 ∈ x 1 Σ * , y 2 ∈ x 2 Σ * , and |y 1 | , |y 2 | are minimal, with z containing u β u (1) |u|−1 · · · u (|u|−1) 1 u (|u|) 2 · · · u (2|u|−2) |u| u α−β as a subword for some β ∈ [α] if and only if w ∈ (x 1 Σ * )u(x 2 Σ * ). This means we can check whether w ∈ L by testing whether w belongs to some fixed piecewise testable language over A.

As explained before stating the previous lemma, we can now use it to prove the result we were aiming for. Proposition 6. Let Σ be an alphabet, l ∈ N >0 and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ) such that for each i ∈ [k], the letters in u i are all distinct. For all α ∈ [l] k , we have R α l (u 1 , . . . , u k ) ∩ S α l (u 1 , . . . , u k ) ∈ P(J). We thus derive the awaited corollary. However, what we really want to obtain is that [u 1 , . . . , u k ] l ∈ P(J) without putting any restriction on the u i 's. But, in fact, to remove the constraint that the letters must be all distinct in each of the u i 's, we simply have to decorate each of the input letters with its position minus 1 modulo a big enough d ∈ N >0 . This finally leads to the following proposition.

Proposition 7. Let Σ be an alphabet, l ∈ N >0 and u 1 , . . . , u k ∈ Σ + (k ∈ N >0 ). Then [u 1 , . . . , u k ] l ∈ P(J).

This finishes to prove Theorem 1 by closure of P(J) under Boolean combinations (Proposition 1) and by the discussion at the beginning of Subsect. 4.1.

Although P(J) is very small compared to AC 0 , we have shown that programs over monoids in J are an interesting subject of study in that they allow to do quite unexpected things. The "feedback-sweeping" technique allows one to detect presence of a factor thanks to such programs as long as this factor does not appear too often as a subword: this is the basic principle behind threshold dot-depth one languages, that our article shows to belong wholly to P(J).

Whether threshold dot-depth one languages with additional positional modular counting do correspond exactly to the languages in L(QDA) ∩ L(Q(J * D)) seems to be a challenging question, that we leave open. In his Ph.D. thesis [7] , the author proved that all strongly unambiguous monomials (the basic building blocks in L(DA)) that are imposed to belong to L(J * D) at the same time are in fact threshold dot-depth one languages. However, the proof looks much too complex and technical to be extended to, say, all languages in L(DA) ∩ L(J * D). New techniques are probably needed, and we might conclude by saying that proving (or disproving) this conjecture could be a nice research goal in algebraic automata theory.

Σ 1 1 -formulae on finite structures

Bounded-width polynomial-size branching programs recognize exactly those languages in NC 1

Finite monoids and the fine structure of NC 1

Automata, Languages, and Machines

Automata, Languages, and Machines

Parity, circuits, and the polynomial-time hierarchy

The limits of Nečiporuk's method and the power of programs over monoids taken from small varieties of finite monoids

The power of programs over monoids in DA

Hierarchies of piecewise testable languages

An algebraic point of view on the Crane Beach property

Programs over semigroups of dot-depth one

NC 1 : the automata-theoretic viewpoint

Classes de circuits booléens et variétés de monoïdes

Finite semigroup varieties defined by programs

The dot-depth hierarchy, 45 years later

Varieties of Formal Languages

Some results on C-varieties

Piecewise testable events

Finite semigroup varieties of the form V * D

When can one finite monoid simulate another?

Languages defined with modular counting quantifiers

Computational complexity questions related to finite monoids and semigroups

The computing power of programs over finite monoids

Acknowledgements. The author thanks the anonymous referees for their helpful comments and suggestions.