key: cord-0060375-5sp1e7cc
authors: Piedeleu, Robin; Zanasi, Fabio
title: A String Diagrammatic Axiomatisation of Finite-State Automata
date: 2021-03-23
journal: Foundations of Software Science and Computation Structures
DOI: 10.1007/978-3-030-71995-1_24
sha: 1870fa0f322434fc989d77dd97868f20e8bfc49c
doc_id: 60375
cord_uid: 5sp1e7cc

We develop a fully diagrammatic approach to finite-state automata, based on reinterpreting their usual state-transition graphical representation as a two-dimensional syntax of string diagrams. In this setting, we are able to provide a complete equational theory for language equivalence, with two notable features. First, the proposed axiomatisation is finite, a result which is provably impossible for the one-dimensional syntax of regular expressions. Second, the Kleene star is a derived concept, as it can be decomposed into more primitive algebraic blocks.

Finite-state automata are one of the most studied structures in theoretical computer science, with an illustrious history and roots reaching far beyond, in the work of biologists, psychologists, engineers and mathematicians. Kleene [25] introduced regular expressions to give finite-state automata an algebraic presentation, motivated by the study of (biological) neural networks [31]. They are the terms freely generated by the following grammar:

e, f ::= e + f | e f | e* | 0 | 1 | a ∈ A    (1)

Equational properties of regular expressions were studied by Conway [14] who introduced the term Kleene algebra: this is an idempotent semiring with an operation (−) * for iteration, called the (Kleene) star. The equational theory of Kleene algebra is now well-understood, and multiple complete axiomatisations, both for language and relational models, have been given. Crucially, Kleene algebra is not finitely-based: no finite equational theory can appropriately capture the behaviour of the star [35] . Instead, there are purely equational infinitary axiomatisations [28, 4] and Kozen's finitary implicational theory [26] . Since then, much research has been devoted to extending Kleene algebra with operations capturing richer patterns of behaviour, useful in program verification. Examples include conditional branching (Kleene algebra with tests [27] , and its recent guarded version [37] ), concurrent computation (CKA [19, 23] ), and specification of message-passing behaviour in networks (NetKAT [1] ).

The meta-theory of the formalisms above essentially rests on the same three ingredients: (1) given an operational model (e.g., finite-state automata), (2) devise a syntax (regular expressions) that is sufficiently expressive to capture the class of behaviours of the operational model (regular languages), and (3) find a complete axiomatisation (Kleene algebra) for the given semantics.

In this paper, we open up a direct path from (1) to (3). Instead of thinking of automata as a combinatorial model, we formalise them as a bona fide (two-dimensional) syntax, using the well-established mathematical theory of string diagrams and monoidal categories [36]. This approach lets us axiomatise the behaviour of automata directly, freeing us from the necessity of compressing them down to a one-dimensional notation like regular expressions.

This perspective not only sheds new light on a venerable topic, but has significant consequences. First, as our most important contribution, we are able to provide a finite and purely equational axiomatisation of finite-state automata, up to language equivalence. Intriguingly, this does not contradict the impossibility of finding a finite basis for Kleene algebra, as the algebraic setting is different: our result gives a finite presentation as a symmetric monoidal category, while the impossibility result rules out any such presentation as an algebraic theory (in the standard sense). In other words, there is no finite axiomatisation based on terms (tree-like structures), but we demonstrate that there is one based on string diagrams (graph-like structures).

Secondly, embracing the two-dimensional nature of automata guarantees a strong form of compositionality that the one-dimensional syntax of regular expressions does not have. In the string diagrammatic setting, automata may have multiple inputs and outputs and, as a result, can be decomposed into subcomponents that retain a meaningful interpretation. For example, if we split the automaton below left, the resulting components are still valid string diagrams within our syntax, below right:

In line with the compositional approach, it is significant that the Kleene star can be decomposed into more elementary building blocks (which come together to form a feedback loop):

This opens up interesting possibilities when studying extensions of Kleene algebra within the same approach; we elaborate on this in Section 6. Finally, we believe our proof of completeness is of independent interest, as it relies on a fully diagrammatic reformulation of Brzozowski's minimisation algorithm [12]. In the string diagrammatic setting, the symmetries of the equational theory give this procedure a particularly elegant and simple form. Because all of the axioms involved in the determinisation procedure come with a dual, a codeterminisation procedure can be defined immediately by simply reversing the former. This reduces the proof of completeness to a proof that determinisation can be performed diagrammatically.

We should also note that this is not the first time that automata and regular languages are recast into a categorical mould. The iteration theories [5] of Bloom and Ésik, sharing graphs [17] of Hasegawa or network algebras [39] of Stefanescu are all categorical frameworks designed to reason about iteration or recursion, that have found fruitful applications in this domain. They are based on a notion of parameterised fixed-point which defines a categorical trace in the sense of [22] . While our proposal bears resemblance to (and is inspired by) this prior work, it goes beyond in one fundamental aspect: it is the first to give a finite complete axiomatisation of automata up to language equivalence.

A second difference is methodological: our syntax (4) does not feature any primitive for iteration or recursion. In particular, the star is a derived concept, in the sense that it is decomposable into more elementary operations (3). Categorically, our starting point is a compact-closed rather than traced category.

We elaborate on the relation between ours and existing work in Section 6. Omitted proofs can be found in [33] .

Syntax. We fix an alphabet Σ of letters a ∈ Σ. We call Aut Σ the symmetric strict monoidal category freely generated by the following objects and morphisms:

- three generating objects ('action'), ('right') and ('left') with their identity morphisms depicted respectively as , and .
- the following generating morphisms, depicted as string diagrams [36]:

Freely generating Aut Σ from these data (usually called a symmetric monoidal theory [42, 11] ) means that morphisms of Aut Σ will be the string diagrams obtained by pasting together (by sequential composition and monoidal product in Aut Σ ) the basic components in (4), and then quotienting by the laws of symmetric monoidal categories. For instance, (3) is a morphism of Aut Σ of type → , and is one of type → .

Semantics. We first define the semantics for string diagrams simply as a function, and then discuss how to extend it to a functor from Aut Σ to another category. Our interpretation maps generating morphisms to relations between regular expressions and languages over Σ:

In (5), the semantics ⟦e⟧_R ∈ 2^{A*} of a regular expression e ∈ RegExp is defined inductively on e (see (1)), in the standard way:

where e^{n+1} := e e^n and e^0 := 1. The semantics highlights the different roles played by red and black generators. In a nutshell, red generators stand for regular expressions (the sum, 0, the product, 1, the Kleene star, and the letters a of Σ), and black generators for operations on the set of languages (copy, delete, and feeding outputs back into inputs, in a way made more precise later). These two perspectives, which are usually merged, are kept distinct in our approach and only allowed to communicate via the action generator, which represents the product action of regular expressions (the red wire) on languages via concatenation on the right.
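The inductive definition can be sketched in Python. This is only an illustration under assumptions that are ours, not the paper's: expressions are encoded as nested tuples, and every language is truncated to words of length at most `max_len` so that the star becomes a finite union. The names `lang` and `concat` are hypothetical.

```python
from itertools import product

# Regular expressions as nested tuples (a hypothetical encoding):
#   ("0",), ("1",), ("sym", "a"), ("+", e, f), (".", e, f), ("*", e)

def concat(L, K, max_len):
    """Concatenation of two finite languages, truncated to length <= max_len."""
    return {u + v for u, v in product(L, K) if len(u + v) <= max_len}

def lang(e, max_len):
    """Words of length <= max_len in the language denoted by e."""
    tag = e[0]
    if tag == "0":
        return set()
    if tag == "1":
        return {""}
    if tag == "sym":
        return {e[1]} if max_len >= 1 else set()
    if tag == "+":
        return lang(e[1], max_len) | lang(e[2], max_len)
    if tag == ".":
        return concat(lang(e[1], max_len), lang(e[2], max_len), max_len)
    if tag == "*":
        base = lang(e[1], max_len)
        result = {""}      # e^0 := 1
        frontier = {""}
        while frontier:    # add e^(n+1) := e e^n until no new word fits
            frontier = concat(base, frontier, max_len) - result
            result |= frontier
        return result
    raise ValueError(f"unknown constructor: {tag}")

# The expression a* b, truncated to words of length at most 3.
a_star_b = (".", ("*", ("sym", "a")), ("sym", "b"))
print(sorted(lang(a_star_b, 3), key=lambda w: (len(w), w)))
```

The truncation is what makes the star computable here; the actual semantics is the infinite union over all n.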

In order for this mapping to be functorial from Aut Σ , we now introduce a suitable target semantic category. Interestingly, this will not be the category Rel of sets and relations: indeed, the identity morphisms and are not interpreted as identities of Rel. Instead, the semantic domain will be the category Prof B of Boolean(-enriched) profunctors [15] (also called in the literature relational profunctors [20] or weakening relations [32]).

Definition 1. Given two preorders (X, ≤_X) and (Y, ≤_Y), a Boolean profunctor R : X → Y is a relation R ⊆ X × Y such that if (x, y) ∈ R and x′ ≤_X x, y ≤_Y y′ then (x′, y′) ∈ R.

Preorders and Boolean profunctors form a symmetric monoidal category Prof B with composition given by relational composition. The identity for an object (X, ≤ X ) is the order relation ≤ X itself. The monoidal product is the usual product of preorders.
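To make the definition concrete, here is a small Python sketch on a finite preorder. It is an illustration only: the function names and the example preorder (subsets of {0, 1} under inclusion) are our own choices. It checks the closure condition of Definition 1 and shows that the order relation, rather than equality, acts as the identity for composition.

```python
from itertools import product

def is_profunctor(R, X, leq_X, Y, leq_Y):
    """Check: (x, y) in R, x2 <= x and y <= y2 imply (x2, y2) in R."""
    return all(
        (x2, y2) in R
        for (x, y) in R
        for x2 in X if leq_X(x2, x)
        for y2 in Y if leq_Y(y, y2)
    )

def compose(R, S):
    """Relational composition: first R, then S."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def order_relation(X, leq):
    """The identity profunctor on (X, leq) is the order relation itself."""
    return {(x, y) for x, y in product(X, X) if leq(x, y)}

# Example preorder: subsets of {0, 1} ordered by inclusion.
X = [frozenset(s) for s in ([], [0], [1], [0, 1])]
subset = lambda s, t: s <= t

identity = order_relation(X, subset)
# R relates s to t whenever s with 0 added is included in t.
R = {(s, t) for s, t in product(X, X) if s | {0} <= t}
# The equality relation, i.e. the identity of Rel on X.
diagonal = {(s, s) for s in X}

print(is_profunctor(R, X, subset, X, subset))         # R satisfies Definition 1
print(compose(identity, R) == R == compose(R, identity))
print(is_profunctor(diagonal, X, subset, X, subset))  # equality does not
```

In this example the diagonal fails the closure condition, which is one way to see why identities of Rel cannot serve as identities here.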

The rich features of our diagrammatic language are reflected in the profunctor interpretation. Indeed, the order relation is built into the wires and . The two possible directions represent the identities on the ordered set of languages and the same set with the reversed order, respectively. The additional red wire represents the set RegExp of regular expressions, with equality as the associated order relation. It is clear that all monochromatic generators satisfy the condition of Definition 1. Similarly, the action generator is a Boolean profunctor: if ((e, L), K) is such that L⟦e⟧_R ⊆ K, and L′ ⊆ L, K ⊆ K′, then we have L′⟦e⟧_R ⊆ L⟦e⟧_R ⊆ K ⊆ K′ by monotonicity of the product of languages. We can conclude the following.

Proposition 1. ⟦·⟧ defines a symmetric monoidal functor of type Aut Σ → Prof B .

In particular, because Aut Σ is free, we can unambiguously assign meaning to any composite diagram from the semantics of its components using composition and the monoidal product in Prof B :

Example 1. We include here a worked out example to show how to compute the behaviour of a composite diagram which, as we will see, represents the action by concatenation of the regular language a * . We assign variable names to each wire: O to the top wire of the feedback loop, N to the output wire of the action node, and M to the middle wire joining to so that we can compute:

In Figure 1 we introduce = KDA , the (finite) equational theory of Kleene Diagram Algebra, on Aut Σ . It will be later shown to be complete for the given semantics. We explain some salient features of = KDA below.

- (A1)-(A2) relate and , allowing us to bend and straighten wires at will. This makes the full subcategory of Aut Σ on and , modulo (A1)-(A2), compact closed [24]. (A3) allows us to eliminate isolated loops. Note that the whole category is not compact closed because has no dual.
- The laws of the C block mimic the usual definition of the action of a semiring on a set, except for (C5), which is novel and captures the interaction with the Kleene star. Here lies a distinctive feature of our theory: the behaviour of the star is derived from its decomposition as the feedback loop on the right of (C5).
- The D block forces the action to be a comonoid ((D1)-(D2)) and monoid ((D3)-(D4)) homomorphism.
- The E block axiomatises the purely red fragment. Remarkably, these axioms do not describe any of the actual Kleene algebra structure: they just state that and form a commutative comonoid ((E1)-(E3)) and that all other red generators are comonoid homomorphisms ((E4)-(E15)). This means that the red fragment is actually the free (cartesian) algebraic theory (cf. [42, 11]) on generators , , , , , a (a ∈ Σ), where the remaining generators and act as copy and discard of variables.

Let = KDA be the smallest equational theory containing all equations in Fig. 1. Their soundness for the chosen semantics is not difficult to show and, for space reasons, we omit the proof. We now state our completeness result, whose proof will be discussed in Section 5.

Remark 1. In the usual approach to the theory of regular languages (e.g. [26] ), a completeness result like Theorem 1 is typically proven by first defining a class of models for the algebraic theory, and showing that the standard semantics constitutes the initial/free model. Our proof is different in flavour, but equivalent: taking advantage of the categorical formulation of our diagrammatic syntax and its semantics, we construct an equivalence of categories between our model and the diagrams quotiented by the equations of KDA.

Remark 2. Some axiomatisations of Kleene algebra use a partial order between terms, which can be defined from the idempotent monoid structure: f ≤ e iff e + f = e. At the semantic level, it corresponds to inclusion of languages. Similarly, using the idempotent bimonoid structure of our equational theory, we can define a partial order on → diagrams: f ≤ e iff e f = e . This partial order structure can also be extended to all morphisms n → m by using the vertical composition of n copies of and m copies of instead.

Remark 3. There are no specific equations relating the atomic actions a (a ∈ Σ). This is because, as we study automata, we are interested in the free monoid Σ * over Σ. However, nothing would prevent us from modelling other structures. Free commutative monoids (powers of N), whose rational subsets correspond to semilinear sets [14, Chapter 11] would be of particular interest.

A major appeal of our approach is that both regular expressions and automata can be uniformly represented in the graphical language of string diagrams, and the translation of one into the other becomes an equational derivation in = KDA . In fact, we will see there is a close resemblance between automata and the shape of the string diagrams interpreting them, the main difference being that string diagrams are composable structures.

In this section we describe how regular expressions (resp. automata) can be encoded as string diagrams, such that their semantics corresponds in a precise way to the languages that they describe (resp. recognise).

In a sense, regular expressions are already part of the graphical syntax, as the red generators: for any regular expression e, one may always construct a 'red' string diagram e : 0 → such that e = {(•, e)}. However, these alone are meaningless, since their image under the semantics is simply the free term algebra RegExp (see (7)) . They acquire meaning as they act on the set of languages over Σ, represented by the black wire.

To define these encodings, it is convenient to introduce the following syntactic sugar. We will write e for the composite of e with the action, as defined below left, with the particular case of a letter a ∈ Σ on the right: 

Using this action, we can inductively define an encoding − of regular expressions into string diagrams of Aut Σ , as the rightmost diagram for each expression below: 

As expected, the translation preserves the language interpretation of regular expressions in a sense that the following proposition makes precise.

For any regular expression e, e = {(L, K) | e R L ⊆ K}.

Example (8) suggests that the string diagram e corresponding to a regular expression e looks a lot like a nondeterministic finite-state automaton (NFA) for e. In fact, the translation − can be seen as the diagrammatic counterpart of Thompson's construction [40] that builds an NFA from a regular expression. We can generalise the encoding of regular expressions and translate NFA directly into string diagrams, in at least two ways. The first is to encode an NFA as the diagrammatic counterpart of its transition relation. The second is to translate directly its graph representation into the diagrammatic syntax.

Encoding the transition relation. This is a simple variant of the translation of matrices over semirings that has appeared in several places in the literature [29, 42] .

Let A be an NFA with set of states Q, initial state q_0 ∈ Q, accepting states F ⊆ Q and transition relation δ ⊆ Q × Σ × Q. We can represent δ as a string diagram d with |Q| incoming wires on the left and |Q| outgoing wires on the right. The left jth port of d is connected to the ith port on the right through an a whenever (q_i, a, q_j) ∈ δ. To accommodate nondeterminism, when the same two ports are connected by several different letters of Σ, we join these using and . When (q_i, ε, q_j) ∈ δ, the two ports are simply connected via a plain identity wire. If there is no a such that (q_i, a, q_j) ∈ δ, the two corresponding ports are disconnected. For example, the transition relation of an NFA with three states and δ = {(q_0, a, q_1), (q_1, b, q_2), (q_2, a, q_1), (q_2, a, q_2)} (disregarding the initial and accepting states for the moment) is depicted on the right. Conversely, given such a diagram, we can recover δ by collecting Σ-weighted paths from left to right ports.

To deal with the initial state, we add an additional incoming wire connected to the right port corresponding to the initial state of the automaton. Similarly, for accepting states we add an additional outgoing wire, connected to the left ports corresponding to each accepting state, via if there is more than one. Finally, we trace out the |Q| wires of the diagrammatic transition relation to obtain the associated string diagram. In other words, for an NFA with initial state q_0, set of accepting states F and transition relation δ, we obtain the string diagram on the right, where d is the diagrammatic counterpart of δ as defined above, e_0 is the injection of a single wire as the first amongst |Q| wires, and f deletes all wires that are not associated to states in F with , and applies to merge them into a single outgoing wire.

For example, if A with δ as above has initial state q_0 and set of accepting states {q_2}, we get the diagram below left; instead, if all states are accepting, we obtain the diagram below right:

The correctness of this simple translation is justified by a semantic correspondence between the language recognised by a given NFA A and the denotation of the corresponding string diagram.

Proposition 3. Given an NFA A which recognises the language L, let d_A be its associated string diagram, constructed as above. Then ⟦d_A⟧ = {(K, K′) | LK ⊆ K′}.
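On the automaton side, the data being encoded can be sketched as follows. This is an illustrative Python fragment, not part of the construction itself: it uses the three-state transition relation from the example above, assumes there are no ε-transitions, and the function names are hypothetical. `letter_matrix` gives, for each letter, the Boolean matrix that the diagrammatic transition relation represents, and `accepts` runs the NFA on a word.

```python
def accepts(delta, initial, accepting, word):
    """Run an NFA given as a set of (source, letter, target) triples.

    Assumes an epsilon-free transition relation and a single initial state.
    """
    current = {initial}
    for letter in word:
        current = {q for (p, a, q) in delta if p in current and a == letter}
    return bool(current & accepting)

def letter_matrix(delta, states, letter):
    """Boolean |Q| x |Q| matrix of the transitions labelled by `letter`."""
    return [[(p, letter, q) in delta for q in states] for p in states]

states = ["q0", "q1", "q2"]
delta = {
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
    ("q2", "a", "q1"),
    ("q2", "a", "q2"),
}

print(accepts(delta, "q0", {"q2"}, "ab"))
print(letter_matrix(delta, states, "a"))
```

With initial state q0 and accepting state q2, the word ab is accepted, while a on its own is not.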

From graphs to string diagrams. The second way of translating automata into string diagrams mimics more directly the combinatorial representation of automata. The idea (which should be sufficiently intuitive to not need to be made formal here) is, for each state, to use to represent incoming edges, and to represent outgoing edges. As above, labels a ∈ Σ will be modelled using a . For example, the graph and the associated string diagram corresponding to the NFA above are

Note the initial state of the automaton corresponds to the left interface of the string diagram, and the accepting state to the right interface. As before, when there are multiple accepting states, they all connect to a single right interface, via . For example, if we make all states accepting in the automaton above, we get the following diagrammatic representation: 

The previous discussion shows how NFAs can be seen as string diagrams of type → . The converse is also true: we now show how to extract an automaton from any string diagram d : → , such that the language the automaton recognises matches the denotation of d.

In order to phrase this correspondence formally, we need to introduce some terminology. We call left-to-right those string diagrams whose domain and codomain contain only , i.e. their type is of the form n → m . The idea is that, in any such string diagram, the n left interfaces act as inputs of the computation, and the m right interfaces act as outputs. For instance, (9) is a left-to-right diagram → .

A string diagram d is atomic if the only red generators occurring in d are of the form a . By unfolding all red components e in any left-to-right diagram, using axioms (C1)-(C5), we can prove the following statement.

Any left-to-right diagram is = KDA -equivalent to an atomic one.

For instance, the string diagram on the left of (8) is = KDA -equivalent to the atomic one on the right.

We call block of a certain subset of generators a vertical composite of these generators followed by some permutations of the wires.

Definition 2. A matrix-diagram (resp. generalised matrix-diagram) is a left-to-right diagram that factors as a block of , , followed by a block of a for a ∈ Σ (resp. e for e ∈ RegExp) and finally, a block of , .

To each matrix-diagram d we can associate a unique transition relation δ by gathering paths from each input to each output: (q_i, a, q_j) ∈ δ if there is an a joining the ith input to the jth output. A transition relation is ε-free if it does not contain the empty word. It is deterministic if it is ε-free and, for each i and each a ∈ Σ, there is at most one j such that (q_i, a, q_j) ∈ δ. We will apply these terms to matrix-diagrams and the associated transition relation interchangeably. The example of Section 4.2 above, with the three blocks highlighted, is a matrix-diagram. It is ε-free but not deterministic since there are two a-labelled transitions starting from the third input.

Given a matrix-diagram d : l+n → p+m , we will write d_ij, with i = l, n and j = p, m, for the subdiagrams corresponding to the appropriate submatrices. For example, given the string diagram below on the left, the one on the right is a representation for it, whose highlighted matrix-diagram is the same as above.
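The two properties of transition relations just defined are easy to check mechanically. The following Python sketch is illustrative only; it assumes a transition relation given as a set of (source, label, target) triples and encodes the empty word as the empty string, both of which are our own conventions.

```python
EPSILON = ""  # hypothetical encoding of the empty word as a label

def is_epsilon_free(delta):
    """True if no transition is labelled by the empty word."""
    return all(label != EPSILON for (_, label, _) in delta)

def is_deterministic(delta):
    """True if delta is epsilon-free and each (state, letter) pair
    has at most one target."""
    if not is_epsilon_free(delta):
        return False
    seen = set()
    for (source, label, _) in delta:
        if (source, label) in seen:
            return False
        seen.add((source, label))
    return True

# The three-state example: two a-labelled transitions leave q2.
delta = {
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
    ("q2", "a", "q1"),
    ("q2", "a", "q2"),
}
print(is_epsilon_free(delta), is_deterministic(delta))
```

Because `delta` is a set of triples, two entries sharing a source and a label necessarily have different targets, which is exactly the nondeterminism being detected.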

We will refer to the associated matrix-diagram d̂ as the transition matrix of a given representation. From a → diagram with representation d̂ : l+1 → l+1 we can construct an NFA from its transition matrix d̂ as follows:

- its state set is Q = {q_1, . . . , q_l}, i.e., there is one state for each wire of d̂_ll;
- its transition relation is built from d̂_ll as described above;
- its initial states Q_0 are those q_i for which there exists an index j such that the ijth coefficient of d̂_1l is non-zero (and therefore );
- its final states F are those q_j for which there exists an index i such that the ijth coefficient of d̂_l1 is non-zero (and therefore ).

The construction above is the inverse of that of Section 4.2. The link between the constructed automaton and the original string diagram is summarised in the following statement, which is a straightforward corollary of Proposition 3.

For a diagram d : → with a representation d̂, let A_d̂ be the associated automaton, constructed as above. Then L is the language recognised by A_d̂ iff ⟦d⟧ = {(K, K′) | LK ⊆ K′}.

The next proposition states that a representation can be extracted from any string diagram.

We established a correspondence between → diagrams and automata. What about arbitrary left-to-right diagrams n → m ? To characterise the precise relationship between our syntax and regular expressions we can prove a Kleene theorem for Aut Σ . Recall, from Definition 2 that a generalised matrix-diagram is the diagrammatic counterpart of a matrix whose coefficients are regular expressions. It turns out that every left-to-right diagram can be put in this form.

Any left-to-right diagram is equal to a generalised matrix diagram.

As a result, the semantics of a given n → m diagram is fully characterised by an m × n array of regular languages.

It is worth pointing out how a simple modification of the diagrammatic syntax takes us one notch up the Chomsky hierarchy, leaving the realm of regular languages for that of context-free grammars and languages. Our syntax allows us to specify systems of language equations of the form aX ⊆ Y. In this context, feedback loops can be interpreted as fixed-points. For example, the automaton below left, and its corresponding string diagram, below right, translate to the system of equations at the center:

This translation can be obtained by simply labelling each state with a variable and adding one inequality of the form X_i a ⊆ X_j for each a-transition from state i to state j. The system we obtain corresponds very closely to the ⟦−⟧-semantics of the associated string diagram. The distinction between red and black wires can be understood as a type discipline that only allows linear uses of the product of languages. It is legitimate and enlightening to ask what would happen if we forgot about red wires and interpreted the action directly as the product. We would replace the action by a new generator with semantics {((M, L), K) | ML ⊆ K}. This would allow us to specify systems of language equations with unrestricted uses of the product on the left of inclusions, e.g. UVW ⊆ X. Equations of this form are similar to the production rules (e.g. X → UVW) of context-free grammars and it is well-known that the least solutions of this class of systems are precisely context-free languages [14, Chapter 10].

For example, we could encode the language X → XX | (X) | ε of properly matched parentheses as the least solution of the system ε ⊆ X, (X) ⊆ X, XX ⊆ X, which gives the diagram displayed on the right.
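The least solution of such a system can be approximated by fixed-point iteration. The Python sketch below is an illustration under an assumption of ours: it only tracks words of length at most `max_len`, so it computes a finite truncation of the language of properly matched parentheses rather than the language itself. The function name is hypothetical.

```python
def dyck_up_to(max_len):
    """Least X (truncated to length <= max_len) satisfying
    {""} ⊆ X, (X) ⊆ X and XX ⊆ X, computed by iteration from the empty set."""
    X = set()
    while True:
        step = {""}                                           # ε ⊆ X
        step |= {"(" + w + ")" for w in X
                 if len(w) + 2 <= max_len}                    # (X) ⊆ X
        step |= {u + v for u in X for v in X
                 if len(u) + len(v) <= max_len}               # XX ⊆ X
        if step <= X:       # nothing new: X is closed under all three rules
            return X
        X |= step

print(sorted(dyck_up_to(4), key=lambda w: (len(w), w)))
```

Each pass applies all three inclusions once; the loop stops as soon as a pass adds no new word, which must happen because only finitely many words fit under the length bound.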

This section is devoted to proving our completeness result, Theorem 1. We use a normal form argument: more specifically we mimic automata-theoretic results to rewrite every string diagram to a normal form corresponding to a minimal deterministic finite automaton (DFA). We achieve it by implementing Brzozowski's algorithm [12] through diagrammatic equational reasoning. The proof proceeds in three distinct steps.

1. We first show (Section 5.1) how to determinise (the representation of) a diagram: this step consists in eliminating all subdiagrams that correspond to nondeterministic transitions in the associated automaton.
2. We use the previous step to implement a minimisation procedure (Section 5.2) from which we obtain a minimal representation for a given diagram: this is a representation whose associated automaton is minimal, that is, has the fewest states amongst DFAs that recognise the same language. To do this, we show how the four steps of Brzozowski's minimisation algorithm (reverse; determinise; reverse; determinise) translate into diagrammatic equational reasoning. Note that the first three steps taken together simply amount to applying in reverse the determinisation procedure we have already devised. That this is possible will be a consequence of the symmetry of = KDA .
3. Finally, from the uniqueness of minimal DFAs, any two diagrams that have the same denotation are both equal to the same minimal representation and we can derive completeness of = KDA .
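For reference, the classical automata-theoretic version of Brzozowski's algorithm, which the steps above reformulate diagrammatically, can be sketched in Python as follows. This is the textbook procedure on automata, not the diagrammatic rewriting itself. The encoding of an NFA as a triple (initial states, accepting states, transitions) and all function names are assumptions made for the example, and the subset construction is partial: the empty subset is dropped rather than kept as a sink state.

```python
def reverse(nfa):
    """Swap initial and accepting states and flip every transition."""
    initial, accepting, delta = nfa
    return set(accepting), set(initial), {(q, a, p) for (p, a, q) in delta}

def determinise(nfa):
    """Subset construction restricted to reachable, non-empty subsets."""
    initial, accepting, delta = nfa
    start = frozenset(initial)
    alphabet = {a for (_, a, _) in delta}
    seen, todo, new_delta = {start}, [start], set()
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(q for (p, b, q) in delta if p in subset and b == a)
            if not target:
                continue  # partial DFA: no sink state is introduced
            new_delta.add((subset, a, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    new_accepting = {s for s in seen if s & frozenset(accepting)}
    return {start}, new_accepting, new_delta

def brzozowski(nfa):
    """reverse; determinise; reverse; determinise."""
    return determinise(reverse(determinise(reverse(nfa))))

def accepts(nfa, word):
    initial, accepting, delta = nfa
    current = set(initial)
    for letter in word:
        current = {q for (p, a, q) in delta if p in current and a == letter}
    return bool(current & set(accepting))

def states_of(nfa):
    initial, accepting, delta = nfa
    return (set(initial) | set(accepting)
            | {p for (p, _, _) in delta} | {q for (_, _, q) in delta})

# The three-state example with initial state q0 and accepting state q2.
nfa = ({"q0"}, {"q2"},
       {("q0", "a", "q1"), ("q1", "b", "q2"), ("q2", "a", "q1"), ("q2", "a", "q2")})
minimal = brzozowski(nfa)
print(len(states_of(minimal)))
```

The states of the result are sets of sets of original states; only their number and the transitions between them matter.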

We will now write equations in = KDA simply as = to simplify notation and say that diagrams c and d are equal when c = KDA d.

First, we use the symmetries of the equational theory to make simplifying assumptions about the diagrams to consider in the completeness proof.

A few simplifying assumptions. Without loss of generality, the proof we give is restricted to string diagrams with no in their domain as well as in their codomain. This is simply a matter of convenience: the same proof would work for more general diagrams, that may contain in their (co)domain, at the cost of significantly cluttering diagrams. Henceforth, one can simply think of the labels for the action x as uniquely identifying one open red wire in a diagram. With this convention, two or more occurrences of the same x in a diagram can be seen as connected to the same red wire on the left, via . That we can safely do so is a consequence of the completeness of = KDA restricted to the monochromatic red fragment, itself a consequence of [11, Theorem 6.1] .

Arbitrary objects in Aut Σ are lists of the three generating objects. We have already motivated focusing on string diagrams with no open red wires so that the objects we care about are lists of and . The following proposition implies that, without loss of generality, for the proof of completeness we can restrict further to left-to-right diagrams (Section 4.2).

Proposition 8. There is a natural bijection between sets of string diagrams of the form

where A i , B i represent lists of and .

Proposition 8 tells us that we can always bend the incoming wires to the left and outgoing wires to the right before applying some equations, and recover the original orientation of the wires by bending them into their original place later.

In diagrammatic terms, a nondeterministic transition of the automaton associated to (a representation of) a given diagram, corresponds to a subdiagram of the form a a for some a ∈ Σ. Clearly, using the definition of a := a in (6) and the axiom

, which will prove to be the engine of our determinisation procedure, along with the fact that any red expression can be copied and deleted. The next two theorems generalise the ability to copy and delete to arbitrary left-to-right diagrams.

For d : m → n , let d_ij be the string diagram of type → obtained by composing every input with except the ith one, and every output with except the jth one. Theorem 2 implies that string diagrams are fully characterised by their → subdiagrams.

Corollary 1. Given d, e : m → n , d = KDA e iff d_ij = KDA e_ij, for all 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Thus, we can restrict our focus further to left-to-right → diagrams, without loss of generality. We are now able to devise a determinisation procedure for representations of diagrams, which we illustrate below on a simple example.

Proposition 9. Any diagram → has a deterministic representation.

Dealing with useless states. Notice that our deterministic form is partial and that the determinisation procedure disregards useless states, i.e., parts of a string diagram that do not reach an output wire. None of these contribute to the semantics of the diagram and can be safely eliminated using Theorem 2 (del)-(co-del).
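On the automaton side, removing useless states corresponds to the usual trimming operation. The following Python sketch is an illustration under our own conventions (an NFA given by sets of initial and accepting states and a set of transition triples, with a hypothetical function name). It is slightly stronger than what the text requires, since it also drops states that cannot be reached from an initial state.

```python
def trim(initial, accepting, delta):
    """Keep only transitions between states that are reachable from an
    initial state and from which some accepting state is reachable."""
    def closure(seeds, edges):
        found, todo = set(seeds), list(seeds)
        while todo:
            p = todo.pop()
            for (source, target) in edges:
                if source == p and target not in found:
                    found.add(target)
                    todo.append(target)
        return found

    forward = {(p, q) for (p, _, q) in delta}
    backward = {(q, p) for (p, _, q) in delta}
    useful = closure(initial, forward) & closure(accepting, backward)
    return {(p, a, q) for (p, a, q) in delta if p in useful and q in useful}

# The running example, extended with a dead end and an unreachable state.
delta = {
    ("q0", "a", "q1"), ("q1", "b", "q2"), ("q2", "a", "q1"), ("q2", "a", "q2"),
    ("q1", "a", "dead"),        # 'dead' never reaches an accepting state
    ("unreachable", "b", "q2"), # never reached from the initial state
}
print(sorted(trim({"q0"}, {"q2"}, delta)))
```

Trimming leaves the recognised language unchanged, which mirrors the remark that useless parts of a diagram do not contribute to its semantics.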

As explained above, our proof of completeness is a diagrammatic reformulation of Brzozowski's algorithm, which proceeds in four steps: reverse, determinise, reverse, determinise. We already know how to determinise a given diagram. The other three steps are simply a matter of looking at string diagrams differently and showing that all the equations that we needed to determinise them can be performed in reverse. We say that a matrix-diagram is co-deterministic if the converse of its associated transition relation is deterministic.

Proof (Theorem 1 (Completeness)). We have a procedure to show that, if ⟦d⟧ = ⟦e⟧, then there exists a string diagram f in normal form such that d = f = e. This normal form is the diagrammatic counterpart of the minimal automaton associated to d and e. In our setting, it is the deterministic representation equal to d and e with the smallest number of states. This is unique because we can obtain from it the corresponding minimal automaton, which is well-known to be unique. First, given any string diagram we can obtain a representation for it by Proposition 6. Then we obtain a minimal representation by splitting Brzozowski's algorithm in two steps.

1. Reverse; determinise; reverse. A close look at the determinisation procedure shows that, at each step, the required laws all hold in reverse. For example, we can replace every instance of (cpy) with (co-cpy). We can thus define, in a completely analogous manner, a co-determinisation procedure which takes care of the first three steps of Brzozowski's algorithm, and obtain a co-deterministic representation for the given diagram.
2. Determinise. By applying Proposition 9, we can obtain a deterministic representation for the co-deterministic representation of the previous step. The result is the desired minimal representation and normal form.

In this paper, we have given a fully diagrammatic treatment of finite-state automata, with a finite equational theory that axiomatises them up to language equivalence. We have seen that this allows us to decompose the regular operations of Kleene algebra, like the star, into more primitive components, resulting in greater modularity. In this section, we compare our contributions with related work, and outline directions for future research.

Traditionally, computer scientists have used syntax or railroad diagrams to visualise regular expressions and context-free grammars [41]. These diagrams resemble ours very closely but have remained mostly informal. More recently, Hinze has treated the single input-output case rigorously as a pedagogical tool to teach the correspondence between finite-state automata and regular expressions [18]. He did not, however, study their equational properties.

Bloom and Ésik's iteration theories provide a general categorical setting in which to study the equational properties of iteration for a broad range of structures that appear in programming language semantics [5]. They are cartesian categories equipped with a parameterised fixed-point operation closely related to the feedback notion we have used to represent the Kleene star. However, the monoidal category of interest in this paper is compact-closed (to be precise, only its full subcategory over the right- and left-directed wire types), a property that is incompatible with the existence of categorical products (any category that has both collapses to a preorder [30]). Nevertheless, the subcategory of left-to-right diagrams (Section 4.2) is a (matrix) iteration theory [6], a structure that Bloom and Ésik have used to give an (infinitary) axiomatisation of regular languages [4].

Similarly, Stefanescu's work on network algebra provides a unified algebraic treatment of various types of networks, including finite-state automata [39]. In general, network algebras are traced monoidal categories where the product is not necessarily cartesian, and they are therefore more general than iteration theories. In both settings, however, the trace is a global operation that cannot be decomposed further into simpler components. In our work, on the other hand, the trace can be defined from the compact-closed structure, as was depicted in (3).
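To make the last point concrete, here is a minimal sketch of how a trace arises from cups and caps in a familiar compact-closed category: finite relations with the cartesian product as tensor. This category only stands in for the paper's semantic setting, and the tuple encoding and all function names are assumptions made for this illustration. A relation is a set of pairs of tuples, and tensoring concatenates tuples.

```python
def compose(r, s):
    """Relational composition: first r, then s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def tensor(r, s):
    """Parallel composition: concatenate inputs and outputs."""
    return {(a + c, b + d) for (a, b) in r for (c, d) in s}

def identity(xs):
    return {((x,), (x,)) for x in xs}

def cup(us):
    """Unit of the compact structure: from the empty tuple to pairs (u, u)."""
    return {((), (u, u)) for u in us}

def cap(us):
    """Counit of the compact structure: from pairs (u, u) to the empty tuple."""
    return {((u, u), ()) for u in us}

def trace(r, xs, ys, us):
    """Trace of r : X x U -> Y x U, built only from cup, cap, tensor, compose."""
    bend_in = tensor(identity(xs), cup(us))    # X -> X x U x U
    middle = tensor(r, identity(us))           # X x U x U -> Y x U x U
    bend_out = tensor(identity(ys), cap(us))   # Y x U x U -> Y
    return compose(compose(bend_in, middle), bend_out)

def trace_direct(r):
    """The same trace written as a single global formula."""
    return {((x,), (y,)) for ((x, u), (y, v)) in r if u == v}

# Hypothetical example relation on X = Y = {0, 1}, U = {"p", "q"}.
X, Y, U = {0, 1}, {0, 1}, {"p", "q"}
R = {((0, "p"), (1, "p")), ((0, "p"), (0, "q")), ((1, "q"), (1, "q"))}
assert trace(R, X, Y, U) == trace_direct(R)
```

The function `trace` uses only local building blocks (identities, `cup`, `cap`), whereas `trace_direct` has to be stated as one global operation on the whole relation. This mirrors, in a simplified setting, the difference between taking the compact-closed structure and taking the trace as primitive.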

Note that the compact closed subcategory in this paper can be recovered from the traced monoidal category of left-to-right diagrams, via the Int construction [22] . Therefore, as far as mathematical expressiveness is concerned, the two approaches are equivalent. However, from a methodological point of view, taking the compact closed structure as primitive allows for improved compositionality, as example (2) in the introduction illustrates. Furthermore, the compact closed structure can be finitely presented relative to the theory of symmetric monoidal categories, whereas the trace operation cannot. This matters greatly in this paper, where finding a finite axiomatisation is our main concern.

Finally, the idea of treating regular expressions as a free structure acting on a second algebraic structure also appeared in Pratt's dynamic algebras, which axiomatise the propositional fragment of dynamic modal logic [34] . Like our formalism, the variety of dynamic algebras is finitely-based. But they assume more structure: the second algebraic structure is a Boolean algebra.

In all the formalisms we have mentioned, the difficulty typically lies in capturing the behaviour of iteration, whether as the star in Kleene algebra [26, 4], or as a trace operator [5] in iteration theory and network algebra [39]. The axioms should be coercive enough to force it to be the least fixed-point of the language map $L \mapsto \{\varepsilon\} \cup LK$. In Kozen's axiomatisation of Kleene algebra [26], for example, this is achieved through (a) the axiom $1 + ee^* \leq e^*$ (star is a fixed-point) and (b) the Horn clause $f + ex \leq x \Rightarrow e^* f \leq x$ (star is the least fixed-point). In our work, (a) is a consequence of the unfolding of the star into a feedback loop and can be derived from the other axioms. (b) is more subtle, but can be seen as a consequence of the (D1)-(D4) axioms. These allow us to (co)copy and (co)delete arbitrary diagrams (Theorem 2), and we conjecture that this is what forces the star to be a single definite value: not just any fixed-point, but the least one. Making this statement precise is the subject of future work.
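The role of leastness can be illustrated with a small computation on finite approximations of languages. The sketch below is an illustration only: truncating languages to words of length at most `n`, the function names, and the example language are all assumptions. It iterates the map $L \mapsto \{\varepsilon\} \cup LK$ from the empty language, compares the result with a direct computation of the star, and exhibits a second, strictly larger fixed point.

```python
from itertools import product

def concat(left, right, n):
    """Concatenation of two languages, truncated to words of length <= n."""
    return {u + v for u in left for v in right if len(u + v) <= n}

def step(lang, k, n):
    """One application of the map L -> {eps} | LK, truncated to length <= n."""
    return {""} | concat(lang, k, n)

def least_fixed_point(k, n):
    """Iterate the map from the empty language until it stabilises."""
    lang = set()
    while True:
        nxt = step(lang, k, n)
        if nxt == lang:
            return lang
        lang = nxt

def star_direct(k, n):
    """All concatenations of words of k with total length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, k, n) - result
        result |= frontier
    return result

# Hypothetical example: K contains the empty word, so the map has several
# fixed points among languages over the alphabet {a, b}.
K, N = {"", "a"}, 3
all_words = {"".join(w) for m in range(N + 1) for w in product("ab", repeat=m)}

assert least_fixed_point(K, N) == star_direct(K, N)
assert step(all_words, K, N) == all_words       # another fixed point
assert least_fixed_point(K, N) < all_words      # strictly smaller
```

Here the set of all words is also a fixed point of the map, because K contains the empty word. The fixed-point property (a) alone therefore does not pin down the star; the leastness condition (b) is what singles it out.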

The difficulty in capturing the behaviour of fixed-points is also the reason why we decided to work with an additional red wire, to encode the action of regular expressions on the set of languages: without it, global (co)copying and (co)deleting (Theorem 2) cannot be reduced to the local (D1)-(D4) axioms. There is another route, which leads to an infinitary axiomatisation: we could dispense with the red generators altogether and take a (for a ∈ Σ) as primitive instead, with global axioms to (co)copy and (co)delete arbitrary diagrams. This would pave the way for a reformulation of our work in the context of iteration (matrix) theories, where the ability to (co)copy and (co)delete arbitrary expressions is already built-in. We leave this for future work.

There is an intriguing parallel between our case study and the positive fragment of relation algebra (also known as allegories [16]). Indeed, allegories, like Kleene algebra, do not admit a finite axiomatisation [16]. However, this result holds for standard algebraic theories. It has been shown recently that a structure equivalent to allegories can be given a finite axiomatisation when formulated in terms of string diagrams in monoidal categories [9]. It seems that the greater generality of the monoidal setting (algebraic theories correspond precisely to the particular case of cartesian monoidal categories [11]) allows for simpler axiomatisations in some specific cases. In the future, we would like to understand whether this phenomenon, of which we now have two instances, can be understood in a general context.

Lastly, extensions of Kleene Algebra, such as Concurrent Kleene Algebra (CKA) [19, 23] and NetKAT [1], are increasingly relevant in current research. Enhancing our theory $=_{\mathrm{KDA}}$ to encompass these extensions seems a promising research direction, for two main reasons. First, the two-dimensional nature of string diagrams has proved particularly suitable for reasoning about concurrency (see e.g. [7, 38]), and more generally about resource exchange between processes (see e.g. [10, 13, 21, 3, 8]). Second, when trying to transfer the good meta-theoretical properties of Kleene Algebra (like completeness and decidability) to extensions such as CKA and NetKAT, the cleanest way to proceed is usually in a modular fashion. The interaction between the new operators of the extension and the Kleene star usually represents the greatest challenge to this methodology. Now, in $=_{\mathrm{KDA}}$, the Kleene star is decomposable into simpler components (see (3)) and there is only one specific axiom (C5) governing its behaviour. We believe this is a particularly favourable starting point to modularise a meta-theoretic study of CKA and NetKAT with string diagrams, taking advantage of the results we presented in this paper for finite-state automata.

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

NetKAT: semantic foundations for networks

Delayed-logic and finite-state machines

A compositional framework for passive linear networks

Equational axioms for regular sets

Iteration theories

Matrix and matricial iteration theories

Diagrammatic algebra: from linear to concurrent systems

Graphical affine algebra

Graphical conjunctive queries

The calculus of signal flow diagrams I: linear relations on streams

Deconstructing Lawvere with distributive laws

Canonical regular expressions and minimal state graphs for definite events

Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning

Regular algebra and finite machines

Seven sketches in compositionality: An invitation to applied category theory

Categories, allegories

Recursion from cyclic sharing: traced monoidal categories and models of cyclic lambda calculi

Self-certifying railroad diagrams

Concurrent Kleene algebra

Glueing and orthogonality for models of linear logic

Causal inference by string diagram surgery

Traced monoidal categories

Concurrent Kleene algebra: Free model and completeness

Coherence for compact closed categories

Representation of events in nerve nets and finite automata

A completeness theorem for Kleene algebras and the algebra of regular events

Kleene algebra with tests

Complete systems of B-rational identities

Composing PROPs. Theory and Application of Categories

Introduction to higher-order categorical logic

A logical calculus of the ideas immanent in nervous activity

Coherence for categories of posets with applications. Topology, Algebra and Categories in Logic (TACL) p

A string diagrammatic axiomatisation of finite-state automata

Dynamic algebras as a well-behaved fragment of relation algebras

On defining relations for the algebra of regular events

A survey of graphical languages for monoidal categories

Guarded Kleene algebra with tests: verification of uninterpreted programs in nearly linear time

Connector algebras for C/E and P/T nets' interactions

Network Algebra

Programming techniques: Regular expression search algorithm

The programming language Pascal

Interacting Hopf Algebras: the theory of linear systems