key: cord-0046571-xt9wtqm6
authors: Kirst, Dominik; Larchey-Wendling, Dominique
title: Trakhtenbrot’s Theorem in Coq: A Constructive Approach to Finite Model Theory
date: 2020-06-06
journal: Automated Reasoning
DOI: 10.1007/978-3-030-51054-1_5
sha: e04f5c75ccec466e7b337470a3bd3c315f5b9177
doc_id: 46571
cord_uid: xt9wtqm6

We study finite first-order satisfiability (FSAT) in the constructive setting of dependent type theory. Employing synthetic accounts of enumerability and decidability, we give a full classification of FSAT depending on the first-order signature of non-logical symbols. On the one hand, our development focuses on Trakhtenbrot’s theorem, stating that FSAT is undecidable as soon as the signature contains an at least binary relation symbol. Our proof proceeds by a many-one reduction chain starting from the Post correspondence problem. On the other hand, we establish the decidability of FSAT for monadic first-order logic, i.e. where the signature only contains at most unary function and relation symbols, as well as the enumerability of FSAT for arbitrary enumerable signatures. All our results are mechanised in the framework of a growing Coq library of synthetic undecidability proofs.

In the wake of the seminal discoveries concerning the undecidability of firstorder logic by Turing and Church in the 1930s, a broad line of work has been pursued to characterise the border between decidable and undecidable fragments of the original decision problem. These fragments can be grouped either by syntactic restrictions controlling the allowed function and relation symbols or the quantifier prefix, or by semantic restrictions on the admitted models (see [1] for a comprehensive description).

Concerning signature restrictions, already predating the undecidability results, Löwenheim had shown in 1915 that monadic first-order logic, admitting only signatures with at most unary symbols, is decidable [15] . Therefore, the successive negative results usually presuppose non-trivial signatures containing an at least binary symbol.

Turning to semantic restrictions, Trakhtenbrot proved in 1950 that, if only admitting finite models, the satisfiability problem over non-trivial signatures is still undecidable [21] . Moreover, the situation is somewhat dual to the unrestricted case, since finite satisfiability (FSAT) is still enumerable while, in the unrestricted case, validity is enumerable. As a consequence, finite validity cannot be characterised by a complete finitary deduction system and, resting on finite model theory, various natural problems in database theory are undecidable.

Conventionally, Trakhtenbrot's theorem is proved by (many-one) reduction from the halting problem for Turing machines (see e.g. [1, 14] ). An encoding of a given Turing machine M can be given as a formula ϕ M such that the models of ϕ M correspond to the runs of M . Specifically, the finite models of ϕ M correspond to terminating runs of M and so a decision procedure for finite satisfiability of ϕ M would be enough to decide whether M terminates or not.

Although this proof strategy is in principle explainable on paper, already the formal definition of Turing machines, not to mention their encoding in first-order logic, is not ideal for mechanisation in a proof assistant. So for our Coq mechanisation of Trakhtenbrot's theorem, we follow a different strategy by starting from the Post correspondence problem (PCP), a simple matching problem on strings. Similar to the conventional proof, we proceed by encoding every instance R of PCP as a formula ϕ R such that R admits a solution iff ϕ R has a finite model. Employing the framework of synthetic undecidability [8, 11] , the computability of ϕ R from R is guaranteed since all functions definable in constructive type theory are computable without reference to a concrete model of computation.

Both the conventional proof relying on Turing machines and our elaboration starting from PCP actually produce formulas in a custom signature well-suited for the encoding of the seed decision problems. The sharper version of Trakhtenbrot's theorem, stating that a signature with at least one binary relation (or one binary function and one unary relation) is enough to turn FSAT undecidable, is in fact left as an exercise in e.g. Libkin's book [14] . However, at least in a constructive setting, this generalisation is non-trivial and led us to mechanising a chain of signature transformations eliminating and compressing function and relation symbols step by step.

Complementing the undecidability result, we further formalise that FSAT is enumerable for enumerable signatures and decidable for monadic signatures. Again, both of these standard results come with their subtleties when explored in a constructive approach of finite model theory.

In summary, the main contributions of this paper are threefold:

-we provide an axiom-free Coq mechanisation comprising a full classification of finite satisfiability with regards to the signatures allowed; 1 -we present a streamlined proof strategy for Trakhtenbrot's theorem wellsuited for mechanisation and simple to explain informally, basing on PCP; -we give a constructive account of signature transformations and the treatment of interpreted equality typically neglected in a classical development.

The rest of the paper is structured as follows. We first describe the typetheoretical framework for undecidability proofs and the representation of firstorder logic in Sect. 2. We then outline our variant of Trakhtenbrot's theorem for a custom signature in Sect. 3. This is followed by a development of enough constructive finite model theory (Sect. 4) to conclude some decidability results (Sect. 5) as well as the final classification (Sect. 6). We end with a brief discussion of the Coq development and future work in Sect. 7.

In order to make this paper accessible to readers unfamiliar with constructive type theory, we outline the required features of Coq's underlying type theory, the synthetic treatment of computability available in constructive mathematics, some properties of finite types, as well as our representation of first-order logic.

We work in the framework of a constructive type theory such as the one implemented in Coq, providing a predicative hierarchy of type universes T above a single impredicative universe P of propositions. On type level, we have the unit type 1 with a single element * : 1, the void type 0, function spaces X → Y , products X × Y , sums X + Y , dependent products ∀x : X. F x, and dependent sums {x : X | F x}. On propositional level, these types are denoted using the usual logical notation ( , ⊥, →, ∧, ∨, ∀, and ∃).

We employ the basic inductive types of Booleans (B ::= tt | ff), of Peano natural numbers (n : N ::= 0 | S n), the option type (O X ::= x | ∅), and lists (l : L X::=[ ] | x::l). We write |l| for the length of a list, l+ +m for the concatenation of l and m, x ∈ l for membership, and simply f [x 1 ; . . . ; x n ] := [f x 1 ; . . . ; f x n ] for the map function. We denote by X n the type of vectors of length n : N and by 

We review the main ingredients of our synthetic approach to decidability and undecidability [7, 8, 10, 11, 13, 19] , based on the computability of all functions definable in constructive type theory. 2 We first introduce standard notions of computability theory without referring to a formal model of computation, e.g. Turing machines.

These notions generalise to predicates of higher arity. Moreover, a type X is

-a data type if it is both enumerable and discrete.

Using the expressiveness of dependent types, we equivalently tend to establish the decidability of a predicate p : X → P by giving a function ∀x : X. p x + ¬p x. Note that it is common to mechanise decidability results in this synthetic sense (e.g. [2, 16, 17] ). Next, decidability and enumerability transport along reductions:

Fact 3. Assume p : X → P, q : Y → P and p q: (1) if q is decidable, then so is p and (2) if X and Y are data types and q is enumerable, then so is p.

Item (1) implies that we can justify the undecidability of a target problem by reduction from a seed problem known to be undecidable, such as the halting problem for Turing machines. This is in fact the closest rendering of undecidability available in a synthetic setting, since the underlying type theory is consistent with the assumption that every problem is decidable. 4 Nevertheless, we believe that in the intended effective interpretation for synthetic computability, a typical seed problem is indeed undecidable and so are the problems reached by verified reductions. 5 More specifically, since the usual seed problems are not coenumerable, (2) implies that the reached problems are not co-enumerable either.

Given its simple inductive characterisation involving only basic types of lists and Booleans, the (binary) Post correspondence problem (BPCP) is a well-suited seed problem for compact encoding into first-order logic. 6 we define derivability of a pair s/t from R (denoted by R s/t) and solvability (denoted by BPCP R) by the following rules:

Given a list R : L(L B × L B), the derivability predicate λs t.R s/t is decidable. However, the halting problem for Turing machines reduces to BPCP.

Proof. We give of proof of the decidability of R s/t by induction on |s| + |t|.

We also provide a trivial proof of the equivalence of two definitions of BPCP. See [7, 10] for details on the reduction from the halting problem to BPCP.

It might at first appear surprising that derivability λs t.R s/t is decidable while BPCP is reducible from the halting problem (and hence undecidable). This simply illustrates that undecidability is caused by the unbounded existential quantifier in the equivalence BPCP R ↔ ∃s. R s/s.

A type X is finite if there is a list l X with x ∈ l X for all x : X and a predicate p : X → P is finite if there is a list l p with ∀x. p x ↔ x ∈ l p .

Note that in constructive settings there are various alternative characterisations of finiteness 7 (bijection with F n for some n; negated infinitude for some definition of infiniteness; etc.) and we opted for the above since it is easy to work with while transparently capturing the expected meaning. One can distinguish strong finiteness in T (i.e. {l X : L X | ∀x. x ∈ l X }) from weak finiteness in P (i.e. ∃l X : L X. ∀x. x ∈ l X ), the list l X being required computable in the strong case.

We present three important tools for manipulating finite types: the finite pigeon hole principle (PHP) here established without assuming discreteness, the well-foundedness of strict orders over finite types, and quotients over strongly decidable equivalences that map onto F n . The proofs are given in Appendix A of the extended version of this paper [12] .

For the finite PHP, the typical classical proof requires the discreteness of X to design transpositions/permutations. Here we avoid discreteness completely, the existence of a duplicate being established without actually computing one.

then the values at two distinct positions in l are related to the same y in m, i.e. there exist x 1 , x 2 ∈ l and y ∈ m such that l has shape l = · · · + + x 1 :: · · · + + x 2 :: · · · and R x 1 y and R x 2 y.

Using the PHP, one can constructively show that, for a strict order over a finite type X, any descending chain has length bounded by the size of X. 8

Coq's type theory does not provide quotients in general (see e.g. [6] ) but one can build computable quotients in certain conditions, here for a decidable equivalence relation of which representatives of equivalence classes are listable.

Theorem 9 (Finite decidable quotient). Let ∼ : X → X → P be a decidable equivalence with {l r : L X | ∀x∃y. y ∈ l r ∧ x ∼ y}, i.e. finitely many equivalence classes. 9 Then one can compute the quotient X/∼ onto F n for some n, i.e. n : N, c : X → F n and r :

Using Theorem 9 with identity over X as equivalence, we get bijections between finite, discrete types and the type family (F n ) n:N . 10 Corollary 10. If X is a finite and discrete type then one can compute n : N and a bijection from X to F n .

We briefly outline our representation of the syntax and semantics of first-order logic in constructive type theory (cf. [9] ). Concerning the syntax, we describe terms and formulas as dependent inductive types over a signature Σ = (F Σ ; P Σ ) of function symbols f : F Σ and relation symbols P : P Σ with arities |f | and |P |, using binary connectives˙ ∈ {→,∧,∨} and quantifiers∇ ∈ {∀,∃}:

Negation is defined as the abbreviation¬ϕ := ϕ→⊥. In the chosen de Bruijn representation [4] , a bound variable is encoded as the number of quantifiers shadowing its binder, e.g. ∀x. ∃y. P x u → P y v may be represented by∀∃ P 1 4→ P 0 5. The variables 2 = 4 − 2 and 3 = 5 − 2 in this example are the free variables, and variables that do not occur freely are called fresh, e.g. 0 and 1 are fresh. For the sake of legibility, we write concrete formulas with named binders and defer de Bruijn representations to the Coq development. For a formula ϕ over a signature Σ, we define the list FV(ϕ) : L N of free variables, the list F ϕ : L F Σ of function symbols and the list P ϕ : L P Σ of relation symbols that actually occur in ϕ, all by recursion on ϕ.

Turning to semantics, we employ the standard (Tarski-style) model-theoretic semantics, evaluating terms in a given domain and embedding the logical connectives into the constructive meta-logic (cf. [22] ): 

where each logical connective˙ /∇ is mapped to its meta-level counterpart /∇ and where we denote by a · ρ the de Bruijn extension of ρ by a, defined by (a · ρ) 0 := a and (a · ρ)

Proof. By induction on ϕ; finite quantification preserves decidability.

In this paper, we are mostly concerned with finite satisfiability of formulas. However, since some of the compound reductions hold for more general or more specific notions, we introduce the following variants: Definition 13 (Satisfiability). For a formula ϕ over a signature Σ, we write

Notice that in a classical treatment of finite model theory, models are supposed to be given in extension, i.e. understood as tables providing computational access to functions and relations values. To enable this view in our constructive setting, we restrict to decidable relations in the definition of FSAT, and from now on, finite satisfiability is always meant to encompass a decidable model. One could further require the domain D to be discrete to conform more closely with the classical view; discreteness is in fact enforced by FSATEQ. However, we refrain from this requirement and instead show in Sect. 4.1 that FSAT and FSAT over discrete models are constructively equivalent.

In this section, we show that BPCP reduces to FSATEQ(Σ BPCP ; ≡) for the special purpose signature

To this end, we fix an instance R : L (L B × L B) of BPCP (to be understood as a finite set of pairs of Boolean strings) and we construct a formula ϕ R such that ϕ R is finitely satisfiable if and only if R has a solution.

Informally, we axiomatise a family B n of models over the domain of Boolean strings of length bounded by n and let ϕ R express that R has a solution in B n . The axioms express enough equations and inversions of the constructions included in the definition of BPCP such that a solution for R can be recovered.

Formally, the symbols in Σ BPCP are used as follows: the functions f b and the constant e represent b :: (·) and [ ] for the encoding of strings s as terms s:

The constant represents an undefined value for strings too long to be encoded in the finite model B n . The relation P represents derivability from R (denoted R ·/· here) while ≺ and ≡ represent strict suffixes and equality, respectively. Expected properties of the intended interpretation can be captured formally as first-order formulas. First, we ensure that P is proper (only subject to defined values) and that ≺ is a strict order (irreflexive and transitive):

Next, the image of f b is forced disjoint from e and injective as long as is not reached. We also ensure that the images of f tt and f ff intersect only at :

Furthermore, we enforce that P simulates R ·/·, encoding its inversion principle

Finally, ϕ R is the conjunction of all axioms plus the existence of a solution: where we left out some explicit constructors and the excluded edge cases of the relations for better readability. As required, B n interprets ≡ by equality = Dn . Considering the desired properties of B n , first note that D n can be shown finite by induction on n. This however crucially relies on the proof irrelevance of the λx. x ≤ n predicate. 12 The atoms s ≺ Bn t and s ≡ Bn t are decidable by straightforward computations on Boolean strings. Decidability of P Bn s t (i.e. R s/t) was established in Fact 5. Finally, since ϕ R is a closed formula, any variable assignment ρ can be chosen to establish that B n satisfies ϕ R , for instance ρ := λx.∅. Then showing B n ρ ϕ R consists of verifying simple properties of the chosen functions and relations, with mostly straightforward proofs.

Proof. Suppose that M ρ ϕ R holds for some finite Σ BPCP -model (D, M, ρ) interpreting ≡ as equality and providing operations f M b , e M , M , P M and ≺ M . Again, the concrete assignment ρ is irrelevant and M ρ ϕ R ensures that the functions/relations behave as specified and that P M x x holds for some x : D.

Instead of trying to show that M is isomorphic to some B n , we directly reconstruct a solution for R, i.e. we find some s with R s/s from the assumption that M ρ ϕ R holds. To this end, we first observe that the relation u/v ≺ M x/y as defined above is a strict order and thus well-founded as an instance of Fact 8. Now we can show that for all x/y with P M x y there are strings s and t with x = s, y = t and R s/t, by induction on the pair x/y using the well-foundedness of ≺ M . So let us assume P M x y. Since M satisfies ϕ there are two cases:

-there is s/t ∈ R such that x = s and y = t. The claim follows by R s/t; -there are u, v : D with P M u v and s/t ∈ R such that x = s + + + u, y = t + + + v, and u/v ≺ M x/y. The latter makes the inductive hypothesis applicable for P M u v, hence yielding R s /t for some strings s and t corresponding to the encodings u and v. This is enough to conclude x = s + + s , y = t + + t and R (s + + s )/(t + + t ) as wished.

Applying this fact to the assumed match P M x x yields a solution R s/s.

Combined with Fact 5, Theorem 14 entails the undecidability (and non-coenumerability) of FSATEQ over a custom (both finite and discrete) signature Σ BPCP . By a series of signature reductions, we generalise these results to any signature containing an at least binary relation symbol. In particular, we explain how to reduce FSAT(Σ) to FSAT(0; {∈ 2 }) for any discrete signature Σ, hence including Σ BPCP . We also provide a reduction from FSAT(0; {∈ 2 }) to FSAT({f n }; {P 1 }) for n ≥ 2, which entails the undecidability of FSAT for signatures with one unary relation and an at least binary function. But first, let us show that FSAT is unaltered when further assuming discreteness of the domain.

We consider the case of models over a discrete domain D. Of course, in the case of FSATEQ(Σ; ≡) the requirement that ≡ is interpreted as a decidable binary relation which is equivalent to = D imposes the discreteness of D. But in the case of FSAT(Σ) nothing imposes such a restriction on D. However, as we argue here, we can always quotient D using a suitable decidable congruence, making the quotient a discrete finite type while preserving first-order satisfaction.

We write FSAT (Σ) ϕ if FSAT(Σ) ϕ on a discrete model.

Let us consider a fixed signature Σ = (F Σ ; P Σ ). In addition, let us fix a finite type D and a (decidable) model M of Σ over D. We can conceive an equivalence over D which is a congruence for all the interpretations of the symbols by M, namely first-order indistinguishability x= Σ y := ∀ϕ ρ. M x·ρ ϕ ↔ M y·ρ ϕ, i.e. first-order semantics in M is not impacted when switching x with y.

The facts that= Σ is both an equivalence and a congruence are easy to prove but, with this definition, there is little hope of establishing decidability of= Σ . The main reason for this is that the signature may contain symbols of infinitely many arities. So we fix two lists l F : L F Σ and l P : L P Σ of function and relation symbols respectively and restrict the congruence requirement to these lists.

We say that x and y are first-order indistinguishable up to l F /l P , and we write x= y, if for any ρ : N → D and any first-order formula ϕ built from the symbols in l F and l P only, we have M x·ρ ϕ ↔ M y·ρ ϕ.

Theorem 19. First-order indistinguishability= up to l F /l P is a strongly decidable equivalence and a congruence for all the symbols in l F /l P .

Proof. The proof is quite involved, we only give its sketch here; see Appendix B of the extended version of this paper [12] for more details. The real difficulty is to show the decidability of=. To this end, we characterise= as a bisimulation, i.e. we show that= is extensionally equivalent to Kleene's greatest fixpoint F ω (λuv. ) of some ω-continuous operator F :

We then show that F preserves strong decidability. To be able to conclude, we establish that F reaches its limit after l := 2 d×d iterations where d := card D, the length of a list enumerating the finite type D. To verify this upper bound, we build the weak powerset, a list of length l which contains all the weakly decidable binary predicates of type D → D → P, up to extensional equivalence. As all the iterated values F n (λuv. ) are strongly decidable, they all belong to the weak powerset, so by Theorem 7, a duplicate is to be found in the first l + 1 steps, ensuring that the sequence is stalled at l.

We use the strongly decidable congruence= to quotient models onto discrete ones (in fact F n for some n) while preserving first-order satisfaction. Proof. FSAT(Σ) ϕ entails FSAT (Σ) ϕ is the non-trivial implication. Hence we consider a finite Σ-model (D, M, ρ) of ϕ and we build a new finite Σ-model of ϕ which is furthermore discrete. We collect the symbols occurring in ϕ as the lists l F := F ϕ (for functions) and l P := P ϕ (for relations). By Theorem 19, firstorder indistinguishability= : D → D → P up to F ϕ /P ϕ is a strongly decidable equivalence over D and a congruence for the semantics of the symbols occurring in ϕ. Using Theorem 9, we build the quotient D/= on a F n for some n : N. We transport the model M along this quotient and because= is a congruence for the symbols in ϕ, its semantics is preserved along the quotient. Hence, ϕ has a finite model over the domain F n which is both finite and discrete.

Proof. Given a list l F (resp. l P ) of function (resp. relation) symbols, we construct a formula ψ(l F , l P , ≡) over the function symbols in l F and relation symbols in (≡ :: l P ) expressing the requirement that ≡ is an equivalence and a congruence for the symbols in l F /l P . Then we show that λϕ. ϕ∧ ψ(F ϕ , ≡ :: P ϕ , ≡) is a correct reduction, where F ϕ and P ϕ list the symbols occurring in ϕ.

Let us start by converting a discrete signature to a finite and discrete signature.

For any formula ϕ over a discrete signature Σ, one can compute a signature Σ n,m = (F n ; F m ), arity preserving maps F n → F Σ and F m → P Σ and an equi-satisfiable formula ψ over Σ n,m , i.e. FSAT(Σ) ϕ ↔ FSAT(Σ n,m ) ψ.

Proof. We use the discreteness of Σ and bijectively map the lists of symbols F ϕ and P ϕ onto F n and F m respectively, using Corollary 10. We structurally map ϕ to ψ over Σ n,m along this bijection, which preserves finite satisfiability.

Notice that n and m in the signature Σ n,m depend on ϕ, hence the above statement cannot be presented as a reduction between (fixed) signatures.

We now erase all function symbols by encoding them with relation symbols. To this end, let Σ = (F Σ ; P Σ ) be a signature, we set Σ := (0; {≡ 2 } + F +1 Σ + P Σ ) where ≡ is a new interpreted relation symbol of arity two and in the conversion, function symbols have arity lifted by one, hence the F +1 Σ notation.

For any finite 13 type of function symbols F Σ , one has a reduction FSAT (F Σ ; P Σ ) FSATEQ(0;

Proof. The idea is to recursively replace a term t over Σ by a formula which is "equivalent" to x ≡ t (where x is a fresh variable not occurring in t) and then an atomic formula like e.g.

We complete the encoding with a formula stating that every function symbol f : F Σ is encoded into a total functional relation P f : F +1 Σ of arity augmented by 1.

Next, assuming that the function symbols have already been erased, we explain how to merge the relation symbols in a signature Σ = (0; P Σ ) into a single relation symbol, provided that there is an upper bound for the arities in P Σ .

Lemma 24. The reduction FSAT(0; P Σ ) FSAT 0; {Q 1+n } holds when P Σ is a finite and discrete type of relation symbols and |P | ≤ n holds for all P : P Σ .

In the following, we denote by F n Σ (resp. P n Σ ) the same type of function (resp. relation) symbols but where the arity is uniformly converted to n.

Fact 25. Let Σ = (F Σ ; P Σ ) be a signature:

Proof. For the first reduction, every atomic formula of the form P v with |v| = |P | ≤ n is converted to P v with v := v + + [x 0 ; . . . ;

x 0 ] and |v | = n for an arbitrary term variable x 0 . The rest of the structure of formulas is unchanged. For the second reduction, we convert every atomic formula P v with |v| = n into Q (P :: v) where P now represents a constant symbol (Q is fixed).

For the last reduction, we replace every constant symbol by a corresponding fresh variable chosen above all the free variables of the transformed formula.

Let Σ n = (0; {P n }) be a singleton signature where P is of arity n. We now show that P can be compressed to a binary relation modelling membership via a construction using hereditarily finite sets [18] (useful only when n ≥ 3).

Technically, this reduction is one of the most involved in this work, although in most presentations of Trakhtenbrot's theorem, this is left as an "easy exercise, " see e.g. [14] . Maybe it is perceived so because it relies on the encoding of tuples in set theory, which is somehow natural for mathematicians, 14 but properly building the finite set model in constructive type theory was not that easy.

Here we only give an overview of the main tools. We encode an arbitrary n-ary relation R : X n → P over a finite type X in the theory of membership over the signature Σ 2 = (0; {∈ 2 }). Membership is much weaker than set theory because the only required set-theoretic axiom is extensionality. Two sets are extensionally equal if their members are the same, and extensionality states that two extensionally equal sets belong to the same sets:

∀xy. (∀z. z∈ x↔ z∈ y)→∀z. x∈ z→ y∈ z

As a consequence, no first-order formula over Σ 2 can distinguish two extensionally equal sets. Notice that the language of membership theory (and set theory) does not contain any function symbol, hence, contrary to usual mathematical practices, there is no other way to handle a set than via its characterising formula which makes it a very cumbersome language to work with formally. However, this is how we have to proceed in the Coq development but here, we stick to meta-level "terms" in the prose for simplicity.

The ordered pair of two sets p and q is encoded as (p, q) := {{p}, {p, q}} while the n-tuple (t 1 , . . . , t n ) is encoded as (t 1 , (t 2 , . . . , t n ) ) recursively. The reduction function which maps formulas over Σ n to formulas over Σ 2 proceeds as follows. We reserve two first-order variables d (for the domain D) and r (for the relation R). We describe the recursive part of the reduction Σ r n 2 Σ r n 2 (P v) := "tuple v∈ r" Σ r n 2 (∀z. ϕ) :=∀z. z∈ d→ Σ r n 2 (ϕ) Σ r n 2 (ϕ˙ ψ) := Σ r n 2 (ϕ)˙ Σ r n 2 (ψ) Σ r n 2 (∃z. ϕ) :=∃z. z∈ d∧ Σ r n 2 (ϕ)

ignoring the de Bruijn syntax (which would imply adding d and r as parameters).

Notice that d and r should not occur freely in ϕ. In addition, we require that:

This gives us the reduction function Σ n 2 (ϕ) := ϕ 1∧ ϕ 2∧ ϕ 3∧ Σ r n 2 (ϕ). The completeness of the reduction Σ n 2 is the easy part. Given a finite model of Σ n 2 (ϕ) over Σ 2 , we recover a model of ϕ over Σ n by selecting as the new domain the members of d and the interpretation of P v is given by testing whether the encoding of v as a n-tuple is a member of r.

The soundness of the reduction Σ n 2 is the formally involved part, with Theorem 27 below containing the key construction.

Theorem 27. Given a decidable n-ary relation R : X n → P over a finite, discrete and inhabited type X, one can compute a finite and discrete type Y equipped with a decidable relation ∈ : Y → Y → P, two distinguished elements d, r : Y and a pair of maps i : X → Y and s : Y → X s.t.

1. ∈ is extensional; 4. ∀x : X. i x ∈ d; 2. extensionally equal elements of Y are equal; 5. ∀y : Y. y ∈ d → ∃x. y = i x; 3. all n-tuples of members of d exist in Y ; 6.∀x : X. s(i x) = x; 7. R v iff i(v) is a n-tuple member of r, for any v : X n .

Proof. We give a brief outline of this quite involved proof, referring to the Coq code for details. The type Y is built from the type of hereditarily finite sets based on [18] , and when we use the word "set" below, it means hereditarily finite set. The idea is first to construct d as a transitive set of which the elements are in bijection i/s with the type X, hence d is the cardinal of X in the set-theoretic meaning. Then the iterated powersets P(d), P 2 (d), . . . , P k (d) are all transitive as well and contain d both as a member and as a subset. Considering P 2n (d) which contains all the n-tuples built from the members of d, we define r as the set of n-tuples collecting the encodings i(v) of vectors v : X n such that R v. We show r ∈ p for p defined as p := P 2n+1 (d). Using the Boolean counterpart of (·) ∈ p for unicity of proofs, we then define Y := {z | z ∈ p}, restrict membership ∈ to Y and this gives the finite type equipped with all the required properties. Notice that the decidability requirement for ∈ holds constructively because we work with hereditarily finite sets, and would not hold with arbitrary sets.

Combining all the previous results, we give a reduction from any discrete signature to the binary singleton signature. We finish the reduction chains with the weakest possible signature constraints. The following reductions have straightforward proofs. 

Complementing the previously studied negative results, we now examine the conditions allowing for decidable satisfiability problems. (FSAT over a fixed domain) . Given a discrete signature Σ and a discrete and finite type D, one can decide whether or not a formula over Σ has a (finite) model over domain D.

Proof. By Fact 12, satisfaction in a given finite model is decidable. It is also invariant under extensional equivalence, so we only need to show that there are finitely many (decidable) models over D up to extensional equivalence. 15 Lemma 32. A formula over a signature Σ has a finite and discrete model if and only if it has a (finite) model over F n for some n : N.

Proof. If ϕ has a model over a discrete and finite domain D, by Corollary 10, one can bijectively map D to F n and transport the model along this bijection.

Lemma 33. FSAT(0; P Σ ) is decidable if P Σ is discrete with uniform arity 1.

Proof. By Lemma 22, we can assume P Σ = F n w.l.o.g. We show that if ϕ has a finite model then it must have a model over domain {v : B n → B | b v = tt} for some Boolean subset b : (B n → B) → B. Up to extensional equivalence, there are only finitely many such subsets b and we conclude with Lemma 31.

Lemma 34. For any finite type P Σ of relation symbols and signatures of uniform arity 1, we have a reduction FSAT(F n ; P Σ ) FSAT(0; L F n × P Σ + P Σ ).

Proof. We implemented a proof somewhat inspired by that of Proposition 6.2.7 (Grädel) in [1, pp. 251 ] but the invariant suggested in the iterative process described there did not work out formally and we had to proceed in a single conversion step instead, switching from single symbols to lists of symbols.

If functions or relations have arity 0, one can always lift them to arity 1 using a fresh variable (of arbitrary value), like in Fact 25, item (1).

holds when all arities in Σ are at most 1, where F 1 Σ and P 1 Σ denote arities uniformly updated to 1.

We conclude with the exact classification of FSAT regarding enumerability, decidability, and undecidability depending on the properties of the signature. Proof. If all arities are at most 1, then by Fact 35, we can assume Σ of uniform arity 1. Therefore, for a formula ϕ over Σ with uniform arity 1, we need to decide FSAT for ϕ. By Theorem 22, we can compute a signature Σ n,m = (F n ; F m ) and a formula ψ over Σ n,m equi-satisfiable with ϕ. Using the reduction of Lemma 34, we compute a formula γ, equi-satisfiable with ψ, over a discrete signature of uniform arity 1, void of functions. We decide the satisfiability of γ by Lemma 33. If all relation symbols have arity 0, regardless of F Σ , no term can occur in formulas, hence neither can function symbols. Starting from ϕ over Σ = (F Σ ; P 0 Σ ) where only P Σ is assumed discrete, we compute an equi-satisfiable formula ψ over Σ = (0; P 0 Σ ) and we are back to the previous case. Proof. Follows by Facts 3 and 5.

Notice that even if the conditions on arities of Theorems 37 and 38 fully classify discrete signatures, it is not possible to decide which case holds unless the signature is furthermore finite. For a given formula ϕ though, it is always possible to render it in the finite signature of used symbols.

The main part of our Coq development directly concerned with the classification of finite satisfiability consists of 10k loc, in addition to 3k loc of (partly reused) utility libraries. Most of the code comprises the signature transformations with more than 4k loc for reducing discrete signatures to membership. Comparatively, the initial reduction from BPCP to FSATEQ(Σ BPCP ) takes less than 500 loc.

Our mechanisation of first-order logic in principle follows previous developments [8, 9] but also differs in a few aspects. Notably, we had to separate function from relation signatures to be able to express distinct signatures that agree on one sort of symbols computationally. Moreover, we found it favourable to abstract over the logical connectives in form of˙ and∇ to shorten purely structural definitions and proofs. Finally, we did not use the Autosubst 2 [20] support for de Bruijn syntax to avoid its current dependency on the functional extensionality axiom.

We refrained from additional axioms since we included our development in the growing Coq library of synthetic undecidability proofs [11] . In this context, we plan to generalise some of the intermediate signature reductions so that they become reusable for other undecidability proofs concerning first-order logic over arbitrary models.

As further future directions, we want to explore and mechanise the direct consequences of Trakhtenbrot's theorem such as the undecidability of query containment and equivalence in data base theory or the undecidability of separation logic [3, 5] . Also possible, though rather ambitious, would be to mechanise the classification of first-order satisfiability with regards to the quantifier prefix as comprehensively developed in [1] . Finally, we plan to mechanise the undecidability of semantic entailment and syntactic deduction in first-order axiom systems such as ZF set theory and Peano arithmetic.

Funding. The work of the second author was partially supported by the TICAMORE project (ANR grant 16-CE91-0002).

The Classical Decision Problem. Perspectives in Mathematical Logic

An efficient Coq Tactic for deciding kleene algebras

On the almighty wand

Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem

Computability and complexity results for a spatial assertion language for data structures

Pragmatic quotient types in Coq

Verification of PCP-related computational reductions in Coq

On synthetic undecidability in Coq, with an application to the Entscheidungs problem

Completeness theorems for first-order logic analysed in constructive type theory

Certified undecidability of intuitionistic linear logic via binary stack machines and minsky machines

A Coq library of undecidable problems

Trakhtenbrot's Theorem in Coq

Hilbert's tenth problem in Coq

Elements of Finite Model Theory, 1st edn

Über Möglichkeiten im Relativkalkül

HOCore in Coq

Completeness and decidability of de Bruijn substitution algebra in Coq

Hereditarily finite sets in constructive type theory

Undecidability of higher-order unification formalised in Coq

Autosubst 2: reasoning with multi-sorted de Bruijn terms and vector substitutions

The impossibility of an algorithm for the decidability problem on finite classes

Some elementary results in intutionistic model theory

Sets in types, types in sets