key: cord-0046639-2gr1ofol
authors: Xu, Yiming; Norrish, Michael
title: Mechanised Modal Model Theory
date: 2020-05-30
journal: Automated Reasoning
DOI: 10.1007/978-3-030-51074-9_30
sha: d9295ce0c9db088d9d8b35855016becbad9865c8
doc_id: 46639
cord_uid: 2gr1ofol

In this paper, we discuss the mechanisation of some fundamental propositional modal model theory. The focus on models is novel: previous work in mechanisations of modal logic have centered on proof systems and applications in model-checking. We have mechanised a number of fundamental results from the first two chapters of a standard textbook (Blackburn et al. [1]). Among others, one important result, the Van Benthem characterisation theorem, characterises the connection between modal logic and first order logic. This latter captures the desired saturation property of ultraproduct models on countably incomplete ultrafilters.

The theory of modal logic has long been a fruitful area when it comes to mechanisation. The proof systems are appealing, and the applications in model-checking are of clear real-world interest. It helps also that the subject domain (proof calculi and automata) are well-suited to "standard" theorem-proving technology (rule inductions and interesting data types).

There has been much less work on the model theory behind modal logic; indeed even in first order logic, most developments concern themselves only with model theory inasmuch as it is required to show completeness of an accompanying proof system. As our experience demonstrates, it is also clear that modern theorem-proving systems are not necessarily so well-suited to the mathematics behind model theory. Harrison [5] complained in 1998 that the very notion of validity is awkward to capture in HOL, and our own work shows up further failings in simple type theory.

Nonetheless, there is much interesting mathematics to be found even in the early chapters of a standard text such as Blackburn et al. [1] . The fact that mechanising only as far as [1, Chapter 2] requires what we believe to be the first mechanisation of the notion of ultraproduct (ultimately leading to Loś's theorems and other results), is a strong suggestion that we are exploring novel mathematical ground for interactive theorem-proving systems.

Contributions. This paper presents the first mechanised proofs of a number of basic results from the first two chapters of Blackburn et al. [1] (e.g., bounded morphisms, bisimulations and the finite model property via selection), as well as -Two versions of Loś's theorem on the saturation of ultraproduct models; -modal equivalence as bisimilarity between ultrafilter extensions; and -a close approximation of Van Benthem's Characterisation Theorem.

We also discuss where HOL's simple type theory lets us down: some standard results (including the best possible statement of Van Benthem's Characterisation Theorem) seem impossible to prove in our setting.

HOL4 Notation. All of our theorems have been pretty-printed to L a T E X from the HOL theory files. We hope that most of the basic syntax is easy to follow. In a few places we use CHOICE s to denote the arbitrary choice of an element from set s (appealing to the Axiom of Choice). The power-set of a set s is written P s. In a number of places, we use HOL's "itself" type to allow us to explicitly mention a type via a term. The type α itself has just one value inhabiting it for any given choice of α; that value is written (:α).

Source Availability. Our HOL4 sources are available from GitHub at https://github.com/u5943321/Modal-Logic

The sources build under HOL4 commit with SHA 03829d8986f.

In our mechanisation, we consider the basic modal language, in which the only primitive modal operator is the '♦'. A modal formula is either of form Vm p, where p is of type num, enumerating all the possible variable symbols, a disjunction DISJ φ ψ (pretty-printed to φ ∨ m ψ in most places), the falsity ⊥ m , a negation ¬m φ, or, finally, of the form ♦φ. We define a data type called form to represent the formulas of this modal language.

If we wanted to consider modal operators with any arity, we should change the last constructor of modal formulas so it takes two parameters: a natural number indexing the modal operator, and a list of modal formulas. This would in turn require a well-formedness predicate to be defined over formulas to make sure that modalities were applied to the right number of arguments.

A model where these formulas can be interpreted consists of a frame and a valuation, where a β frame is a β-set with a relation on it, and a model adds valuations for the variables present at each world:

In the rest of the paper, the field M .valt of a model M will be called the valuation, and M W , M R and M V are used to denote the world set, the relation, and the valuation of M respectively. The interpretation of modal formulas on a model is given by the predicate satisfaction. We read 'M , w φ' as 'φ is satisfied at the world w in M '.

By requiring w ∈ M W in various clauses above, we ensure that models' world sets must be non-empty if they are to satisfy any formulas. Two worlds w 1 ∈ M 1 W and w 2 ∈ M 2 W are modal equivalent (written M 1 , w 1 M 2 ,w 2 ) if they satisfy the same set of modal formulas. If φ 1 , φ 2 are modal formulas, we say they are equivalent over β models (written φ 1 ≡ (:β) φ 2 ) if they are satisfied in the same worlds in every model:

We cannot omit the type parameter (:β) in the definition, as there would otherwise be a type, namely the type of the underlying set of the models we are talking about, that only appears on the right-hand side but not on the lefthand side of the definition. HOL forbids such definitions for soundness reasons. Also, HOL does not permit quantification over types, so it is impossible to write ∀ μ. φ 1 ≡ μ φ 2 , with μ a type. Therefore, this definition is not exactly encoding the equivalence in the usual sense: when we mention equivalence of formulas in usual mathematical language, we are implicitly referring to the class of all models, but the constraint here bans us from talking about all models of all possible types at once. A modal formula can be translated into a first-order formula via the standard translation. To mechanise this translation, we build on Harrison's construction of first-order logic [5] . The first-order connectives are decorated with an f. For a modal formula φ, ST x φ is the standard translation of φ using x as the only free variable that may occur:

As one would expect, we translate ♦φ into an existential formula. To ensure we use a fresh variable, we use x + 1 as our new variable symbol in this clause. The standard translation gives a first-order reformulation of satisfaction of modal formulas:

Here mm2folm is the function that turns a modal model into a first-order model, defined as:

That is, the model obtained by converting a modal model M has domain M W , maps every term Fn f f l into an arbitrary world, maps each propositional letter to distinct predicates on worlds, and uses one binary predicate (the "0th predicate") to encode the frame relation.

We discuss some highlights of mechanised results from Blackburn et al.

A tree-like model is a model whose underlying frame is a tree. If Tr , a frame, is also a tree with root r, we write tree Tr r :

The tree-like property says each satisfiable modal formula can be satisfied in a tree-like model:

The world set of the tree-like model constructed from M is a set of lists of worlds in M (such lists are effectively paths from the root to various positions within the tree). Thus, passing to a tree-like model does not preserve the model type.

The tree-like lemma is used to prove the finite model property via selection afterwards.

Though apparently verbose, the definition of bisimulation in HOL is straightforward.

It is trivial to prove by induction that bisimilar worlds are modal equivalent. As the most significant theorem on the basic theory of bisimulations, we proved the Hennessy-Milner theorem, which states that modal equivalence and bisimulation on image-finite models are the same thing. An image-finite model is a model where every world can only be related to finitely many worlds. In HOL, we get:

Bisimulation is an interesting topic in modal logic. Three other significant theorems on bisimulations (including an approximation of Van Benthem Characterisation theorem) are discussed later.

There are two classical approaches to constructing finite models using model theory, namely via selection and via filtration. The complicated one is the former:

The construction of M 5 requires a lemma:

The proof of Lemma 1 further relies on the following fact: Given a set A of modal-formulas that is finite up to equivalence, if we combine the elements of A using only connectives other than ♦, then we get only finitely many nonequivalent formulas. To show this, we prove that there is an injection from the set of equivalence classes of such combinations to a finite set. For the antecedent of Lemma 1, we require the assumption that the universe of β is infinite since we rely on the fact that two modal formulas ♦φ 1 and ♦φ 2 are equivalent if and only if φ 1 and φ 2 are equivalent. This would be easy to prove in set theory. However, in simple type theory, the proof of 

We also mechanised the filtration approach, but omit the details for lack of space. The advantage of filtration is that the resulting finite model is over worlds of the same type as in the starting model.

All the results proved above can be captured using num models everywhere. If one takes β to be num (or any infinite type) in Theorem 2, one can also exploit the fact that numbers and lists of numbers have the same cardinality to derive a finite model result that preserves the "input type".

A number of results in Blackburn et al. [1, §2.5- §2.7 ] rely on theorems about ultrafilters and ultraproducts.

Given a non-empty set J, a set L ⊆ P J is a filter if it contains J itself, is closed under binary intersection, and is closed upward.

We call L a proper filter if L is not the whole power set. An ultrafilter is a filter U such that for every X ⊆ J, exactly one of X or J \ X is in U . Intuitively, subsets X ⊆ J in an ultrafilter U are considered as 'large' subsets of J .

The ultrafilter theorem states that every proper filter is contained in an ultrafilter:

A subset A of the power set on J has finite intersection property if once we take the intersection of a finite, nonempty family in A, the resultant set is nonempty.

As a corollary of ultrafilter theorem, a set with finite intersection property is contained in an ultrafilter.

The notion of ultraproducts is defined for sets, modal models, and first-order models.

A family of sets indexed by J is a function A s in HOL.

For j ∈ J , A s j is the set indexed by j . Given a family A s indexed by a nonempty set J such that each A s j is non-empty, the ultraproduct of A s is defined as a quotient of the cartesian product of the family.

If U is an ultrafilter on J, for two functions f, g in the Cartesian product Cart-prod J A s , we say f and g are U -equivalent (notation:

We write f U to denote the equivalence class that f belongs to. In the case where A s j = A for all j ∈ J, the ultraproduct is called the ultrapower of A modulo U .

Ultraproduct for Modal Models. Given a family M s of modal models indexed by J and an ultrafilter U on J, the ultraproduct model of M s modulo U (notation: Π U M s ) is described as follows:

-The world set is the ultraproduct of world sets of M s modulo U .

-Two equivalence classes f U , g U of functions are related in 

As ∼ A U is an equivalence relation, if one element in an equivalence class satisfies the required condition, then all the elements in the equivalence class will satisfy the condition. Therefore, if we replace all the existential quantifiers with universal quantifiers in the above definition, the construction is still valid, and will give the same model as the current definition.

The critical result we need about ultraproducts of modal models is a modal version of the fundamental theorem of ultraproducts, also known as Loś's theorem. 's theorem, Modal version) .

According to our intuition about ultrafilters, we can gloss this theorem to mean that the ultraproduct of a family of modal models satisfies a modal formula if and only if 'most of' the models in the family satisfy the formula. Though it is possible to derive this result from Loś's theorem of first-order models using the standard translation, our proof is direct, by structural induction on φ. Here we fix the representative of each equivalence class f U to be CHOICE f U . Therefore, as described above, the functions Fun-component and Pred-component are:

The semantic behavior of ultraproduct models is characterised by Loś's theorem: for the ultraproduct of a family M s of first-order models over an ultrafilter U on J, a formula φ is satisfied under a valuation σ if and only if the set indexing the models M s j in the family where φ is true under the valuation λ v . CHOICE (σ v ) j is in the ultrafilter U .

where wffm M means the functions of M never map a list out of the domain of M .

The first application of the theory of ultrafilters above is to construct the ultrafilter extension of a model, which has the nice property of being modally saturated (m-saturated hereafter). To define m-saturation, we give the following three definitions (the first two are called finitely satisfiable, satisfiable) consecutively:

For m-saturated models, bisimulation and modal equivalence coincide:

Given a model M and a set X of worlds of M , the set of worlds that 'can see' X (notation: M ♦ (X )) is the set of worlds w of M such that there exists some v ∈ X such that M R w v. We define the ultrafilter extension ue M of M as: In HOL:

Using the ultrafilter theorem and some basic properties about ultrafilters, we derive:

In particular, every world w ∈ M W is embedded as the principal filter π M W w on M W generated by w in the ultrafilter extension or M . Also, the above leads to the proof of the fact that the ultrafilter extension of every model is m-saturated. 

Given a first-order model M with no information about interpretation of its function symbols, we can expand the model M by adding an interpretation of some function symbols. For our purpose, we are only interested in adding the interpretation of finitely many nullary function symbols, also called constants. We write expand M A f to denote the model that is the result of adding each element in A to M as a new constant. Further, the function f is a bijection between {0, · · · , n− 1} and A, which is assumed to be finite, so that each nullary function symbol c will be interpreted as f c in M . As is apparent from the definition, the only difference between a model and its expansion is the interpretation of function symbols.

A set Σ of first-order formulas is called consistent with a model M if for every finite subset Σ 0 ⊆ Σ , there exists a valuation of M such that all elements of Σ 0 are satisfied, in this case, we write consistent M Σ. A set Γ of first-order formula is an x-type if for each formula in Γ , the only free variable that may contain is x. In this case, we write 'ftype x Γ' in HOL. If Γ is an x-type, when evaluating formulas in Γ , the valuations will only control where the only free variable x goes to. We say 

We say M is countably saturated if M is n-saturated for every natural number n. The ultimate goal is to prove a lemma to be used in the proof of Van Benthem characterisation theorem: For a family of non-empty models, their ultraproduct on a countably incomplete ultrafilter is countably saturated.

Here a countably incomplete ultrafilter is an ultrafilter that contains a countably infinite family that intersects to the empty set. We prove in HOL that such ultrafilters do exist using Theorem 3. The above theorem is not simply a direct consequence of Loś's theorem: that result is about ultraproducts of first-order models, and it says nothing about expansion. But to prove Lemma 2, we must prove a statement for an expanded first-order model, and this first order model is itself obtained by converting a ultraproduct of modal models.

To deal with this issue, the key observation is that constants are nothing more than forcing some symbols to be sent to some points in a model under every valuation, hence rather than use nullary function symbols, we fix a set of variable letters, each corresponding to a function symbol, and only consider the valuations that send these variable letters to certain fixed points. With this idea, we can remove all the constants in a formula, and hence change our scope from an expanded model back to the unexpanded model. To get rid of the constants {0, · · · , n−1}, we replace every V f m with V f (m + n), and replace every constant Fn f c [ ] by V f c. This operation is done by the function shift-form which takes a natural number (the number of constants we want to remove), and a firstorder formula (where the only function symbols may appear are the constants 0, · · · , n − 1). Since 0, · · · , n − 1 in a shifted formula are now designed to be sent to fixed places f 0, · · · , f (n − 1), it does not make sense to assign these variable symbols anywhere else. Therefore, to talk about evaluation of shifted formula, the first thing is to make sure that the valuations we are considering send the variables which actually denote constants to the right place. Hence we shift the valuations accordingly, and then prove that a formula is satisfied on an expanded model is satisfied under a valuation if and only if the shifted formula is satisfied under the shifted valuation. Also, we prove that 'taking the ultraproduct firstorder model commutes with the convertion from modal to a first-order model on certain formulas', in the sense that the resulting models satisfies the same firstorder formulas without function symbols. By putting these two results together, we prove Lemma 2 using the proof in Chang and Keisler [3] .

Note that the standard translation of any modal formula can only contain unary predicate symbols which correspond to propositional letters, one binary predicate symbol which corresponds to the relation, and no function symbols. A first-order formula which only uses these symbols is called an L 1 τ -formula. An invar4bisim (x : num) (:α) (:β) (φ : folform)

Because of the same problem we met when defining equivalence of formulas, the type parameters are necessary here. However, although it is possible to prove theorems for different types α and β in the above definition, in the theorems to come, we will only consider the case where α and β are the same.

The Van Benthem characterisation theorem says an L 1 τ formula with at most one free variable x is invariant under bisimulation precisely when it is equivalent to the standard translation of some modal formula at x. It is immediate from Proposition 1 that every such formula which is equivalent to a standard translation is invariant for bisimulation. We cannot prove it as an 'if and only if' statement, since according to the proofs in [1] , we can only prove the two directions separately as: 

which cannot be put together into a double implication. To see the reason: given an L 1 τ -formula φ with no more then one free variable, by the second theorem above, if φ is invariant under bisimulation for models with (num → α) → boolworlds, then φ is equivalent to a standard translation on a model with α-worlds. However, if we want to prove the converse of this statement, we need to start with the assumption that φ is equivalent to a standard translation on models with α-worlds, and prove that φ is invariant for bisimulation for models with (num → α) → bool-worlds. But by the first theorem above, we can only conclude φ is invariant for bisimulation for models of type α. The point is that it is not the fact that all our desired operations can be taken within a type. In particular, we cannot take ultraproducts of models and preserve cardinalities. The cardinality of the type universe of (num → α) → bool is too large to be embedded into α, so we cannot just fix the 'base type' to be α and get an 'if and only if' statement-we cannot derive φ is invariant for bisimulation for models with (num → α) → bool-worlds from the fact that φ is invariant for bisimulation for models with α-worlds. If we could quantify over types (as we could in a theorem prover based on dependent type theory), then we can could define 'invariant under bisimulation for models of every type', and hence prove the original statement of Van Benthem characterisation theorem.

For the proof of the two theorems above, the first one is immediate from Proposition 1, and the second one requires another critical lemma saying 'modal equivalence between two worlds implies bisimilarity of the two worlds when embedded in some other models'. More precisely, if two worlds w ∈ M W and v ∈ N W are modal equivalent, then we can find an ultrafilter U on J such that in ultrapower models of M and N on U respectively, there is a bisimulation between the worlds corresponding to w and v .

The proof of the above relies on Lemma 2.

To summarise, we have mechanised all of the results (appearing as propositions, lemmas and theorems) in the first two chapters in Blackburn et al. [ For each of the mechanised definitions and results, we write the statement in HOL to be as close as possible to the original statement in [1] . We believe that this makes it as easy as possible for people who are interested in mechanising other results in [1] to continue with our work as a starting point. The work on ultraproducts up to Loś's theorem is independent of our work on modal model theory, and should be generally useful in other model-theoretic applications.

We believe that we are the first to mechanise the bulk of the results in this paper. Of course, much work has been done in this and similar areas. For example, de Wind's thesis [7] is a notable early mechanisation of modal logic, mainly focusing on proving the validity of modal formulas via natural deduction. Of similar vintage is Harrison's mechanisation of foundational results about first order model theory [5] , in particular compactness. We used this mechanisation directly in our own work. A great deal of work has also been done in the mechanisation of first order proof theory, such as the recent pearl by Blanchette et al. [2] , showing completeness in elegant fashion.

The connections between modal logic and process algebra are well-understood and there has been a great deal of mechanised work on the operational theory of such (co-)algebraic systems, starting at least as far back as Nesi [6] . Our proof of the Hennessy-Milner theorem (Theorem 1) is a gesture in this direction, but Van Benthem's theorem is much deeper and uses bisimulations as a tool to understanding the connection between modal and first order logics, rather than as a connection to process algebras.

Mechanised work with ultrafilters began with Fleuriot's use of them to mechanise non-standard analysis [4] . We are unaware of any previous mechanised use of ultraproducts or ultrapowers.

Modal Logic

Unified classical logic completeness

Model Theory

A Combination of Geometry Theorem Proving and Nonstandard Analysis with Application to Newton's Principia

Formalizing basic first order model theory

Mechanising a modal logic for value-passing agents in HOL. Electron

Modal Logic in Coq