key: cord-0060381-i9rjkkc5
authors: Baldan, Paolo; Eggert, Richard; König, Barbara; Padoan, Tommaso
title: Fixpoint Theory – Upside Down
date: 2021-03-23
journal: Foundations of Software Science and Computation Structures
DOI: 10.1007/978-3-030-71995-1_4
sha: 54fd12ff26af5ca678f5f1d33eb5dfaca653813c
doc_id: 60381
cord_uid: i9rjkkc5

Knaster-Tarski’s theorem, characterising the greatest fix- point of a monotone function over a complete lattice as the largest post-fixpoint, naturally leads to the so-called coinduction proof principle for showing that some element is below the greatest fixpoint (e.g., for providing bisimilarity witnesses). The dual principle, used for showing that an element is above the least fixpoint, is related to inductive invariants. In this paper we provide proof rules which are similar in spirit but for showing that an element is above the greatest fixpoint or, dually, below the least fixpoint. The theory is developed for non-expansive monotone functions on suitable lattices of the form [Formula: see text] , where Y is a finite set and [Formula: see text] an MV-algebra, and it is based on the construction of (finitary) approximations of the original functions. We show that our theory applies to a wide range of examples, including termination probabilities, behavioural distances for probabilistic automata and bisimilarity. Moreover it allows us to determine original algorithms for solving simple stochastic games.

Fixpoints are ubiquitous in computer science as they allow to provide a meaning to inductive and coinductive definitions (see, e.g., [26, 23] ). A monotone function f : L → L over a complete lattice (L, ), by Knaster-Tarski's theorem [28] , admits a least fixpoint µf and greatest fixpoint νf which are characterised as the least pre-fixpoint and the greatest post-fixpoint, respectively. This immediately gives well-known proof principles for showing that a lattice element l ∈ L is below νf or above µf l f (l) l νf f (l) l µf l

On the other hand, showing that a given element l is above νf or below µf is more difficult. One can think of using the characterisation of least and largest fixpoints via Kleene's iteration. E.g., the largest fixpoint is the least element of the (possibly transfinite) descending chain obtained by iterating f from .

Then showing that f i ( ) l for some i, one concludes that νf l. This proof principle is related to the notion of ranking functions. However, this is a less satisfying notion of witness since f has to be applied i times, and this can be inefficient or unfeasible when i is an infinite ordinal.

The aim of this paper is to present an alternative proof rule for this purpose for functions over lattices of the form L = M Y where Y is a finite set and M is an MV-chain, i.e., a totally ordered complete lattice endowed with suitable operations of sum and complement. This allows us to capture several examples, ranging from ordinary relations, for dealing with bisimilarity, behavioural metrics, termination probabilities and simple stochastic games.

Assume f : M Y → M Y monotone and consider the question of proving that some fixpoint a : Y → M is the largest fixpoint νf . The idea is to show that there is no "slack" or "wiggle room" in the fixpoint a that would allow us to further increase it. This is done by associating with every a : Y → M a function f # a on 2 Y whose greatest fixpoint gives us the elements of Y where we have a potential for increasing a by adding a constant. If no such potential exists, i.e. νf # a is empty, we conclude that a is νf . A similar function f a # (specifying decrease instead of increase) exists for the case of least fixpoints. Note that the premise is νf a # = ∅, i.e. the witness remains coinductive. The proof rules are:

For applying the rule we compute a greatest fixpoint on 2 Y , which is finite, instead of working on the potentially infinite M Y . The rule does not work for all monotone functions f : M Y → M Y , but we show that whenever f is nonexpansive the rule is valid. Actually, it is not only sound, but also reversible, i.e., if a = νf then νf # a = ∅, providing an if-and-only-if characterisation. Quite interestingly, under the same assumptions on f , using a restricted function f * a , the rule can be used, more generally, when a is just a pre-fixpoint (f (a) a) and it allows to conclude that νf a. A dual result holds for postfixpoints in the case of least fixpoints.

f (a) a νf * a = ∅ νf a a f (a) νf a * = ∅ a µf As already mentioned, the theory above applies to many interesting scenarios: witnesses for non-bisimilarity, algorithms for simple stochastic games [11] and lower bounds for termination probabilities and behavioural metrics in the setting of probabilistic systems [1] and probabilistic automata [2] . In particular we were inspired by, and generalise, the self-closed relations of Fu [16] , also used in [2] .

Motivating Example. Consider a Markov chain (S, T, η) with a finite set of states S, where T ⊆ S are the terminal states and every state s ∈ S\T is associated with a probability distribution η(s) ∈ D(S). 3 Intuitively, η(s)(s ) denotes the probability of state s choosing s as its successor. Assume that, given a fixed state s ∈ S, we want to determine the termination probability of s, i.e. the probability of reaching any terminal state from s. As a concrete example, take the Markov chain given in Fig. 1 , where u is the only terminal state. The termination probability arises as the least fixpoint of a function T defined as in Fig. 1 . The values of µT are indicated in green (left value). Now consider the function t assigning to each state the termination probability written in red (right value). It is not difficult to see that t is another fixpoint of T , in which states y and z convince each other incorrectly that they terminate with probability 1, resulting in a vicious cycle that gives "wrong" results. We want to show that µT = t without knowing µT . Our idea is to compute the set of states that still has some "wiggle room", i.e., those states which could reduce their termination probability by δ if all their successors did the same. This definition has a coinductive flavour and it can be computed as a greatest fixpoint on the finite powerset 2 S of states, instead of on the infinite lattice S [0,1] .

We hence consider a function T t

dependent on t, defined as follows. Let [S] t be the set of all states s where t(s) > 0, i.e., a reduction is in principle possible. Then a state s ∈ [S] t is in T t # (S ) iff s ∈ T and for all s for which η(s)(s ) > 0 it holds that s ∈ S , i.e. all successors of s are in S .

The greatest fixpoint of T t # is {y, z}. The fact that it is not empty means that there is some "wiggle room", i.e., the value of t can be reduced on the elements {y, z} and thus t cannot be the least fixpoint of f . Moreover, the intuition that t can be improved on {y, z} can be made precise, leading to the possibility of performing the improvement and search for the least fixpoint from there.

Contributions. In the paper we formalise the theory outlined above, showing that the proof rules work for non-expansive monotone functions f on lattices of the form M Y , where Y is a finite set and M an MV-algebra ( §3 and §4). Additionally, given a decomposition of f we show how to obtain the corresponding approximation compositionally ( §5). Then, in order to show that our approach covers a wide range of examples and allows us to derive original algorithms, we discuss various applications: termination probability, behavioural distances for probabilistic automata and bisimilarity ( §6) and simple stochastic games ( §7).

Proofs and further material can be found in the full version of the paper [5] .

In this section, we review some basic notions used in the paper. A preordered or partially ordered set (P, ) is often denoted simply as P , omitting the order relation. Given x, y ∈ P , with x y, we denote by [x, y] the interval {z ∈ P | x z y}. The join and the meet of a subset X ⊆ P (if they exist) are denoted X and X, respectively.

A complete lattice is a partially ordered set (L, ) such that each subset X ⊆ L admits a join X and a meet X. A complete lattice (L, ) always has a least element ⊥ = ∅ and a greatest element = ∅.

A function f : L → L is monotone if for all l, l ∈ L, if l l then f (l) f (l ). By Knaster-Tarski's theorem [28, Thm. 1] , any monotone function on a complete lattice has a least and a greatest fixpoint, denoted respectively µf and νf , characterised as the meet of all pre-fixpoints respectively the join of all post-fixpoints: µf = {l | f (l) l} and νf = {l | l f (l)}.

Let (C, ), (A, ≤) be complete lattices. A Galois connection is a pair of monotone functions α, γ such that α : C → A, γ : A → C and for all a ∈ A and c ∈ C: α(c) ≤ a ⇐⇒ c γ(a). Equivalently, for all a ∈ A and c ∈ C, (i) c γ(α(c)) and (ii) α(γ(a)) ≤ a. In this case we will write α, γ : C → A. For a Galois connection α, γ : C → A, the function α is called the left (or lower) adjoint and γ the right (or upper) adjoint.

Galois connections are at the heart of abstract interpretation [13, 14] . In particular, when α, γ is a Galois connection, given f C :

If equality holds, i.e., f C • γ = γ • f A , then greatest fixpoints are preserved along the connection, i.e., νf C = γ(νf A ).

Given a set Y and a complete lattice L, the set of functions L Y = {f | f : Y → L}, endowed with pointwise order, i.e., for a, b ∈ L Y , a b if a(y) b(y) for all y ∈ Y , is a complete lattice.

In the paper we will mostly work with lattices of the kind M Y where M is a special kind of lattice with a rich algebraic structure, i.e. an MV-algebra [21] . 

We denote 1 = 0, multiplication x ⊗ y = x ⊕ y and subtraction x y = x ⊗ y.

Definition 2 (natural order). Let M = (M, ⊕, 0, (·)) be an MV-algebra. The natural order on M is defined, for x, y ∈ M , by x y if x ⊕ z = y for some z ∈ M . When is total M is called an MV-chain.

The natural order gives an MV-algebra a lattice structure where ⊥ = 0, = 1, x y = (x y) ⊕ y and x y = x y = x ⊗ (x ⊕ y). We call the MV-algebra complete, if it is a complete lattice, which is not true in general, e.g., ([0, 1] ∩ Q, ≤).

Example 3. A prototypical example of an MV-algebra is ([0, 1], ⊕, 0, (·)) where x ⊕ y = min{x + y, 1} and x = 1 − x for x, y ∈ [0, 1]. This means that x ⊗ y = max{x + y − 1, 0} and x y = max{0, x − y} (truncated subtraction). The operators ⊕ and ⊗ are also known as strong disjunction and conjunction in Lukasiewicz logic [22] . The natural order is ≤ (less or equal) on the reals.

Another example is ({0, . . . , k}, ⊕, 0, (·)) where n ⊕ m = min{n + m, k} and n = k − n for n, m ∈ {0, . . . , k}. Both MV-algebras are complete and MV-chains.

Boolean algebras (with disjunction and complement) also form MV-algebras that are complete, but in general not MV-chains.

MV-algebras are the algebraic semantics of Lukasiewicz logic. They can be shown to correspond to intervals of the kind [0, u] in suitable groups, i.e., abelian lattice-ordered groups with a strong unit u [21] .

As mentioned in the introduction, our interest is for fixpoints of monotone func-

where M is an MV-chain and Y is a finite set. We will see that for non-expansive functions we can over-approximate the sets of points in which a given a ∈ M Y can be increased in a way that is preserved by the application of f . This will be the core of the proof rules outlined earlier.

Non-expansive Functions on MV-Algebras. For defining non-expansiveness it is convenient to introduce a norm.

Definition 4 (norm). Let M be an MV-chain and let Y be a finite set. Given a ∈ M Y we define its norm as ||a|| = max{a(y) | y ∈ Y }.

Given a finite set Y we extend ⊕ and ⊗ to M Y pointwise. Given Y ⊆ Y and δ ∈ M, we write δ Y for the function defined by δ Y (y) = δ if y ∈ Y and δ Y (y) = 0, otherwise. Whenever this does not generate confusion, we write δ instead of δ Y . It can be seen that ||·|| has the properties of a norm, i.e., for all a, b ∈ M Y and δ ∈ M, it holds that (1) ||a ⊕ b|| ||a|| ⊕ ||b||, (2) ||δ ⊗ a|| = δ ⊗ ||a|| and and ||a|| = 0 implies that a is the constant 0. Moreover, it is clearly monotonic, i.e., if a b then ||a|| ||b||.

We next introduce non-expansiveness. Despite the fact that we will finally be interested in endo-functions f : M Y → M Y , in order to allow for a compositional reasoning we work with functions where domain and codomain can be different.

Note that (a, b) → ||a b|| is the supremum lifting of a directed version of Chang's distance [21] . It is easy to see that all non-expansive functions on MVchains are monotone.

Approximating the Propagation of Increases. Let f : M Y → M Z be a monotone function and take a, b ∈ M Y with a b. We are interested in the difference b(y) a(y) for some y ∈ Y and on how the application of f "propagates" this increase. The reason is that, understanding that no increase can be propagated will be crucial to establish when a fixpoint of a non-expansive function f is actually the largest one, and, more generally, when a (pre-)fixpoint of f is above the largest fixpoint.

In order to formalise the above intuition, we rely on tools from abstract interpretation. In particular, the following pair of functions, which, under a suitable condition, form a Galois connection, will play a major role. The left adjoint α a,δ takes as input a set Y and, for y ∈ Y , it increases the values a(y) by δ, while the right adjoint γ a,δ takes as input a function b ∈ M Y , b ∈ [a, a ⊕ δ] and checks for which parameters y ∈ Y the value b(y) exceeds a(y) by δ.

We also define [Y ] a , the subset of elements in Y where a(y) is not 1 and thus there is a potential to increase, and δ a , which gives us the minimal such increase.

Definition 6 (functions to sets, and vice versa). Let M be an MV-algebra and let Y be a finite set. Define the set

For 0 δ ∈ M we consider the functions α a,δ :

When δ is sufficiently small, the pair α a,δ , γ a,δ is a Galois connection.

Lemma 7 (Galois connection). Let M be an MV-algebra and Y be a finite set. For 0 = δ δ a , the pair α a,δ , γ a,δ :

Whenever f is non-expansive, it is easy to see that it restricts to a function

As mentioned before, a crucial result shows that for all non-expansive functions, under the assumption that Y, Z are finite and the order on M is total, we can suitably approximate the propagation of increases. In order to state this result, a useful tool is a notion of approximation of a function.

Definition 8 ((δ, a)-approximation). Let M be an MV-chain, let Y , Z be finite sets and let f :

, the points to which f propagates an increase of the function a with value δ on the subset Y .

We first show that f # a,δ is antitone in the parameter δ, a non-trivial result. Lemma 9 (anti-monotonicity). Let M be an MV-chain, let Y , Z be finite sets, let f : M Y → M Z be a non-expansive function and let a ∈ M Y . For

Since f # a,δ increases when δ decreases and there are finitely many such functions, there must be a value ι f a such that all functions f # a,δ for 0 δ ι f a are equal. This function is denoted by f # a and is called the a-approximation of f . We next show that indeed, for all non-expansive functions, the a-approximation properly approximates the propagation of increases.

Theorem 10 (approximation of non-expansive functions). Let M be a complete MV-chain, let Y, Z be finite sets and let f : M Y → M Z be a nonexpansive function. Then there exists ι f a ∈ M, the largest value below or equal to δ a such that f # a,δ = f # a,δ for all 0 δ, δ ι f a . We denote this function by f # a and call it the a-approximation of f . Then for all 0 δ ∈ M:

Note that if Y = Z and a is a fixpoint of f , i.e., a = f (a), condition (a) above corresponds exactly to soundness in the sense of abstract interpretation [13] , while condition (b) corresponds to (γ-)completeness (see also §2).

In this section we formalise the proof technique outlined in the introduction for showing that a fixpoint is the largest and, more generally, for checking overapproximations of greatest fixpoints of non-expansive functions.

Consider a monotone function f : M Y → M Y for some finite set Y . We first focus on the problem of establishing whether some given fixpoint a of f coincides with νf (without explicitly knowing νf ), and, in case it does not, finding an "improvement", i.e., a post-fixpoint of f , larger than a. Observe that when a is a fixpoint, [Y ] a = [Y ] f (a) and thus the a-approximation of f (Thm. 10) is

We have the following result, which relies on the fact that due to Thm. 10 γ a,δ preserves fixpoints (of f and f # a ).

Theorem 11 (soundness and completeness for fixpoints). Let M be a complete MV-chain, Y a finite set and f :

Whenever a is a fixpoint, but not yet the largest fixpoint of f , we can increase it and obtain a post-fixpoint.

a fixpoint of f , and let f # a be the corresponding a-approximation and ι f a as in Thm. 10 . Then α a,ι f a (νf # a ) = a ⊕ (ι f a ) νf # a is a post-fixpoint of f . Using these results one can perform an alternative fixpoint iteration where we iterate to the largest fixpoint from below: start with a post-fixpoint a 0 f (a 0 ) (which is clearly below νf ) and obtain, by (possibly transfinite) iteration, an ascending chain that converges to a, the least fixpoint above a 0 . Now check with Thm. 11 whether Y = νf # a = ∅. If yes, we have reached νf = a. If not,

is again a post-fixpoint (cf. Lem. 12) and we continue this procedure until -for some ordinal -we reach the largest fixpoint νf , for which we have νf # νf = ∅. Interestingly, the soundness result in Thm. 11 can be generalised to the case in which a is a pre-fixpoint instead of a fixpoint. In this case, the a-approximation for a function f :

where domain and codomain are different, hence it would not be meaningful to look for fixpoints. However, as explained below, it can be restricted to an endofunction.

Theorem 13 (soundness for pre-fixpoints). Let M be a complete MV-chain, Y a finite set and f :

Roughly, the intuition for the above result is the following: the value of f (a) on some y might or might not depend "circularly" on the value of a on y itself. In a purely inductive setting, without such circular dependencies, µf = νf and hence a being a pre-fixpoint means that we over-approximate νf . However, we might have vicious cycles, as explained in the introduction, that destroy the over-approximation since the values are too low. Now, since we restrict to nonexpansive functions, it must be the case that there is a cycle, such that all elements on this cycle are points where a and f (a) coincide. It is hence sufficient to check whether a given pre-fixpoint could be increased on its subpart which corresponds to a fixpoint, i.e., the idea is to restrict to [Y ] a=f (a) . We detect such situations by looking for "wiggle room" as for fixpoints.

Completeness does not generalise to pre-fixpoints, i.e., it is not true that if a is a pre-fixpoint of f and νf a then νf * a = ∅. A pre-fixpoint might contain slack even though it is above the greatest fixpoint. A counterexample is in Ex. 25.

The Dual View for Least Fixpoints. The theory developed so far can be easily dualised to check under-approximations of least fixpoints. Given a complete MValgebra M = (M, ⊕, 0, (·)) and a monotone function f : M Y → M Y , in order to show that a post-fixpoint a ∈ M Y satisfies a µf , we can in fact simply work in the dual MV-algebra, M op = (M, , ⊗, (·), 1). It is convenient to formulate the conditions using and the original order.

We next outline the dualised setting. The notation for the dual case is obtained from that of the original (primal) case, exchanging subscripts and superscripts.

Given a ∈ M Y , define [Y ] a = {y ∈ Y | a(y) = 0} and δ a = min{a(y) | y ∈ [Y ] a }. For θ ∈ M, we consider the pair of functions α a,θ , γ a,θ :

A function f : M Y → M Z is non-expansive in the dual MV-algebra when it is in the primal one. Its approximation in the sense of Thm. 10 is denoted f a # . 

Then the dualisations of Thm. 11 and 13 hold, i.e., if a is a fixpoint of f , then νf a # = ∅ iff µf = a, and whenever a is a post-fixpoint, νf a * = ∅ implies a µf .

Given a non-expansive function f and a (pre/post-)fixpoint a, it is often nontrivial to determine the corresponding approximations. However, non-expansive functions enjoy good closure properties (closure under composition, and closure under disjoint union) and we will see that the same holds for the corresponding approximations. Furthermore it turns out that the functions needed in the applications can be obtained from just a few templates. This gives us a toolbox for assembling approximations with relative ease.

Theorem 14. All basic functions listed in Table 1 are non-expansive. Furthermore non-expansive functions are closed under composition and disjoint union. The approximations are the ones listed in the third column of the table.

We start by making the example from the introduction ( §1) more formal. Consider a Markov chain (S, T, η), as defined in the introduction (Fig. 1) , where we restrict the codomain of η : S\T → D(S) to D ⊆ D(S), where D is finite (to ensure that all involved sets are finite). Furthermore let T : [0, 1] S → [0, 1] S be the function from the introduction whose least fixpoint µT assigns to each state its termination probability. From this representation and Thm. 14 it is obvious that T is non-expansive.

It is well-known that the function T can be tweaked in such a way that it has a unique fixpoint, coinciding with µT , by determining all states which cannot reach a terminal state and setting their value to zero [3] . Hence fixpoint iteration from above does not bring us any added value here. It does however make sense to use the proof rule in order to guarantee lower bounds via post-fixpoints.

Furthermore, termination probability is a special case of the considerably more complex stochastic games that will be studied in §7, where the trick of modifying the function is not applicable.

Before we start discussing probabilistic automata, we first consider the Hausdorff and the Kantorovich lifting and the corresponding approximations.

Hausdorff Lifting. Given a metric on a set X, the Hausdorff metric is obtained by lifting the original metric to 2 X . Here we define this for general distance functions on M, not restricting to metrics. In particular the Hausdorff lifting is given by a function H :

An alternative characterisation due to Mémoli [20] , also in [4] , is more convenient for our purposes. If we let u : 2 X×X → 2 X × 2 X with u(C) = (π 1 [C], π 2 [C]), where π 1 , π 2 are the projections π i : X × X → X and π i [C] = {π i (c) | c ∈ C}. Then H(d)(X 1 , X 2 ) = min{max (x1,x2)∈C d(x 1 , x 2 ) | C ⊆ X × X ∧ u(C) = (X 1 , X 2 )}. Relying on this, we can obtain the result below, from which we deduce that H is non-expansive and construct its approximation as the composition of the corresponding functions from Table 1 .

Kantorovich Lifting. The Kantorovich (also known as Wasserstein) lifting converts a metric on X to a metric on probability distributions over X. As for the Hausdorff lifting, we lift distance functions that are not necessarily metrics.

Furthermore, in order to ensure finiteness of all the sets involved, we restrict to D ⊆ D(X), some finite set of probability distributions over X. A coupling of p, q ∈ D is a probability distribution c ∈ D(X × X) whose left and right marginals are p, q, i.e., p(

. The set of all couplings of p, q, denoted by Ω(p, q), forms a polytope with finitely many vertices [24] . The set of all polytope vertices that are obtained by coupling any p, q ∈ D is also finite and is denoted by VP D ⊆ D(X × X).

The Kantorovich lifting is given by K :

The coupling c can be interpreted as the optimal transport plan to move goods from suppliers to customers [30] . Again there is an alternative characterisation, which shows non-expansiveness of K:

Probabilistic Automata. We now compare our approach with [2] , which describes the first method for computing behavioural distances for probabilistic automata. Although the behavioural distance arises as a least fixpoint, it is in fact better, even the only known method, to iterate from above, in order to reach this least fixpoint. This is done by guessing and improving couplings, similar to strategy iteration discussed later in §7. A major complication, faced in [2] , is that the procedure can get stuck at a fixpoint which is not the least and one has to determine that this is the case and decrease the current candidate. In fact this paper was our inspiration to generalise this technique to a more general setting.

A probabilistic automaton is a tuple A = (S, L, η, ), where S is a non-empty finite set of states, L is a finite set of labels, η : S → 2 D(S) assigns finite sets of probability distributions to states and : S → L is a labelling function. (In the following we again replace D(S) by a finite subset D.)

The probabilistic bisimilarity pseudometrics is the least fixpoint of the func- In order to check whether d = µf , [2] adapts the notion of a self-closed relation from [16] .

whenever s M t, then -(s) = (t) and d(s, t) > 0, if p ∈ η(s) and d(s, t) = min q ∈η(t) K(d)(p, q ), then there exists q ∈ η(t) and

c ∈ Ω(p, q) such that d(s, t) = u,v∈S d(u, v) · c(u, v) and supp(c) ⊆ M , if q ∈ η(t) and d(s, t) = min p ∈η(s) K(d)(p , q), then there exists p ∈ η(s) and c ∈ Ω(p, q) such that d(s, t) = u,v∈S d(u, v) · c(u, v) and supp(c) ⊆ M .

The largest self-closed relation, denoted by ≈ d is empty if and only if d = µf [2] . We now investigate the relation between self-closed relations and postfixpoints of approximations. For this we will first show that M can be composed from non-expansive functions, which proves that it is indeed non-expansive. Furthermore, this decomposition will help in the comparison.

Lemma 20. The fixpoint function M characterizing probabilistic bisimilarity pseudometrics can be written as:

Hence M is a composition of non-expansive functions and thus non-expansive itself. We do not spell out M d # explicitly, but instead show how it is related to self-closed relations.

Then M is a self-closed relation wrt. d if and only if M ⊆ [S × S] d and M is a post-fixpoint of M d # .

In order to define standard bisimilarity we use a variant G of the Hausdorff lifting H from §6.2 where max and min are swapped and which we denote by G. Now we can define the fixpoint function for bisimilarity and its corresponding approximation. For simplicity we consider unlabelled transition systems, but it would be straightforward to handle labelled transitions.

Let X be a finite set of states and η : X → 2 X a function that assigns a set of successors η(x) to a state x ∈ X. Since we are interested in the greatest fixpoint, we are working in the primal sense. Bisimulation relations are represented by their characteristic functions d : X × X → {0, 1}, in fact the corresponding relation can be obtained by taking

Lemma 23. Let d : X × X → {0, 1}. The approximation for the bisimilarity function B in the primal sense is B # d :

We conclude this section by discussing how this view on bisimilarity can be useful: first, it again opens up the possibility to compute bisimilarity -a greatest fixpoint -by iterating from below, through smaller fixpoints. This could potentially be useful if it is easy to compute the least fixpoint of B inductively and continue from there.

Furthermore, we obtain a technique for witnessing non-bisimilarity of states. While this can also be done by exhibiting a distinguishing modal formula [17, 9] or by a winning strategy for the spoiler in the bisimulation game [27] , to our knowledge there is no known method that does this directly, based on the definition of bisimilarity.

With our technique however, we can witness non-bisimilarity of two states x 1 , x 2 ∈ X by presenting a pre-fixpoint d (i.e., B(d) ≤ d) such that d(x 1 , x 2 ) = 0 (equivalent to (x 1 , x 2 ) ∈ [X × X] d ) and νB # d = ∅, since this implies νB(x 1 , x 2 ) ≤ d(x 1 , x 2 ) = 0 by our proof rule.

There are two issues to discuss: first, how can we characterise a pre-fixpoint of B (which is quite unusual, since bisimulations are post-fixpoints)? In fact, the condition B(d) ≤ d can be rewritten to: for all (x 1 , x 2 ) ∈ [X × X] d there exists y 1 ∈ η(x 1 ) such that for all y 2 ∈ η(x 2 ) we have (y 1 , y 2 ) ∈ [X × X] d (or vice versa). Second, at first sight it does not seem as if we gained anything since we still have to do a fixpoint computation on relations. However, the carrier set is [X × X] d , i.e., a set of non-bisimilarity witnesses and this set can be small even though X might be large.

Example 24. We consider the transition system depicted below.

Our aim is to construct a witness showing that x, u are not bisimilar. This witness is a function d : X × X → {0, 1} with d(x, u) = 0 = d(y, u) and for all other pairs the value is 1.

x y u u) , (y, u)} and it is easy to check that d is a pre-fixpoint of B and that νB * d = ∅: we iterate over {(x, u), (y, u)} and first remove (y, u) (since y has no successors) and then (x, u). This implies that νB ≤ d and hence νB(x, u) = 0, which means that x, u are not bisimilar. Example 25. We modify Ex. 24 and consider a function d where d(x, u) = 0 and all other values are 1. Again d is a pre-fixpoint of B and νB ≤ d (since only reflexive pairs are in the bisimilarity). However νB * d = ∅, since {(x, u)} is a post-fixpoint. This is a counterexample to completeness discussed after Thm. 13 .

Intuively speaking, the states y, u over-approximate and claim that they are bisimilar, although they are not. (This is permissible for a pre-fixpoint.) This tricks x, u into thinking that there is some wiggle room and that one can increase the value of (x, u). This is true, but only because of the limited, local view, since the "true" value of (y, u) is 0.

Introduction to Simple Stochastic Games. In this section we show how our techniques can be applied to simple stochastic games [11, 10] . A simple stochastic game is a state-based two-player game where the two players, Min and Max, each own a subset of states they control, for which they can choose the successor. The system also contains sink states with an assigned payoff and averaging states which randomly choose their successor based on a given probability distribution. The goal of Min is to minimise and the goal of Max to maximise the payoff.

Simple stochastic games are an important type of games that subsume parity games and the computation of behavioural distances for probabilistic automata (cf. §6.2, [2] ). The associated decision problem is known to lie in NP ∩ coNP, but it is an open question whether it is contained in P. There are known randomised subexponential algorithms [7] .

It has been shown that it is sufficient to consider positional strategies, i.e., strategies where the choice of the player is only dependent on the current state. The expected payoffs for each state form a so-called value vector and can be obtained as the least solution of a fixpoint equation (see below).

A simple stochastic game is given by a finite set V of nodes, partitioned into MIN , MAX , AV (average) and SINK , and the following data: η min : MIN → 2 V , η max : MAX → 2 V (successor functions for Min and Max nodes), η av : AV → D (probability distributions, where D ⊆ D(V ) finite) and w : SINK → [0, 1] (weights of sink nodes).

The fixpoint function V :

The least fixpoint of V specifies the average payoff for all nodes when Min and Max play optimally. In an infinite game the payoff is 0. In order to avoid infinite games and guarantee uniqueness of the fixpoint, many authors [18, 10, 29] restrict to stopping games, which are guaranteed to terminate for every pair of Min/Maxstrategies. Here we deal with general games where more than one fixpoint may exist. Such a scenario has been studied in [19] , which considers value iteration to under-and over-approximate the value vector. The over-approximation faces challenges with cyclic dependencies, similar to the vicious cycles described earlier. Here we focus on strategy iteration, which is usually less efficient than value iteration, but yields a precise result instead of approximating it.

Example 26. We consider the game depicted below. Here min is a Min node with η min (min) = {1, av}, max is a Max node with η max (max) = {ε, av}, 1 is a sink node with payoff 1, ε is a sink node with some small payoff ε ∈ (0, 1) and av is an average node which transitions to both min and max with probability 1 2 . Min should choose av as successor since a payoff of 1 is bad for Min. Given this choice of Min, Max should not declare av as successor since this would create an infinite play and hence the payoff is 0. Therefore Max has to choose ε and be content with a payoff of ε, which is achieved from all nodes different from 1. In order to be able to determine the approximation of V and to apply our techniques, we consider the following equivalent definition.

As a composition of non-expansive functions, V is non-expansive as well. Since we are interested in the least fixpoint we work in the dual sense and obtain the following approximation, which intuitively says: we can decrease a value at node v by a constant only if, in the case of a Min node, we decrease the value of one successor where the minimum is reached, in the case of a Max node, we decrease the values of all successors where the maximum is reached, and in the case of an average node, we decrease the values of all successors. 

Strategy Iteration from Above and Below. We describe two algorithms based on strategy iteration, first introduced by Hoffman and Karp in [18] , that are novel, as far as we know. The first iterates to the least fixpoint from above and uses the techniques described in §4. The second iterates from below: the role of our results is not directly visible in the code of the algorithm, but its non-trivial correctness proof is based on the proof rule introduced earlier. We first recap the underlying notions: a Min-strategy is a mapping τ : MIN → V such that τ (v) ∈ η min (v) for every v ∈ MIN . With such a strategy, Min decides to always leave a node v via τ (v). Analogously σ : MAX → V fixes a Max-strategy. Fixing a strategy for either player induces a modified value function. If τ is a Min-strategy, we obtain V τ which is defined exactly as V but for v ∈ MIN where we set V τ (a)(v) = a(τ (v)). Analogously, for σ a Max-strategy, V σ is obtained by setting V σ (a)(v) = a(σ(v)) when v ∈ MAX . If both players fix their strategies, the game reduces to a Markov chain.

In order to describe our algorithms we also need the notion of a switch. Assume that τ is a Min-strategy and let a be a (pre-)fixpoint of V τ . Min can now potentially improve her strategy for nodes v ∈ MIN where min v ∈ηmin(v) a(v ) < a(τ (v)), called switch nodes. This results in a Min-strategy τ = sw min (τ, a), where 5 τ (v) = arg min v ∈ηmin(v) a (i) (v ) for a switch node v and τ , τ agree otherwise. Also, sw max (σ, a) is defined analogously for Max strategies. Now strategy iteration from above works as described in Figure 2a . The computation of µV τ (i) in the second step intuitively means that Max chooses his best answering strategy and we compute the least fixpoint based on this answering strategy. At some point no further switches are possible and we have reached a fixpoint a, which need not yet be the least fixpoint. Hence we use the techniques from §4 to decrease a and obtain a new pre-fixpoint a (i+1) , from which we can continue. The correctness of this procedure partially follows from Thm. 11 and Lem. 12, however we also need to show the following: first, we can compute a (i) = µV τ (i) efficiently by solving a linear program (cf. Lem. 29) by adapting [11] . Second, the chain of the a (i) decreases, which means that the algorithm will eventually terminate (cf. Thm. 30).

Strategy iteration from below is given in Figure 2b . At first sight, the algorithm looks simpler than strategy iteration from above, since we do not have to check whether we have already reached νV, reduce and continue from there. However, in this case the computation of µV σ (i) via a linear program is more involved (cf. Lem. 29), since we have to pre-compute (via greatest fixpoint iteration over 2 V ) the nodes where Min can force a cycle based on the current strategy of Max, thus obtaining payoff 0.

This algorithm does not directly use our technique but we can use our proof rules to prove the correctness of the algorithm (Thm. 30). In particular, the proof that the sequence a (i) increases is quite involved: we have to show that a (i) = µV σ (i) ≤ µV σ (i+1) = a (i+1) . We prove this, using our proof rules, by showing that a (i) is below the least fixpoint of V σ (i+1) .

The algorithm generalises strategy iteration by Hoffman and Karp [18] . Note that we cannot simply adapt their proof, since we do not assume that the game is stopping, which is a crucial ingredient.

Lemma 29. The least fixpoints of V τ and V σ can be determined by solving linear programs.

Theorem 30. Strategy iteration from above and below both terminate and compute the least fixpoint of V.

Example 31. Ex. 26 is well suited to explain our two algorithms.

Starting with strategy iteration from above, we may guess τ (0) (min) = 1. In this case, Max would choose av as successor and we would reach a fixpoint, where each node except for ε is associated with a payoff of 1. Next, our algorithm would detect the vicious cycle formed by min, av and max. We can reduce the values in this vicious cycle and reach the correct payoff values for each node.

For strategy iteration from below assume that σ (0) (max) = av. Given this strategy of Max, Min can force the play to stay in a cycle formed by min, av and max. Thus, the payoff achieved by the Max strategy σ (0) and an optimal play by Min would be 0 for each of these nodes. In the next iteration Max switches and chooses ε as successor, i.e. σ (1) (max) = ε, which results in the correct values.

We implemented strategy iteration from above and below and classical Kleene iteration in MATLAB. In Kleene iteration we terminate with a tolerance of 10 −14 , i.e., we stop if the change from one iteration to the next is below this bound. We tested the algorithms on random stochastic games and found that Kleene iteration is always the fastest, but only converges and it is known that the rate of convergence can be exponentially slow [10] . Strategy iteration from below is usually slightly faster than strategy iteration from above. More details can be found in the full version [5] .

It is well-known that several computations in the context of system verification can be performed by various forms of fixpoint iteration and it is worthwhile to study such methods at a high level of abstraction, typically in the setting of complete lattices and monotone functions. Going beyond the classical results by Tarski [28] , combination of fixpoint iteration with approximations [14, 6] and with up-to techniques [25] has proven to be successful. Here we treated a more specific setting, where the carrier set consists of functions from a finite set into an MV-chain and the fixpoint functions are non-expansive (and hence monotone), and introduced a novel technique to obtain upper bounds for greatest and lower bounds for least fixpoints, including associated algorithms. Such techniques are widely applicable to a wide range of examples and so far they have been studied only in quite specific scenarios, such as in [2, 16, 19] .

In the future we plan to lift some of the restrictions of our approach. First, an extension to an infinite domain Y would of course be desirable, but since several of our results currently depend on finiteness, such a generalisation does not seem to be easy. Another restriction, to total orders, seems easier to lift: in particular, if the partially ordered MV-algebraM is of the form M I where I is a finite index set and M an MV-chain. (E.g., finite Boolean algebras are of this type.) Then our function space isM Y = M I Y ∼ = M Y ×I and we have reduced to the setting presented in this paper. This will allow us to handle featured transition systems [12] where transitions are equipped with boolean formulas. We also plan to determine the largest possible increase that can be added to a fixpoint that is not yet the greatest fixpoint in order to maximally speed up fixpoint iteration from below (this might be larger than ι f a ). There are several other application examples that did not fit into this paper, but that can also be handled by our approach: for instance behavioural distances for metric transition systems [15] and other types of systems [4] . We also plan to investigate other types of games, such as energy games [8] . While here we introduced strategy iteration techniques for simple stochastic games, we also want to check whether we can provide an improvement to value iteration techniques, combining our approach with [19] .

We also plan to study whether some examples can be handled with other types of Galois connections: here we used an additive variant, but looking at multiplicative variants (multiplication by a constant factor) might also be fruitful.

Guess a Min-strategy τ (0)

a (i) := µV τ (i)

τ (i+1) := sw min(τ (i) , a (i) )

If τ (i+1) = τ (i) i := i + 1 then goto 2

Otherwise set a (i+1) := a − (ι a V ) V , τ (i+2) := sw min(τ (i) , a (i+1) ), i := i+2

Guess a Max-strategy σ (0)

a (i) := µV σ (i)

σ (i+1) := sw max(σ (i) , a (i) )

If σ (i+1) = σ (i) set i := i+1 and goto 2. Otherwise stop and return a (i)

On-the-fly exact computation of bisimilarity distances

Computing probabilistic bisimilarity distances for probabilistic automata

Principles of Model Checking

Coalgebraic behavioral metrics

Fixpoint theory -upside down

Abstraction, up-to techniques and games for systems of fixpoint equations

Combinatorial structure and randomized subexponential algorithms for infinite games

Faster algorithms for mean-payoff games

On automatically explaining bisimulation inequivalence

On algorithms for simple stochastic games

The complexity of stochastic games

Simulation-based abstractions for software product-line model checking

Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints

Temporal abstract interpretation

Linear and branching system metrics

Computing game metrics on Markov decision processes

Algebraic laws for nondeterminism and concurrency

On nonterminating stochastic games

Value iteration for simple stochastic games: Stopping criterion and learning algorithm

Gromov-Wasserstein distances and the metric approach to object matching

MV-algebras. A short tutorial

Advanced Lukasiewicz calculus and MV-algebras

Computational optimal transport

Complete lattices and up-to techniques

Introduction to Bisimulation and Coinduction

Bisimulation, model checking and other games. Notes for Mathfit instructional meeting on games and computation

A lattice-theoretical theorem and its applications

On strategy improvement algorithms for simple stochastic games

Optimal Transport -Old and New

which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use

Acknowledgements: We are grateful to Ichiro Hasuo for making us aware of stochastic games as application domain. Furthermore we would like to thank Matthias Kuntz and Timo Matt for their help with experiments.