key: cord-0046618-fe9xv1u4 authors: Bhayat, Ahmed; Reger, Giles title: A Knuth-Bendix-Like Ordering for Orienting Combinator Equations date: 2020-05-30 journal: Automated Reasoning DOI: 10.1007/978-3-030-51074-9_15 sha: 356146c613941a3865e24cb22fb16a3156ddd6f7 doc_id: 46618 cord_uid: fe9xv1u4 We extend the graceful higher-order basic Knuth-Bendix order (KBO) of Becker et al. to an ordering that orients combinator equations left-to-right. The resultant ordering is highly suited to parameterising the first-order superposition calculus when dealing with the theory of higher-order logic, as it prevents inferences between the combinator axioms. We prove a number of desirable properties about the ordering including it having the subterm property for ground terms, being transitive and being well-founded. The ordering fails to be a reduction ordering as it lacks compatibility with certain contexts. We provide an intuition of why this need not be an obstacle when using it to parameterise superposition. There exists a wide range of methods for automated theorem proving in higherorder logic. Some provers such as AgsyHOL [17] , Satallax [10] and Leo-II [4] implement dedicated higher-order proof calculi. A common approach, followed by the Leo-III prover [21] , is to use a co-operative architecture with a dedicated higher-order prover working in conjunction with a first-order prover. It has long been part of theorem proving folklore that sound and complete translations from higher-order to first-order logic exist. Kerber [15] proves this result for a higher-order logic that does not assume comprehension axioms (otherwise known as applicative first-order logic). Thus, translating higher-order problems to firstorder logic and running first-order provers on the translations is another method of automated higher-order theorem proving. Variations of this method are widely utilised by interactive theorem provers and their hammers such as Sledgehammer [18] and the CoqHammer [11] . Almost all translations to first-order logic translate λ-expressions using combinators. It is well known that the set of combinators S, K and I is sufficient to translate any λ-expression. For purposes of completeness, these combinators must be axiomatised: S τ 1 , τ 2 , τ 3 x y z = x z (y z), K τ 1 , τ 2 x y = x and I τ x = x. If translating to a monomorphic logic a finite set of axioms cannot achieve completeness. However, till now, translation based methods have proven disappointing and only achieved decent results with interactive theorem provers when the problems are first-order or nearly first-order [22] . One major reason for this is that inferences between combinator axioms can be hugely explosive. A common first-order proof calculus is superposition [19] . Consider a superposition inference from the K axiom onto the right-hand of the S axiom. The result is S K y z = z. There is little to restrict such inferences. Superposition is parameterised by a simplification ordering and inferences are only carried out on the larger side of literals with respect to this ordering. Inferences are not carried out at variables. Consider the S-, Kand I-axioms given above. There can clearly be no unifiers between a subterm of the left side of one axiom and the left side of another except at a variable. Thus, if a simplification ordering exists that orients the axioms left-to-right, inferences amongst the axioms would be impossible. Currently, no such simplification ordering is known to exist and the authors suspect that no such ordering can exist. Whilst there is a large body of work on higher-order orderings, all either lack some property required for them to be simplification orderings or are unsuitable for orienting the combinator axioms. Jouannaud and Rubio introduced a higher-order version of the recursive path order called HORPO [14] . HORPO is compatible with β-reduction which suggests that without much difficulty it could be modified to be compatible with weak reduction. However, the ordering does not enjoy the subterm property, nor is it transitive. Likewise, is the case for orderings based on HORPO such as the computability path ordering [8] and the iterative HOIPO of Kop and Van Raamsdonk [16] . More recently, a pair of orderings for λ-free higher-order terms have been developed [2, 7] . These orderings lack a specific monotonicity property, but this does not prevent their use in superposition [3] . However, neither ordering orients combinator axioms directly. We investigate an extension of the graceful higher-order basic KBO > hb introduced by Becker et al. [2] . Our new ordering, > ski , orients combinator equations left-to-right. Thus, if it is used to parameterise a superposition calculus, there can be no inferences among the axioms. The > ski ordering lacks full compatibility with contexts which is normally a requirement for an ordering to parameterise superposition. In particular, the ordering is not compatible with the so-called unstable contexts. In separate work we show that this is not an obstacle to achieving completeness [5] . A complete superposition calculus for HOL already exists [3] . This calculus has the λ-calculus rather than combinatory logic as its underlying logic. It also employs higher-order unification. There appear to be two potential benefits to using a slightly modified first-order superposition calculus parameterised by our new ordering > ski over lambda superposition as developed in [3] . • A superposition calculus parameterised by > ski is far closer to standard firstorder superposition than lambda superposition. Unification is first-order and there is no need to deal with binders and bound variables. This allows the re-use of the well-studied data-structures and algorithms used in first-order superposition [12, 20] . • As discussed further in the conclusion (Sect. 6), the > ski ordering allows the comparison of a larger class of non-ground terms than the ordering used in [3] . This results in fewer superposition inferences. In Sect. 2, we provide the necessary preliminaries and then move on to the main contributions of this paper which are: • Two approaches extending the > hb ordering by first comparing terms by the length of the longest weak reduction from them. The approaches differ in the manner in which they compare non-ground terms. A useful trait for an ordering that parameterises superposition is to be able to compare a large class of non-ground terms since this reduces the number of inferences carried out. The most powerful method of defining a non-ground ordering is to semantically lift a ground ordering, i.e., to define t s to hold iff tθ sθ for all grounding substitutions θ. Such an ordering in non-computable and both our methods attempt to approximate it (Sect. 3). • A set of proofs that the introduced > ski ordering enjoys the necessary properties required for its use within the superposition calculus (Sect. 4) and a set of examples demonstrating how the ordering applies to certain terms (Sect. 5). Syntax of Types and Terms: We work in a polymorphic applicative first-order logic. Let V ty be a set of type variables and Σ ty be a set of type constructors with fixed arities. It is assumed that a binary type constructor → is present in Σ ty which is written infix. The set of types is defined: Polymorphic Types τ ::= κ( τ n ) | α | τ → τ where α ∈ V ty and κ ∈ Σ ty The notation t n is used to denote a tuple or list of types or terms depending on the context. A type declaration is of the form Π α .σ where σ is a type and all type variables in σ appear in α . Let Σ be a set of typed function symbols and V a set of variables with associated types. It is assumed that Σ contains the following function symbols, known as basic combinators: The set of terms over Σ and V is defined below. In what follows, type subscripts, and at times even type arguments, are omitted. The type of the term f τ n is σ{ α n → τ n }. Following [2] , terms of the form t 1 t 2 are called applications. Non-application terms are called heads. A term can uniquely be decomposed into a head and n arguments. Let t = ζ t n . Then head(t) = ζ where ζ could be a variable or constant applied to possibly zero type arguments. The symbol C any denotes a member of {S, C, B, K, I}, whilst C 3 denotes a member of {S, C, B}. These symbols are only used when the combinator is assumed to have a full complement of arguments. Thus, in C 3 t n , n ≥ 3 is assumed. The symbols x, y, z . . . are reserved for variables, c, d, f . . . for noncombinator constants and ζ, ξ range over arbitrary symbols and, by an abuse of notation, at times even terms. A term is ground if it contains no variables and term ground if it contains no term variables. We define a subset of first-order subterms called stable subterms. Let LPP(t, p) (LPP stands for Longest Proper Prefix) be a partial function that takes a term t and a position p and returns the longest proper prefix p of p such that head(t| p ) is not a partially applied combinator if such a position exists. For a position p ∈ pos(t), p is a stable position in t if LPP(t, p) is not defined or head(t| LPP(t,p) ) is not a combinator. A stable subterm is a subterm occurring at a stable position and is denoted t u p . We call t p a stable context and drop the position where it is not relevant. For example, the subterm a is not stable in f (S a b c), S (S a) b c (in both cases, head(t| LPP(t,p) ) = S) and a c (a is not a first-order subterm), but is in g a b and f (S a) b. A subterm that is not stable is known as an unstable subterm. The notation t[u] p denotes an arbitrary subterm u of t that occurs at position p and may be unstable. The notation t[u 1 , . . . , u n ] (or t[ u n ]) denotes the term t containing n non-overlapping subterms u 1 to u n . By u[] n , we refer to a context with n non-overlapping holes. Whilst this resembles the notation for a term at position n, ambiguity is avoided by never using n to denote a position or p to denote a natural number. Weak Reduction: Each combinator is defined by its characteristic equation; S x y z = x z (y z), C x y z = x z y, B x y z = x (y z), K x y = x and I x = x. A term t weak-reduces to a term t in one step (denoted t −→ w t ) if t = u[s] p and there exists a combinator axiom l = r and substitution σ such that lσ = s and t = u[rσ] p . The term lσ in t is called a weak redex or just redex. By −→ * w , the reflexive transitive closure of −→ w is denoted. If term t weak-reduces to term t in n steps, we write t −→ n w t . Further, if there exists a weak-reduction path from a term t of length n, we say that t ∈ n w . Weak-reduction is terminating and confluent as proved in [13] . By (t) ↓ w , we denote the term formed from t by contracting its leftmost redex. The length of the longest weak reduction from a term t is denoted t . This measure is one of the crucial features of the ordering investigated in this paper. To show that the measure is computable we provide a maximal weak-reduction strategy and prove its maximality. The strategy is used in a number of proofs later in the paper. It is in a sense equivalent to Barendregt's 'perpetual strategy' in the λ-calculus [1] . Our proof of its maximality follows the style of Van Raamsdonk et al. [23] in their proof of the maximality of a particular β-reduction strategy. We begin by proving the fundamental lemma of maximality for combinatory terms. Proof. Assume that C any = K. Then any maximal reduction from K t n is of the form: There is another method of reducing K t n to s: As the length of this reduction is the same as the previous reduction, it must be a maximal reduction as well. Therefore we have that: Conversely, assume that C any is not K. We prove that the formula holds if C any = S. The other cases are similar. If C any = S, any maximal reduction from S t n must be of the form: There is another method of reducing S t n to s: Thus, we have that S t n = m+m +1 ≤ m+m 3 +m +1 = (S t n ) ↓ w +1. Since m + m + 1 is the length of the maximal reduction, equality must hold. The reduction strategy F ∞ is maximal. Proof. As the Lemma is not of direct relevance to the paper, its proof is relegated to the accompanying technical report [6] . First, Becker et al.'s [2] graceful higher-order basic KBO is presented as it is utilised within our ordering. The presentation here differs slightly from that in [2] because we do not allow ordinal weightings and all function symbols have finite arities. Furthermore, we do not allow the use of different operators for the comparison of tuples, but rather restrict the comparison of tuples to use only the length-lexicographic extension of the base order. This is denoted length lex hb . The length-lexicographic extension first compares the lengths of tuples and if these are equal, carries out a lexicographic comparison. For this section, terms are assumed to be untyped following the original presentation. Standard first-order KBO first compares the weights of terms, then compares their head-symbols and finally compares arguments recursively. When working with higher-order terms, the head symbol may be a variable. To allow the comparison of variable heads, a mapping ghd is introduced that maps variable heads to members of Σ that could possibly instantiate the head. This mapping respects arities if for any variable x, all members of ghd(x) have arities greater or equal to that of x. The mapping can be extended to constant heads by taking Let be a total well-founded ordering or precedence on Σ. The precedence is extended to arbitrary heads by defining ζ ξ iff ∀f ∈ ghd(ζ) and ∀g ∈ ghd(ξ), f g. Let w be a function from Σ to N that denotes the weight of a function symbol and W a function from T to N denoting the weight of a term. Let ε ∈ N >0 . For all constants c, w (c) ≥ ε. The weight of a term is defined recursively: The graceful higher-order basic Knuth-Bendix order > hb is defined inductively as follows. Let t = ζ t and s = ξ s . Then t > hb s if vars # (s) ⊆ vars # (t) and any of the following are satisfied: The combinator orienting KBO is the focus of this paper. It has the property that all ground instances of combinator axioms are oriented by it left-to-right. This is achieved by first comparing terms by the length of the longest weak reduction from the term and then using > hb . This simple approach runs into problems with regards to stability under substitution, a crucial feature for any ordering used in superposition. Consider the terms t = f x a and s = x b. As the length of the maximum reduction from both terms is 0, the terms would be compared using > hb resulting in t s as W (t) > W (s). Now, consider the substitution θ = {x → I}. Then, sθ = 1 whilst tθ = 0 resulting in sθ tθ. The easiest and most general way of obtaining an order which is stable under substitution would be to restrict the definition of the combinator orienting KBO to ground terms and then semantically lift it to non-ground terms as mentioned in the introduction. However, the semantic lifting of the ground order is noncomputable and therefore useless for practical purposes. We therefore provide two approaches to achieving an ordering that can compare non-ground terms and is stable under substitution both of which approximate the semantic lifting. Both require some conditions on the forms of terms that can be compared. The first is simpler, but more conservative than the second. First, in the spirit of Bentkamp et al. [3] , we provide a translation that replaces "problematic" subterms of the terms to be compared with fresh variables. With this approach, the simple variable condition of the standard KBO, vars # (s) ⊆ vars # (t), ensures stability. However, this approach is overconstrained and prevents the comparison of terms such as t = x a and s = x b despite the fact that for all substitutions θ, tθ = sθ . Therefore, we present a second approach wherein no replacement of subterms occurs. This comes at the expense of a far more complex variable condition. Roughly, the condition stipulates that two terms are comparable if and only if the variables and relevant combinators are in identical positions in each. Approach 1. Because the > hb ordering is not defined over typed terms, type arguments are replaced by equivalent term arguments before comparison. The translation ([]) from T to untyped terms is given below. First we define precisely the subterms that require replacing by variables. Consider a term t of the form C any t n . If there exists a position p such t| p is a variable, then t is a type-1 term. A term x t n where n > 0 is a type-2 term. The translation to untyped terms is defined as follows. If t is a type variable τ , then An untyped term t weak reduces to an untyped term t in one step if t = u[s] p and there exists a combinator axiom l = r and substitution σ such that ([l])σ = s and t = u[ ([r] )σ] p . The aim of the ordering presented here is to parametrise the superposition calculus. For this purpose, the property that for terms t and t , t −→ w t =⇒ t t , is desired. To this end, the following lemma is proved. Proof. The =⇒ direction can be proved by a straightforward induction on the t. The opposite direction is proved by an induction on ([t]). The combinator orienting Knuth-Bendix order (approach 1) > ski1 is defined as follows. For terms t and s, let t = ([t]) and s = ([s]). Then t > ski1 s if vars # (s ) ⊆ vars # (t ) and: R1 t > s or, R2 t = s and t > hb s . Approach 2. Using approach 1, terms t = y a and s = y b are incomparable. Both are type-2 terms and therefore ([t]) = x t and ([s]) = x s . The variable condition obviously fails to hold between x t and x s . Therefore, we consider another approach which does not replace subterms with fresh variables. We introduce a new translation from T to untyped terms that merely replaces type arguments with equivalent term arguments and does not affect term arguments at all. The simpler translation comes at the cost of a more complex variable condition. Before the revised variable definition can be provided, some further terminology requires introduction. Let C any occur in t at position p and let p be the shortest prefix of p such that head(t| p ) is a combinator and for all positions p between p and p , head(t| p ) is a combinator. Let p be a prefix of p of length one shorter than p if such a position exists and otherwise. Then C any is safe in t if t| p is ground and head(t| p ) / ∈ V and unsafe otherwise. Intuitively, unsafe combinators are those that could affect a variable on a longest reduction path or could become applied to a subterm of a substitution. For example, all combinators in the term S (K I) a x are unsafe because they affect x, whilst the combinator in f (I b) y is safe. The combinators in x (S I) a are unsafe because they could potentially interact with a term substituted for x. We say a subterm is top-level in a term t if it doesn't appear beneath an applied variable or fully applied combinator head in t. Let t 1 and t 2 be untyped terms. The predicate saf e(t 1 , t 2 ) holds if for every position p in t 2 such that t 2 | p = C any t n and C any (not necessarily fully applied) is unsafe, then t 1 | p = C any s n and for 1 ≤ i ≤ n, s i ≥ t i . Further, for all p in pos(t 2 ) such that t 2 | p = x t n , then t 1 | p = x s n and for The definition of saf e ensures that if saf e(t, s) and t ≥ s , then tσ ≥ sσ for any substitution σ a result we prove in Lemma 13. Consider terms t = x (I(I (I a))) b and s = x a (I (I b)). We have that t = 3 > s = 2. However, it is not the case that saf e(t, s) because the condition that t i ≥ s i for all i is not met. We draw out some obvious consequences of the definition of safety. Firstly, the predicate enjoys the subterm property in the following sense. If p is a position defined in terms t 1 and t 2 , then saf e(t 1 , t 2 ) =⇒ saf e(t 1 | p , t 2 | p ). Secondly, the predicate is transitive; saf e(t 1 , t 2 ) ∧ saf e(t 2 , t 3 ) =⇒ saf e(t 1 , t 3 ). There is a useful property that holds for non-ground terms t and s such that saf e(t, s). Definition 6 (Semisafe). Let t and s be untyped terms. Let C any s n be a term that occurs in s at p such that all head symbols above C any in s are combinators. Then semisaf e(t, s) if t| p = C any t n and for 1 ≤ i ≤ n, t i ≥ s i . It is clearly the case that (t not ground ) ∧ (s not ground ) ∧ saf e(t, s) =⇒ semisaf e(t, s). The implication does not hold in the other direction. A useful property of semisaf e is that it is stable under head reduction. If for terms t and s that reduce at their heads to t and s respective, we have semisaf e(t, s), then we have semisaf e(t , s ). Let t = t and s = s for polymorphic terms t and s. Let A be the multiset of all top-level, non-ground, first-order subterms in s of the form x s n (n may be 0) or C any t n . Let B be a similarly defined multiset of subterms of t . Then, var cond(t , s ) holds if there exists an injective total function f from A to B such that f only associates terms t 1 and t 2 if saf e(t 1 , t 2 ). For example var cond(t, s) holds where t = f y (x a) and s = g (x b). In this case A = {x b} and B = {y, x a}. There exists and injective total function from A to B that matches the requirements by relating x b to x a. However, the variable condition does not hold in either direction if t = f y (x a) and s = g (x (I b)). In this case, x (I b) cannot be related to x a since the condition that a ≥ I b is not fulfilled. We now define the combinator orienting Knuth-Bendix order (approach 2) > ski . For terms t and s, let t = t and s = s . Then t > ski s if var cond(t , s ) and: R1 t > s or, R2 t = s and t > hb s . Proof. Since for all ground instances of the axioms l ≈ r, we have l > r , the theorem follows by an application of R1. It should be noted that for non-ground instances of an axiom l ≈ r, we do not necessarily have l > ski r since l and r may be incomparable. This is no problem since the definition of > ski could easily be amended to have l > ski r by definition if l ≈ r is an instance of an axiom. Lemma 4 ensures that stability under substitution would not be affected by such an amendment. Various properties of the order > ski are proved here. The proofs provided here lack detail, the full proofs can be found in our report [6] . The proofs can easily be modified to hold for the less powerful > ski1 ordering. In general, for an ordering to parameterise a superposition calculus, it needs to be a simplification ordering [19] . That is, superposition is parameterised by an irreflexive, transitive, total on ground-terms, compatible with contexts, stable under substitution and wellfounded binary relation. Compatibility with contexts can be relaxed at the cost of extra inferences [3, 5, 9] . A desirable property to have in our case is coincidence with first-order KBO, since without this, the calculus would not behave on firstorder problems as standard first-order superposition would. Proof. Let s = s . It is obvious that s = s . Therefore s > ski s can only be derived by rule R2. However, this is precluded by the irreflexivity of > hb . Proof. Let s = s , t = t and u = u . From var cond(s , t ) and var cond(t , u ), var cond(s , u ) by the definition of var cond and the application of the transitivity of saf e. If s > t or t > u then s > u and s > ski u follows by an application of rule R1. Therefore, suppose that s = t = u . Then it must be the case that s > hb t and t > hb u . It follows from the transitivity of > hb that s > hb u and thus s > ski u. Let s and t be ground terms that are not syntactically equal. Then either s > ski t or t > ski s. Proof. Let s = s and t = t . If s = t then by R1 either s > ski t or t > ski s. Otherwise, s and t are compared using > hb and either t > hb s or s > hb t holds by the ground totality of > hb and the injectivity of . Proof. Let s = s and t = t . Since t is a subterm of s, t is a subterm of s and s ≥ t because any weak reduction in t is also a weak reduction in s . If s > t , the theorem follows by an application of R1. Otherwise s and t are compared using > hb and s > hb t holds by the subterm property of > hb . Thus s > ski t. Next, a series of lemmas are proved that are utilised in the proof of the ordering's compatibility with contexts and stability under substitution. We prove two monotonicity properties Theorems 5 and 6. Both hold for non-ground terms, but to show this, it is required to show that the variable condition holds between terms u[t] and u[s] for t and s such that t > ski s. To avoid this complication, we prove the Lemmas for ground terms which suffices for our purposes. To avoid clutter, assume that terms mentioned in the statement of Lemmas 5-16 are all untyped, formed by translating polymorphic terms. Assume that the F ∞ redex of s is in t 1 . Further, assume that t 1 > 0. Then, for some i in {1 . . . m }, it must be the case that v i > 0. Let j be the smallest index such that v j > 0. Then by the definition of F ∞ , Suppose that the F ∞ redex of s is not in t 1 . This can only be the case if t 1 = 0 in which case t 1 = 0 as well. In this case, by the definition of Without loss of generality, assume that the F ∞ redex of s i occurs inside t i . Then t i must be a subterm of s i . Assume that t i > 0 and thus s i > 0. Since for all i, s i and s i only differ at positions where one contains a t j and the other contains a t j and t i ≥ t i for 1 ≤ i ≤ m , we have that s j = 0 implies s j = 0. Thus, using the definition of The induction hypothesis can be applied to s i and s i to conclude that the F ∞ redex of s i occurs inside t i . The lemma follows immediately. If u does not have a hole at its head, then s = ζ s n and s = ζ s n where ζ is not a fully applied combinator other than K (if it was, the F ∞ redex would be at the head). If ζ is not a combinator, the proof follows by a similar induction to above. Therefore, assume that ζ = K. It must be the case that s 2 > 0 otherwise the F ∞ redex in s would be at the head and not within a t i . By the definition of F ∞ , By the induction hypothesis, the F ∞ redex of s 2 occurs in t j . Proof. Let s = u[ t n ] and let p 1 , . . . p n be the positions of t n in s. Since s is reducible, there must exist a p such that s| p is a redex. If p > p i for some i, there exists a p = such that p = p i p . Then, u[t 1 , . . . , t i , . . . , t n ]| pi = t i [C any r n ] p −→ w t i [(C any r n ) ↓ w ] p . Lett i = t i [(C any r n ) ↓ w ] p . We thus have that t i −→ wti and thus u[t 1 , . . . , t i , . . . , t n ] −→ w u[t 1 , . . . ,t i , . . . , t n ]. It cannot be the case that p = p i for any i because head(t i ) is not a combinator for any t i . In the case where p < p i or p p i for all i, we have that u[ t n ] = (u[ x n ])σ and u[ x n ]| p is a redex where σ = { x n → t n }. Letû be formed from u[ x n ] by reducing its redex at p. Assume that u[ t n ] = 0. Then t i = 0 for 1 ≤ i ≤ n. Now assume that u[ t n ] = 0. Then there must exist some position p such that s | p is a redex. We have that p = p i for all p i as head(t i ) / ∈ {I, K, B, C, S}. Assume p > p i for some p i . But then, t i > 0 which contradicts the fact that t i = t i for all i. Therefore, for all p i either p < p i or p p i . But then, if s | p is a redex, so must s| p be, contradicting the fact that u[ t n ] = 0. Thus, we conclude that In the first case, by Lemma 7 and t i = t i we have [t 1 , . . . ,t i , . . . , t n ] where t i −→ wt i . By the induction hypothesis u = u and thus s = s . In the second case, F ∞ (s ) = u =û{ x n → t n } where u[ x n ] −→ wû . Again, the induction hypothesis can be used to show u = u and the theorem follows. We now prove part (2); u[ t n ] must be greater than 0. Again, let u = F ∞ (s) and u = F ∞ (s ). If u = u[t 1 , . . . ,t i , . . . , t n ] and t i = 0, then by Lemma 7 u = u[t 1 , . . . ,t i , . . . , t n ] where t i −→ wt i unless t i = 0 and the lemma follows by the induction hypothesis. If t i = 0, consider terms u and s . If t i > 0 or t j > t j for some j = i, then the induction hypothesis can be used to show u > s and therefore s = u + 1 > s . Otherwise, t j = t j for all j = i and t i = 0 = t i . Part 1 of this lemma can be used to show that u = s and thus s = u + 1 > s . If u =û{ x n → t n }, then u =û{ x n → t n } and the lemma follows by the induction hypothesis. Proof. Let s = s , t = t and u = u . By Lemma Proof. Proceed by induction on the size of the context u. If u is the empty context, both parts of the theorem hold trivially. The inductive case is proved for the first implication of the lemma first. If u is not the empty context, u s is of the form u ζ t 1 . . . t i−1 , s, t i+1 . . . t n . By the definition of a stable subterm ζ cannot be a fully applied combinator and thus by Lemma 5 we have that ζ t 1 . Proof. If s > t then by Lemma 10, u s > u t holds and then by an application of R1 we have u s > ski u t . Otherwise, if s = t , then by Lemma 10 we have that u s = u t . Thus u s and u t are compared using > hb . By the compatibility with contexts of > hb , u s > hb u t holds and then by ofan application of R2 u s > ski u t is true. We next prove stability under substitution. In order to prove this, it needs to be shown that for untyped terms s and t and all substitutions σ: 1. var cond(s, t) implies var cond(sσ, tσ). The first is proved in Lemma 15. A slightly generalised version of (2) is proved in Lemma 14. Lemmas 11-13 are helper lemmas used in the proof of the above two properties. Lemma 11. For a single hole context u such that the hole does not occur below a fully applied combinator and any term t, u t = u + t . Proof. Proof to be found in report. Let t n and s n be terms such that for n 1 . . . n n ∈ N and for 1 ≤ i ≤ n, t i ≥ s i + n i . Further, let t = t 1 t 2 . . . t n and s = s 1 s 2 . . . s n . Assume that semisaf e(t, s) holds. Then t ≥ s + n i=1 n i . Proof. Proof to be found in report. Let t and s be non-ground terms such that t ≥ s + m for some m ∈ N and saf e(t, s). Then, for any substitution σ, tσ ≥ sσ + m and saf e(tσ, sσ). Proof. Proof to be found in report. For terms t and s such that var cond(t, s) holds and t ≥ s + n for some n ∈ N, for all substitutions σ, tσ ≥ sσ + n. Proof. If s and t are ground, the theorem is trivial. If s is ground, then tσ ≥ t ≥ s +n. If s is not ground, then var cond(t, s) implies that t is not ground. Therefore, assume that neither is ground. If head(s) (and therefore head(t) by the variable condition) are fully applied combinators or variables, then var cond (t, s) implies saf e(t, s) and Lemma 13 can be invoked to prove the lemma. Therefore, assume that both have non-variable, non-fully applied combinator heads. Let t = u t m and s = u s m where s m are all the non-ground, toplevel, first-order subterms of the form x args or C any args in s. By the variable condition, we have that there exists a total injective function respecting the given conditions from the s i to non-ground, top-level, first-order subterms of t of the form x args or C any args . Let t m be the terms related to s m by this function. Without loss of generality, assume that this function relates s 1 to t 1 , s 2 to t 2 and so on. For 1 ≤ i ≤ m, t i = s i + m i for m i ∈ N. This follows from the fact that since t i and s i are both non-ground and saf e(t i , s i ), we have semisaf e(t i , s i ) and can therefore invoke Lemma 12. Let m = u − u . Note that m could be negative. By Lemma 11, To conclude the proof: Proof. Let t = u t m and s = u s m where s m are all the non-ground, toplevel, first-order subterms of the form x args or C any args in s. By the variable condition, we have that there exists a total injective function respecting the given conditions from the s i to non-ground, top-level, first-order subterms of t of the form x args or C any args . Let t m be the terms related to s m by this function. Without loss of generality, assume that this function relates s 1 to t 1 , s 2 to t 2 and so on. By the definition of the variable condition, we have that u must be ground. This implies that any non-ground subterms of sσ must be subterms of some Assume that for some i and p ∈ pos(s i σ), s i σ| p is a non-ground, top-level, first-order subterm of the form x args or C any args . We show that t i σ| p is a non-ground, top-level, first-order subterm of tσ and saf e(t i | p , s i | p ). This implies the existence of a total, injective function from the multiset of non-ground, top-level first-order subterms in sσ to the like multiset of tσ in turn proving var cond(tσ, sσ). From Lemma 13, it can be shown that for 1 ≤ i ≤ m, saf e(t i σ, s i σ). By the subterm property of safety, this implies that saf e(t i σ| p , s i σ| p ). To show that t i σ| p must be a non-ground, top-level, first-order subterm in tσ, it can be assumed that this not the case. This easily leads to a contradiction with saf e(t i σ, s i σ). Let t be a polymorphic term and σ be a substitution. We define a new substitution ρ such that the domain of ρ is dom(σ). Define yρ = yσ . For all terms t, tσ = t ρ. Proof. Via a straightforward induction on t. Theorem 7 (Stability under Substitution). If s > ski t then sσ > ski tσ for all substitutions σ that respect the ghd mapping. Proof. Let s = s and t = t . Let ρ be defined as per Lemma 16. First, we show that if R1 was used to derive s > ski t and thus s > t then s ρ > t ρ and thus sσ > ski tσ because sσ = s ρ and tσ = t ρ. From Lemma 15 and var cond(s , t ), var cond(s ρ, t ρ) holds. Furthermore, if s > t , then by Lemma 14 s ρ > t ρ and sσ > ski tσ by an application of R1. On the other hand, if s = t , then R2 was used to derive s > ski t. By Lemma 14 s ρ ≥ t ρ . If s ρ > t ρ , then this is the same as the former case. Otherwise s ρ = t ρ and s ρ and tρ are compared using R2. From the stability under substitution of > hb , s ρ > hb t ρ follows and sσ > ski tσ can be concluded. Proof. Assume that such a chain exists. For each s i > ski s i+1 derived by R1, we have that s i > s i+1 . For each s i > ski s i+1 derived by R2, we have that s i = s i+1 . Therefore the number of times s i > ski s i+1 by R1 in the infinite chain must be finite and there must exist some m such that for all n > m, s n > ski s n+1 by R2. Therefore, there exists an infinite sequence of > hb comparisons s m > hb s m+1 > hb s m+2 · · · . This contradicts the wellfoundedness of > hb . Theorem 9 (Coincidence with First-Order KBO). Let > fo be the firstorder KBO as described by Becker et al. in [2] . Assume that > ski and > fo are parameterised by the same precedence and that > fo always compares tuples using the lexicographic extension operator. Then > ski and > fo always agree on first-order terms. Proof. Let t = t and s = s . Since s and t are first-order, s = 0 and t = 0. Thus, s and t will always be compared by > hb . Since > hb coincides with > fo on first-order terms, so does > ski . To give a flavour of how the ordering behaves, we provide a number of examples. Here s > ski t despite the fact that s is syntactically smaller than t because s has a maximum reduction of 1 as opposed to 0 of t. (h y) b) . This is very similar to the previous example, but in this case the terms are incomparable. Let s be a name for the subterm (S x (h y) b) in s and t a name for the subterm (S x a y) . The variable y occurs in different positions in s and t . Therefore, s cannot be related to t by the variable condition and the two terms are incomparable. Consider terms t = f (x (g (K I a b))) and s = h (I a) (x c). The variable condition holds between t and s by relating (x (g (K I a b))) to (x c). The combinator I in s is not unsafe and therefore does not need to be related to a combinator in t. Since t = 2 > s = 1, t > ski s. Intuitively, this is safe because a substitution for x in t can duplicate (g (K I a b)) whose maximum reduction length is 2 whilst a substitution for x in s can only duplicate c whose maximum reduction length is 0. We have presented an ordering that orients all ground instances of S, C, B, K and I axioms left-to-right. The ordering enjoys many other useful properties such as stability under substitution, compatibility with stable contexts, ground totality and transitivity. In as yet unpublished work, we have used this ordering to parameterise a complete superposition calculus for HOL [5] . Lack of full compatibility with context has not been an obstacle. In the standard first-order proof of the completeness of superposition, compatibility with contexts is used in model construction to rule out the need for superposition inferences beneath variables [19] . Thus, by utilising > ski , some superposition is required beneath variables. However, because terms with functional heads are compatible with all contexts, such inference are quite restricted. The > ski ordering presented here is able to compare non-ground terms that cannot be compared by any ordering used to parameterise Bentkamp et al.'s lambda superposition calculus [3] . They define terms to be β-equivalence classes. Non-ground terms are compared using a quasiorder, , such that t s iff for all grounding substitutions θ, tθ sθ. Consider terms t = x a b and s = x b a and grounding substitutions θ 1 = x → λx y .f y x and θ 2 = x → λx y .f x y. By ground totality of it must be the case that either f a b f b a or f b a f a b. Without loss of generality assume the first. Then, neither t s nor s t since tθ 1 = f b a ≺ f a b = sθ 1 and tθ 2 = f a b f b a = sθ 2 . The > ski ordering allows weak reduction (or β-reduction) to be treated as part of the superposition calculus. This allows terms t and t such that t −→ + w t (or t −→ + β t ) to be considered separate terms resulting in terms such as t and s given above being comparable. Since t = s , t and s are compared using > hb with stability under substitution ensured by the stability under substitution of > hb . Many of the definitions that have been provided here are conservative and can be tightened to allow the comparison of a far larger class of non-ground terms without losing stability under substitution. We provide an example of how the definition of stable subterm could be refined in our report [6] . In further work, we hope to thoroughly explore such refinements. The Lambda Calculus: Its Syntax and Semantics, 2nd edn A transfinite Knuth-Bendix order for lambda-free higher-order terms Superposition with lambdas The higher-order prover Leo-II A combinator-based superposition calculus for higherorder logic A Knuth-Bendix-like ordering for orienting combinator equations (technical report) A lambda-free higher-order recursive path order The computability path ordering: the end of a quest Paramodulation with nonmonotonic orderings Satallax: an automatic higher-order prover Hammer for Coq: automation for dependent type theory Substitution tree indexing Lambda-Calculus and Combinators: An Introduction Polymorphic higher-order recursive path orderings How to prove higher order theorems in first order logic A higher-order iterative path ordering Translating higher-order clauses to first-order clauses Paramodulation-based theorem proving Term indexing, chap. 26 Extensional paramodulation for higher-order logic and its effective implementation Leo-III Leo-II and Satallax on the Sledgehammer test bench Perpetual reductions in lambda calculus Acknowledgements. Thanks to Jasmin Blanchette, Alexander Bentkamp and Petar Vukmirović for many discussions on aspects of this research. We would also like to thank reviewers of this paper, whose comments have done much to shape this paper. The first author thanks the family of James Elson for funding his research.