key: cord-0567707-yq1gi2ox
authors: Bansal, Suguman
title: Automata-based Quantitative Verification
date: 2020-10-05
journal: nan
DOI: nan
sha: 96e5ca6ae0a589cb4736275fb8cace4920582d47
doc_id: 567707
cord_uid: yq1gi2ox

Quantitative analysis of computing systems is an emerging area in automated formal analysis. Such properties address aspects such as costs and rewards, quality measures, resource consumption, distance metrics, etc. Existing solutions for problems in quantitative analysis face two challenges, namely lack of generalizability and separation-of-techniques. Lack of generalizability refers to the issue that solution approaches are specialized to cost model. Different cost models deploy such disparate algorithms that there is no transfer of knowledge from one cost model to another. Separation-of-techniques refers to the inherent dichotomy in solving problems in quantitative analysis. Most algorithms comprise of a structural phase which reasons about the structure of the quantitative system(s) using techniques from automata or graphs, and a numerical phase, which reasons about the quantitative dimension/cost model using numerical methods. The techniques used in both phases are so unlike each other that they are difficult to combine, thereby impacting scalability. This thesis contributes to a novel framework that addresses these challenges. The introduced framework, called comparator automata or comparators in short, builds on automata-theoretic foundations to generalize across a variety of cost models. Comparators enable automata-based methods in the numerical phase, hence eradicating the dependence on numerical methods. In doing so, comparators are able to integrate the structural and numerical phases. On the theoretical front, we demonstrate that these have the advantage of generalizable results, and yield complexity-theoretic improvements over a range of problems in quantitative analysis. On the empirical front, we demonstrate that comparator-based solutions render more efficient, scalable, and robust performance, and are able to integrate quantitative with qualitative objectives.

This thesis is based on the following publications. Publications 1-2 are yet to appear in an official proceedings at the time of thesis submission. 1 . On the analysis of quantitative games II Quantitative inclusion with discounted-sum 77 5 Analysis of DS inclusion with integer discount factor 80 5 In most of these use-cases, systems are evaluated on their functional properties that describe temporal system behaviors, safety or liveness conditions and so on.

This thesis looks into the extension of formal methods with quantitative properties of systems, leading towards scalable and efficient verification and synthesis with quantitative properties.

The notion of correctness with functional properties is Boolean. The richness of modern systems justifies properties that are quantitative. These can be thought to extend functional properties, since in this case executions are assigned to a realvalued cost using a cost model that captures the quantitative dimension of interest.

Quantitative properties can reason about aspects such as quality measures, cost-and resource-consumption, distance metrics and the like [12, 14, 19, 66] , which functional properties cannot easily express. For example, whether an arbiter grants every request is a functional property; But the promptness with which an arbiter grants requests is a quantitative property.

The analysis of quantitative properties of computing systems, or quantitative analysis in short, is an emerging area in automated formal analysis. It enables the formal reasoning about features such as timeliness, quality, reliability and so on. Quantitative properties have been used in several forms. These include probabilistic guarantees about the correctness of hardware or software [19, 66, 68] ; use of distance-metrics to produce low-cost program repair [53, 60] , generate feedback with few corrections in computer-aided education [84] , and automated generation of questions of similar difficulty for MOOCs [9] ; in planning for robots with resource-constraints [58] , with hard and soft constraints where the soft constraints are expressed quantitatively [69, 90] , and generate plans of higher quality [12, 29] .

The cornerstones of formal analysis of quantitative properties of systems are (a).

Verification of quantitative properties, and (b) Synthesis from quantiative properties.

This thesis focuses on the two problems that form the fundamentals of these two cornerstones. These problems are Quantiative inclusion and Solving quantitative games, respectively, as detailed below:

From "Verification of quantitative properties" to "Quantiative inclusion".

In the verification of functional properties, the task is to verify if a system S satisfies a functional property P , possibly represented by a temporal logic [77] . Traditionally, S and P are interpreted as sets of (infinite-length) executions, and S is determined to satisfy P if S ⊆ P . The task, however, can also be framed in terms of a comparison between executions in S and P . Suppose an execution w is assigned a cost of 1 if it belongs to the language of the system or property, and 0 otherwise. Then determining if S ⊆ P amounts to checking whether the cost of every execution in S is less than or equal to its cost in P [21, 89] .

The verification of quantitative properties is an extension of the same from functional properties. In the quantitative scenario, executions in the system S and property P are assigned a real-valued cost, as opposed to a 0 or 1 cost. Then, as earlier, verification of quantitative properties amounts to checking whether the cost of every execution in S is less than or equal to its cost in P . When the system and property are modeled/represented by a finite-state (quantitative) abstraction, this problem is referred to as quantitative inclusion [39, 55] . Formally speaking, given two finite-state (quantitative) abstractions S and P , quantitative inclusion determines whether the cost of every execution in S is less than or equal to its cost in P .

For verification purposes, the systems are often modeled by finite-state (quantitative) abstractions. Even though properties are typically represented by quantitative logics, several of them can be converted to the aforementioned abstractions [46, 50] .

Therefore, verification from quantitative properties reduces to quantitative inclusion.

More generally, quantitative inclusion is applied to compare systems on their quantitative dimensions (Illustration in Fig 1.1) , and in the analysis of rational behaviors in multi-agent systems with reward-maximizing agents [8, 23] .

From "Synthesis from quantitative properties" to "Solving quantitative games". Synthesis from a functional properties in an uncertain environment is perceived as a game between an uncontrollable environment and the system that we wish to designed. In this game, the environment and the system have the opposing objectives of guaranteeing that the resulting play flouts and satisfies the functional property, respectively. The system can be constructed if it is able to win the game against the environment. As earlier, synthesis from quantitative properties extends the later by assigning real-valued costs to the plays [29, 37, 58] .

As a concrete instantiation, suppose we are to design a plan for a robot to patrol a grid without colliding into another (movable) robot present on the grid. Now, as our robot moves around the grid, it expends the charge on its battery, and will have to enter the recharge stations every now and then. Therefore, while planning for our robot, we need to guarantee that its battery charge never goes below level 0. Note that this criteria is a quantitative constraint.

Such problems of planning in an uncertain environment with quantitative constraints can be reduced to solving a min-max optimization on a two-player quantitative graph game. The two players refer to the system for which the plan is being designed, and an uncontrollable environment which is assumed to be adversarial to the objective of the system player. In these games, players take turns to pass a token along the transition relation between the states. As the token is pushed around, the play accumulates weights along the transitions using the underlying cost model f : Z ω → R. One player maximizes the cost while the other minimizes it.

Both of the problems extend their functional counterparts. As a result, quantitative analysis inherits their high complexity and practical intractability [11, 20] . However, algorithms that solve over functional properties cannot be directly applied to solve with quantitative properties. Therefore, the developments in functional properties have limited impact on solving with qualitative properties. To this end, this thesis focuses on development of efficient and scalable solutions for quantitative analysis.

This thesis begins with identifying two broad challenges that obstruct the theoretical understanding and algorithmic development in quantitative analysis. A concrete instantiation of this challenges appears in problems over weighted ωautomata [52] . Weighted ω-automaton [52] are a well-established finite-state (quantiative) abstraction used to model quantitative system, as illustrated in Fig 1. 1. An execution is an infinite sequence of labels that arise from an infinite sequence of subsequent transitions. Weighted ω-automata assign a real-valued cost to all executions over the machine using a cost model f : Z ω → R. The cost of an execution is assigned by applying the cost model f to the weight-sequence arising from transitions along the execution. In case the transition relation is non-deterministic, each execution may have multiple runs. In these cases, the cost of the execution is resolved by taking the infimum/supremum of cost along all runs.

In case of quantitative inclusion over weighted ω-automata, the lack of generalizable is apparent from the diversity in complexity-theoretic results. While quantitative inclusion is PSPACE-hard, there is ample variance in upper bounds. Quantitative inclusion is PSPACE-complete under limsup/liminf [39] , undecidable for limitaverage [50] , and decidability is unknown for discounted-sum in general but its decidable fragments are EXPTIME [31, 39] .

Yet another well-studied problem over weighted ω-automata is to determine if there exists an execution with cost that exceeds a constant threshold value [47, 52] .

The problem finds applications in verification from quantitative logics [46, 50] , planning with quantitative constraints in closed environments [69] , low-cost program repair. The landscape is even less uniform on this problem. Known solutions involve diverse techniques ranging from linear-programming for discounted-sum [18] to negative-weight cycle detection for limit-average [61] to computation of maximum weight of cycles for limsup/liminf [38] .

The question that this thesis asks is whether one can develop a unifying theory for quantitative analysis that generalizes across a variety of cost models. It would be too navie to expect that all cost models can be brought under a single theoretical framework. So, the natural question to ask is about boundary conditions over cost models that can and cannot be generalized. As a concrete instantiation, consider the problem of planning in an uncertain environment under quantitative constraints. Here, the structural phase consists of the reduction from planning to the quantitative graph game, while the numerical phase consists of solving the min-max optimization on the game. In case of our robot planning example, the quantitative game has as many states as there are configurations of the grid. So, if the grid has a size of n × n, then the quantitative graph game will have O(n 4 ) states. Therefore, the size of the min-max optimization problems grows rapidly, and will limit scalability of planning.

The question this thesis asks is whether one can design an algorithmic approach for quantitative analysis that integrates the two phases as opposed to the existing separation-of-techniques , and whether an integrated approach will lead to efficient and scalable solutions.

Existing work on probabilistic verification presents initial evidence of the existence of integrated approaches [67, 76] . Probabilistic verification incorporates state-of-theart symbolic data structures and algorithms based on BDDs (Binary Decision Diagrams) [34] or MTBDD (Multi-Terminal Binary Decision Diagrams). These Decision Diagrams can encode and manipulate both the the structural and numerical aspects of underlying problem at once, and hence are an integrated method. The disadvantage of Decision Diagrams is that their performance is unpredictable, as they are known to exhibit large variance in runtime and memory consumption. Therefore, existing integrated approaches fall short of wholesome improvement in the theory and practice of problems in quantitative analysis.

Existing solution approaches suffer from the lack of generalizability and separationof-techniques, adversely impacting a clear theoretical understanding of quantitative analysis, and large-scale applicability due to limitations on scalability. This thesis contributes towards a novel theoretical framework that addresses both of the challenges, and demonstrates utility on problems of quantitative inclusion and solving quantitative games. The introduced framework, called comparator automata or comparators in short, builds on automata-theoretic foundations to generalize across a variety of cost models. The crux of comparators is that they substitute the numerical analysis phase with automata-based methods, and hence naturally offer an integrated method for quantitative analysis. In all, we show that comparator-based algorithms have the advantages of generalizable results, yield complexity-theoretic and algorithmic development, and render practically scalable solutions.

The detailed contributions are as follows:

1. Comparator automata: The approach takes the view that the comparison of costs of system executions with the costs on other executions is the fundamental operation in quantitative reasoning, and hence that should be brought to the forefront. To this end, comparator automata as an automata-theoretic formulation of this form of comparison between weighted sequences. Specifically, comparator automata (comparators, in short), is a class of automata that read pairs of infinite weight sequences synchronously, and compare their costs in an online manner. We show that cost models such as limsup/liminf and discounted-sum with integer discount factors permit comparators that require af finite amount of memory, while comparators for limit-average require an infinite-amount of memory. We show that results and algorithms over weighted ω-automata and quantitative games generalize over cost models that permit comparators with finite memory. Lastly, since comparators are automata-theoretic, these algorithms are integrated in approach.

We illustrate further benefits of comparator-based approaches by investigating the discounted-sum cost model, which is a fundamental cost model in decision-making and planning domains including reinforcement learning, game theory, economics 2 . Quantitative inclusion over discounted-sum: The decidability of quantita-tive inclusion over discounted-sum, DS inclusion, is unknown when the discount factor is an integer. This has been an open problem for almost 15 years now.

When the discount factor is an integer, DS inclusion is known to have an EX-PTIME upper bound and PSPACE lower bound. Hence, its exact complexity is unknown.

Comparator-based arguments make resolutions towards both of these questions.

We are able to prove that when the discount factor is an integer, then DS inclusion is indeed PSPACE-complete. When the discount factor is not an integer, then we are not able to resolve the decidability debate. However we are able to provide an anytime algorithm for the same, hence rendering a pragmatic solution.

Last but not the least, we demonstrate that by leveraging its automata-theoretic foundations, comparator-based algorithms for DS inclusion are more scalable and efficient in practice that existing methods. 3 . Quantitative games over discounted-sum: The benefits of ω-regular and safety/co-safety properties of comparators for discounted-sum continue in quantitative games as well. We show that comparator-based solutions to quantitative games are more efficient in theory and more efficient in practice. In addition, they broaden the scope of quantitative games by extending with temporal goals.

Chapter 2 introduces the necessary notation and background, and should be treated as an index. The technical sections of this thesis have been split into three parts.

Part I lays down the theoretical foundations of comparator automata, our automata-based technique to obtain integrated methods to solve problems in quantitative reasoning. The formal introduction of comparator automata, and utility

of ω-regular comparators in designing generalizable solutions is in Chapter 3.

Chapter 4 furthers the study by investigating aggregate functions that permit ω-regular functions.

Part II undertakes an investigation of discounted-sum automata, following the result that DS comparators are ω-regular iff their discount factor is an integer. The

Part is split into three chapters. Chapter Part III consists of Chapter 8 . It studies the advantages of safety/co-safety automata comparator automata in discounted-sum games.

Last but not the least, the thesis concludes with a discussion of future directions in Chapter 9. Büchi automaton correspond to acceptors of words that require a finite-amount of memory. Formally, a (finite-state) Büchi automaton [87] is a tuple A = (S , Σ, δ, Init, F), where S is a finite set of states, Σ is a finite input alphabet, δ ⊆ (S × Σ × S ) is the transition relation, Init ⊆ S is the set of initial states, and F ⊆ S is the set of accepting states [87] .

A Büchi automaton is deterministic if for all states s and inputs a, |{s |(s, a, s ) ∈ δ for some s }| ≤ 1 and |Init| = 1. Otherwise, it is nondeterministic. A Büchi automaton is complete if for all states s and inputs a, |{s |(s, a, s ) ∈ δ for some s }| ≥ 1. For a word w = w 0 w 1 · · · ∈ Σ ω , a run ρ of w is a sequence of states s 0 s 1 . . . s.t. 

Büchi pushdown automaton correspond to acceptors of words that require an infinite-amount of memory. A Büchi pushdown automaton [45] is a tuple A = (S , Σ, Γ, δ, Init, Z 0 , F), where S , Σ, Γ, and F are finite sets of states, input alphabet, pushdown alphabet and accepting states,

is the transition relation, Init ⊆ S is a set of initial states, Z 0 ∈ Γ is the start symbol.

A run ρ on a word w = w 0 w 1 · · · ∈ Σ ω of a Büchi PDA A is a sequence of configurations (s 0 , γ 0 ), (s 1 , γ 1 ) . . . satisfying (1) s 0 ∈ Init, γ 0 = Z 0 , and (2) A word w is an accepting word if it has an accepting run. Languages accepted by

Büchi PDA are called ω-context-free languages (ω-CFL).

Weighted ω-automaton A weighted automaton [39, 72] over infinite words is a tuple A = (M, γ, f ), where M = (S , Σ, δ, Init, S ) is a Büchi automaton with all states as accepting, γ : δ → Q is a weight function, and f : Q → R is the aggregate function [39, 72] .

Words and runs in weighted automata are defined as they are in Büchi automata.

The weight-sequence of run ρ = s 0 s 1 . . . of word w = w 0 w 1 . . . is given by wt ρ = n 0 n 1 n 2 . . . where n i = γ(s i , w i , s i+1 ) for all i. The weight of a run ρ, denoted by f (ρ), is given by f (wt ρ ). Here the weight of a word w ∈ Σ ω in weighted automata is defined as wt A (w) = sup{f (ρ)|ρ is a run of w in A}. In general, weight of a word can also be defined as the infimum of the weight of all its runs. By convention, if a word w / ∈ L(M) its weight wt A (w) = −∞.

Games over graphs are popular in formal methods to study interactions between two players. More specifically, these are extensively used in the context of planning and synthesis. A graph game consists two players. It is played over a directed graph in which the states are partitioned between the two players. A play consists of the players moving a token along edges in the graph: The play begins in the initial state, and the token is pushed to the next state by the player that own the current state. This play goes on till infinitum. The outcome of which player wins a play is determined by an acceptance condition over the play itself. Reachability, safety, and parity are few examples of such conditions, and are described in detail below. In other cases, the plays could be associated with a quantitative value. In these cases, the winning criteria will be a quantitative property, as described later in the section.

Both, reachability and safety games, are defined over the structure G = (V = V 0 V 1 , v init , E, F) [87] , where V , V 0 , V 1 , v init , and E are defined as above. We assume that every state has at least one outgoing edge, i.e, vE = ∅ for all v ∈ V . The nonempty set of states F ⊆ V is called the accepting and rejecting states in reachability and safety games, resp. 

A parity game is defined over the structure 

is Wlog, a play is said to be winning if the maximum color appearing infinitely often in its color sequence is even. As earlier, a strategy Π i is winning for player P i if strategies of the opponent player P 1−i , all resulting plays are winning for P i . To solve a parity game refers to determining whether there exists a winning strategy for player P 1 .

A quantitative graph with complete information game, referred to as quantitative game in short, is defined over a structure A play of a game involves two players, denoted by P 0 and P 1 , that form an infinite path by moving a token along the transitions as follows: At the beginning, the token is at the initial state. If the current position v belongs to V 0 , then P 0 chooses the successor state w from vE; otherwise, if v ∈ V 1 , then P 1 chooses the next state from

. is an infinite sequence of states such that the first state v 0 = v init , and each pair of successive states is a transition, i.e., (v k , v k+1 ) ∈ E for all k ≥ 0. The cost sequence of a play ρ is the sequence of costs w 0 w 1 w 2 . . . such that

Given the aggregate function f : Z ω → R, the cost of play ρ, denoted wt(ρ), is the aggregate function applied to the cost sequence, i.e., wt(ρ) = f ρ. W.l.o.g, we assume that the players have opposing objectives, one player attempts to maximize the cost from plays while the other attempts to minimize the cost.

A strategy for player P i is a is a partial function Π i :

Intuitively, strategy Π i directs the player P i which state to go to next based on the history of the play such that it is consistent with the transitions. A player P i is said to follow a strategy Π on a play ρ if for all k-length

.

An incomplete-information quantitative game is a tuple G = (S,

where S, O, Σ are sets of states, observations, and actions, respectively, s I ∈ S

is the initial state, δ ⊆ S × Σ × S is the transition relation, γ : S → Z × Z is the weight function, and f : Z ω → R is the aggregate function.

The transition relation δ is complete, i.e., for all states p and actions a, there

Player P 0 has incomplete information about the game G; it only perceives the observation play o ρ . Player P 1 receives full information and witnesses play ρ. Plays begin in the initial state s 0 = s I .

For i ≥ 0, Player P 0 selects action a i . Next, player P 1 selects the state s i+1 , such that

The weight of state s is the pair of payoffs γ(s) = (γ(s) 0 , γ(s) 1 ). The weight sequence wt i of player P i along ρ is given by γ(s 0 ) i γ(s 1 ) i . . . , and its payoff from ρ is given by f (wt i ) for aggregate function f , denoted by f (ρ i ), for simplicity.

A play on which a player receives a greater payoff is said to be a winning play for the player. A strategy for player P 0 is given by a function α : O * → Σ since it only sees observations. Player P 0 follows strategy α if for all i, a i = α(o 0 . . . o i ). A strategy α is said to be a winning strategy for player P 0 if all plays following α are winning plays for P 0 .

of a sequence A, for i ≥ 0.

Discounted-sum (DS) is a commonly appearing mode of aggregation which captures the intuition that weights incurred in the near future are more significant than those incurred later on [48] . The discount-factor is a rational-valued paramater that governs the rate at which weights loose their significance in the DS.

Let d > 1 be the rational-valued discount-factor. Then the discounted-sum of an

DS is a preferred mode of aggregation across several domains including reinforcement learning [85] , planning under uncertainty [78] , and game-theory [75] . One reason behind its popularity is that DS is known to exist for all bounded infinite-length weight sequences, since its value is guaranteed to converge.

The limit-average of an infinite is intuitively defined as is the point of convergence of the average of prefixes of the sequence.

Let Sum(A[0, n−1]) = n−1 i=0 A[i] denote the sum of the n-length prefix of sequence A. Then intuitively, the limit-average of a sequence A should be defined as lim n→∞ 1 n · Sum(A[0, n − 1]). However, this is not well-defined since 1 n · Sum(A[0, n − 1]) may not converge as n → ∞. Therefore, limit-average is defined in terms of auxilary functions limit-average infimum and limit-average supremum.

The limit-average infimum of an infinite weight sequence A, denoted by LimAvgInf(A), is defined as

The limit-average supremum of an infinite weight sequence A, denoted by

LimAvgSup(A), is defined as

Then, the limit-average of sequence A, denoted by LimAvg(A) is defined either as the limit-average infinum or supremum. Note, limit-average is well defined only if the limit-average infimum and limit-average supremum coincide, in which case limitaverage is the same as limit-average infinum. In the other case, it is simply defined as either the limit-average supremum or infimum.

The class of ω-regular aggregate function are those functions on which arithmetical operations can be performed on on automata [42] .

Let β ≥ 2 be an integer base, Let Digit(β) = {0, . . . , β − 1} be its digit set.

Let x ∈ R, then there exist unique words Int(x, β) = z 0 z 1 · · · ∈ Digit(β) * · 0 ω and

Thus, z i and f i are respectively the i-th least significant digit in the base β representation of the integer part of x, and the i-th most significant digit in the base β representation of the fractional part of x. Then, the real-number x ∈ R in base β is represented by

and (Int(x, β), Frac(x, β)) is the interleaved word of Int(x, β) and Frac(x, β). Clearly,

For all integer β ≥ 2, we denote the alphabet of representation of real-numbers in base β by AlphaRep(β).

Let Σ be a finite set, and β ≥ 2 be an integer-valued base. A Büchi automaton A over alphabet Σ × AlphaRep(β) is an aggregate function automata of type Σ ω → R if the following conditions hold:

• For all A ∈ Σ ω , there exists at most one x ∈ R such that (A, rep(x, β)) ∈ L(A), and • For all A ∈ Σ ω , there exists an x ∈ R such that (A, rep(x, β)) ∈ L(A) Σ and AlphaRep(β) are the input and output alphabets, respectively. An aggregate function f : Σ ω → R is ω-regular under integer base β ≥ 2 if there exists an aggregate function automaton A over alphabet Σ × AlphaRep(β) such that for all sequences

Definition 2.2 (Quantitative inclusion) Let P and Q be weighted ω-automata with the same aggregate function f . The strict quantitative inclusion problem, denoted by P ⊂ f Q, asks whether for all words w ∈ Σ ω , wt P (w) < wt Q (w). The non-strict quantitative inclusion problem, denoted by P ⊆ f Q, asks whether for all words w ∈ Σ ω , wt P (w) ≤ wt Q (w).

Quantitative inclusion, strict and non-strict, is PSPACE-complete for limsup and liminf [39] , and undecidable for limit-average [50] . For discounted-sum with integer discount-factor it is in EXPTIME [31, 39] , and decidability is unknown for rational discount-factors Applications of quantitative inclusion are verification from quantitative logics, verification of reward-maximizing agents in multi-agent systems such as online economic protocols, auctions systems and so on, and in comparing systems based on a quantitative dimension.

Games with complete information Several problems in planning and synthesis with quantitative properties are formulated into an analysis problem over a quantiative game [37] . Optimization over such games is a popular choice of analysis in literature.

Given a quantitative graph game G, the optimization problem is to compute the optimal cost from all possible plays from the game, under the assumption that the objectives of P 0 and P 1 are to maximize and minimize the cost of plays, respectively.

Zwick and Patterson have shown that the optimization problem is pseudopolynomial, and that the optimal cost can be obtained via memoryless strategies for both players [93] . They also show that a VI algorithm will converge to the optimal cost. However, a thorough worst-case analysis of VI is missing.

We argue that the optimal solution may not be required in several tasks. For instance, a solution with minimal battery consumption may be replaced by one that operates within the battery life. Hence, many tasks can be reformulated into an alternative form of analysis in which the problem is to search for a solution the adheres to a given threshold constraint [1] . We call this the satisficing problem.

Definition 2.4 (Satisficing problem) Given a quantitative graph game G and a threshold value v ∈ Q, the satisficing problem is to determine whether player P 1 has a strategy such that for all possible resulting plays of the game, the cost of all plays is less than (or ≤) to the threshold v, assuming that the objectives of P 0 and P 1 are to maximize and minimize the cost of plays, respectively.

In several tasks one is interested in combining quantitative games with a temporal goal. For quantitative games, it is known that an optimal solution may not exist when combined with temporal goals [41] Games with incomplete information

The main problem in games with incomplete information is to determine whether a player has a winning strategy. If so, one is interested in generating it.

Theoretical framework parators in short.

Chapter 3 formally introduces comparator automata, our automata-based technique to obtain integrated methods to solve problems in quantitative reasoning, as opposed to separation-of-techniques used so far. We show that generalizable solutions for quantitative inclusion and solving quantitative games can be designed for all aggregate functions for which the comparator is represented by a Büchi automata.

Chapter 4 studies and constructs comparator automata commonly occurring aggregate functions, namely discounted-sum and limit-average.

Comparator automata: Foundations of a generalizable and integrated theory for quantitative reasoning [39] , undecidable for limit-average [50] , and decidability is unknown for discounted-sum in general but its decidable fragments are EXPTIME [31, 39] .

In view of these vast differences, the aforementioned landscape could benefit from a unified theory that generalizes across aggregate functions. This chapter contributes towards just that: A unified theory for quantitative reasoning that generalizes across a wide class of aggregate functions.

To this end, we take the view that the notion of comparisons between systems runs or inputs is central to formal methods especially to quantitative analysis, and hence comparison should be brought to the forefront. To see why our view holds, first consider the classical model checking problem of verifying if a system S satisfies a linear-time temporal specification P [44] . Traditionally, this problem is phrased language-theoretically: S and P are interpreted as sets of (infinite) words, and S is determined to satisfy P if S ⊆ P . The problem, however, can also be framed in terms of a comparison between words in S and P . Suppose a word w is assigned a weight of 1 if it belongs to the language of the system or property, and 0 otherwise. Then determining if S ⊆ P amounts to checking whether the weight of every word in S is less than or equal to its weight in P [21] . The ubiquity of comparisons becomes more pronounced in quantitative analysis: Firstly, because every system execution is assigned a real-valued cost. W.l.o.g, we can assume that the cost model is an

The cost of an execution is related to the aggregate function f applied to the weight-sequence corresponding to the execution; Secondly, because problems in quantitative analysis reduce to comparing the cost of executions to a constant value (such as in quantitative games), or more generally to the cost of another execution (as in quantitative inclusion).

Keeping comparisons at the center, we introduce a language-theoretic/automatatheoretic formulation of the comparison between weighted sequences. Specifically, in Section 3.1 we introduce comparator automata (comparators, in short), a class of automata that read pairs of infinite weight sequences synchronously, and compare their aggregate values in an online manner. Formally, a comparator automata for aggregate function f , relation R, and upper bound µ > 0 is an automaton that accepts The advantage our automata-based formulation of comparators in this aspect is that comparators reduce the numerical problem of comparison of aggregation of weight sequences into one of membership in an automaton. This way, enabling both phases, the structural phase and numerical phase, of quantitative analysis to be per-formed using automata-based techniques. Subsequently, creating an opportunity to design integrated methods as opposed to separation-of-techniques methods for problems in quantitative analysis. Infact, the generalizable algorithms that we design when the aggregate function permits ω-regular comparators are based on reducing the problems in quantitative analysis to those in qualitative analysis. In particular, quantitative inclusion is reduced to language inclusion, and quantitative games under perfect and imperfect information are both reduced to solving parity games.

As a result, our novel comparator automata framework not only results in generalizable algorithms for problems in quantitative analysis, but the algorithms designed using comparators are inherently integrated by approach. Thereby, comparators resolve both challenges in quantitative analysis.

This section introduces comparison languages and comparator automata as a a class of langauges/automata that can read pairs of weight sequences synchronously and establish an equality or inequality relationship between these sequences. Formally, we define: 1 . If the comparison language for f over Σ × Σ with inequality relation R is ωregular, then the comparison language for f over Σ × Σ with all inequality and equality relations R ∈ {≤, ≥, <, >, =, =} is ω-regular aggregate function f is ω-regular.

The comparison language for f over Σ × Σ with relation = is ω-regular iff the comparison language for f over Σ × Σ with relation = is ω-regular. 3 . If the comparison language for f over Σ × Σ with relation R is not ω-regular, then the comparison language for f over Σ × Σ with all inequality relations R ∈ {≤, ≥, <, >} is not ω-regular. But note that the language of the complementation of A is the comparison language for >. Therefore, comparison language for > is also ω-regular. Now, it is easy to show that (A, B) is a word in the comparison language for ≤ iff (B, A) is a word in the comparison language of ≥. Now the Büchi automaton for comparison language for ≥ can be constructed from A by swapping the alphabet. Therefore, the comparison language for ≥ is also ω-regular. By the same reason as above, if the comparison language for ≥ is ω-regular, then the comparison language for < is also ω-regular.

Finally, since the intersection of Büchi automaton is also a Büchi automaton, the Büchi automaton for comparison language of = can be obtained by intersecting those for ≤ and ≥. Lastly, ≤ can be obtained by complementing that for =.

Hence, one can prove that if the comparison language for any one inequality 

relation is ω-regular is sufficient to show that the comparison language for all relations is ω-regular. Item 2 and Item 3 can similarly be shown using closure properties of Büchi automaton.

It is worth mentioning that Item( 1) is a means to demonstrate whether an aggregate function is ω-regular. Chapter 4 will give concrete examples of aggregate function that are ω-regular. We issue a concrete example below to illustrate these concepts.

We explain comparison languages and comparator automata through an example.

The aggregate function we consider is the limit supremum function. Loosely speaking, the limit supremum function determines the maximally appearing value appearing in an infinite sequence. We will construct a Büchi automaton corresponding to its comparator automaton for an inequality relation to show that its comparison language/comparator is ω-regular (Lemma 3.1). This is sufficient due to Theorem 3.1-Item 1.

Observe that the union of Büchi automata A k (Fig 3.1 , Theorem 3.1) for k ∈ {−µ, −µ + 1, . . . , µ} corresponds to the coveted limsup comparator for relation ≥.

Since the union of Büchi automata is also a Büchi automata, this implies that limsup comparator for ≥ is ω-regular.

The limit infimum (liminf, in short) of an integer sequence is the smallest integer that appears infinitely often in it; its comparators will have a similar construction to their limsup counterparts. Hence, the comparison language for the liminf aggregate function is also ω-regular for all relations.

When the comparison language for an aggregate function and a relation can be represented by a Büchi pushdown automaton (and not a Büchi automaton), then the comparison language and comparator automata are referred to as ω-pushdown.

Unlike ω-regular comparison languages, ω-pushdown comparison languages may not exhibit closure properties since the underlying Büchi pushdown automaton are not closed under several operations such as complementation and intersection.

The next chapter (Chapter 4) will illustrate an example of an aggregate function for which the comparison language is ω-pushdown.

This section illustrates the generalizability with ω-regular comparators. We demonstrate this over the problems of quantitative inclusion, and solving quantitative games under perfect and imperfect information. By virtue of the automata-based nature of comparator automata, all algorithms we design are integrated in approach.

The analysis of quantitative dimensions of computing systems such as cost, resource consumption, and distance metrics [14, 19, 66] has been studied thoroughly to design efficient computing systems. Cost-aware program-synthesis [29, 37] and low-cost program-repair [60] have found compelling applications in robotics [58, 69] , education [53] , and the like. Quantitative verification facilitates efficient system design by automatically determining if a system implementation is more efficient than a specification model.

At the core of quantitative verification lies the problem of quantitative inclusion which formalizes the goal of determining which of two given systems is more efficient [39, 55, 72] . In quantitative inclusion, quantitative systems are abstracted as

weighted ω-automata [16, 52, 73] . Recall, a run in a weighted ω-automaton is associated with a sequence of weights. The quantitative dimension of these runs is determined by the weight of runs, which is computed by taking an aggregate of the run's weight sequence. The problem of quantitative inclusion between two weighted ω-automata determines whether the weight of all words in one automaton is less that (or equal to) that in the other automaton. It can be thought of as the quantitative generalization of (qualitative) language inclusion.

In the chapter's preface, we saw how the solution approaches and complexity for f -inclusion vary with differences in the aggregate function f . This section presents a generic algorithm (Algorithm 1) to solve quantitative inclusion for function which permits ω-regular comparators. This section focuses on the non-strict variant of quantitative inclusion. Strict quantitative inclusion is similar, hence has been skipped.

Given weighted ω-automata P and Q with an aggregate function f , let whether P is non-strictly f -included in Q be denoted by P ⊆ f Q. For sake of succinctness, we simply say inclusion with an ω-regular comparator to mean f -inclusion where the function f permits an ω-regular comparator.

The following running example is used to walk us through the steps of InclusionRegular, the generic algorithm for inclusion with ω-regular comparators presented in Algo- Let weighted ω-automata P and Q be as illustrated in Fig. 3 .2a- 3.2b with the limsup aggregate function. From Theorem 3.2 we know that the limsup comparator A ≤ LS for ≤ is ω-regular. Therefore, the generic algorithm we describe will apply to the example.

The word w = a ω has one run ρ P 1 = p 1 p ω 2 with weight sequence wt P 1 = 1 ω in P and two runs ρ Q 1 = q 1 q ω 2 with weight sequence wt Q 1 = 0, 1 ω and run ρ Q 2 = q 1 q ω 2 with weight sequence wt Q 2 = 2, 1 ω . Clearly, wt P (w) ≤ wt Q (w). Therefore P ⊆ f Q.

This section describes Algorithm 1 for inclusion for ω-regular comparators with the motivating example as a runnign example. Intuitively, the algorithm must be able to identify that for run ρ P 1 of w in P , there exists a run

is accepted by the limsup comparator for ≤.

Key ideas A run ρ P in P on word w ∈ Σ ω is said to be dominated w.r.t P ⊆ f Q if there exists a run ρ Q in Q on the same word w such that wt P (ρ P ) ≤ wt Q (ρ Q ).

The central construction of InclusionRegular is a Büchi automaton Dom that consists of exactly the domianted runs of P w.r.t P ⊆ f Q. Then, InclusionRegular returns True iff Dom contains all runs of P . In order to construct Dom, the algorithm first constructs another Büchi automaton DomProof that accepts word (ρ P , ρ Q ) iff ρ P and ρ Q are runs of the same word in P and Q respectively, and wt P (ρ P ) ≤ wt Q (ρ Q )

i.e. if w P and w Q are weight sequence of ρ P and ρ Q , respectively, then (w P , w Q ) is present in the ω-regular comparator A ≤ f for aggregate function f with relation ≤. The projection of DomProof on runs of P results in Dom.

Algorithm details For sake a simplicity, we assume that every word present in P is also present in Q i.e. P ⊆ Q (qualitative inclusion). InclusionRegular 1 . UniqueId: AugmentWtAndLabel transforms weighted ω-automaton A into Büchi automatonÂ by converting transition τ = (s, a, t) with weight γ(τ ) in A to transitionτ = (s, (a, γ(τ ), l), t) inÂ, where l is a unique label assigned to transition τ . The wordρ = (a 0 , n 0 , l 0 )(a 1 , n 1 , l 1 ) · · · ∈Â iff run ρ ∈ A on word a 0 a 1 . . . with weight sequence n 0 n 1 . . . . Labels ensure bijection between runs in A and words inÂ. Words ofÂ have a single run inÂ. Hence, transformation of weighted ω-automata P and Q to Büchi automataP andQ enables disambiguation between runs of P and Q (Line 1-4).

The correspondingÂ for weighted ω-automata P and Q from Figure 3 .2a- 3.2b are given in Figure 3 .3a-3.3b respectively.

Compare: The output of this step is the Büchi automaton Dom, that contains the wordρ ∈P iff ρ is a dominated run in P w.r.t P ⊆ f Q (Lines 5-4).

MakeProduct(P ,Q) constructsP ×Q s.t. word (ρ P ,ρ Q ) ∈P ×Q iff ρ P and ρ Q are runs of the same word in P and Q respectively (Line 5). Concretely,

transitionτ P ×τ Q = ((s P , s Q ), (a, n P , l P , n Q , l Q ), (t P , t Q )) is inP ×Q, as shown in Figure 3 .3c.

Intersect intersects the weight components ofP ×Q with comparator A ≤ f (Line 3). The resulting automaton DomProof accepts word (ρ P ,ρ Q ) iff f (ρ P ) ≤ f (ρ Q ), and ρ P and ρ Q are runs on the same word in P and Q respectively. The result of Intersect betweenP ×Q with the limsup comparator A ≤ LS for relation ≤ (Figure 3 .3d) is given in Figure 3 .3e.

The projection of DomProof on the words ofP returns Dom which contains the wordρ P iff ρ P is a dominated run in P w.r.t P ⊆ f Q (Line 4), as shown in Algorithm details We now prove the correctness of the algorithm and determine its complexity. We begin with a proof of correctness. in P (or Q) is represented by the unique wordρ = (w, wt, l) inP (orQ) where l is the unique label sequence associated with each run in P (or Q). Since every label on each transition is separate,P andQ are deterministic automata. Now,P ×Q is constructed by ensuring that two transitions are combined in the product only if their alphabet is the same. Therefore if (w, wt 1 , l 1 , wt 2 , l 2 ) ∈P ×Q, thenρ = (w, wt 1 , l 1 ) ∈P , σ = (w, wt 2 , l 2 ) ∈Q. Hence, there exist runs ρ and σ with weight sequences wt 1 and wt 2 in P and Q, respectively. Next,P ×Q is intersected over the weight sequences with ω-regular comparator A ≤ f for aggregate function f and relation ≤. Therefore (w, wt 1 , l 1 , wt 2 , l 2 ) ∈ DomProof iff f (wt 1 ) ≤ f (wt 2 ). Therefore runs ρ in P and σ in Q are runs on the same word s.t. aggregate weight in P is less than or equal to that of σ in Q. Therefore Dom constitutes of theseρ. Therefore Dom consists ofρ only if ρ is a dominated run in P w.r.t P ⊆ f Q.

Every step of the algorithm has a two-way implication, hence it is also true that every dominated run in P w.r.t P ⊆ f Q is present in Dom. iff P ⊆ f Q.

Proof 5P consists of all runs of P . Dom consists of all dominated run in P w.r.t

It is worth noting that Algorithm InclusionRegular can be easily adapted to solve strict quantitative inclusion, denoted P ⊂ f Q, by repeating the same procedure with the ω-regular comparator with the inequality relation <. In this case, a run ρ P in P on word w ∈ Σ ω is said to be dominated w.r.t P ⊂ f Q if there exists a run ρ Q in Q on the same word w such that wt P (ρ P ) < wt Q (ρ Q ). A similar adaption will work for quantitative equivalence, denoted P ≡ f Q.

Lastly, we present the complexity analysis InclusionRegular: The PSPACE-hardness of the quantitative inclusion is established via reduction from the qualitative inclusion problem, which is PSPACE-complete. The formal reduction is as follows: Let P and Q be Büchi automata (with all states as accepting states). Reduce P , Q to weighted automata P , Q by assigning a weight of 1 to each transition. Since all runs in P , Q have the same weight sequence, weight of all words in P and Q is the same for any function f . It is easy to see P ⊆ Q (qualitative inclusion) iff P ⊆ f Q (quantitative inclusion).

Theorem 3.3 extends to weighted ω-automata when weight of words is the infimum of weight of runs. The key idea for P ⊆ f Q here is to ensure that for every run ρ Q in Q there exists a run on the same word in ρ P in P s.t. f (ρ P ) ≤ f (ρ Q ).

When P f Q, there exists word(s) w ∈ Σ * s.t wt P (w) > wt Q (w). Such a word w is said to be a counterexample word. The ability to extract counterexamples and analyze them has been of immense advantage in verification and synthesis in qualitative systems [21] .

Here, we show how InclusionRegular can be adapted to yields Büchi automatonrepresentations for all counterexamples of an instance of quantitative inclusion for ω-regular comparators. Hopefully, this could be a starting point for counter-example guided frameworks in the quantitative verification of systems:

Theorem 3.4 All counterexamples of the quantitative inclusion problem for an ωregular aggregate function can be expressed by a Büchi automaton.

Proof 7 For word w to be a counterexample, it must contain a run in P that is not dominated. Clearly, all non-dominated runs of P w.r.t to the quantitative inclusion are members ofP \ Dom. The counterexamples words can be obtained fromP \ Dom by modifying its alphabet to the alphabet of P by dropping transition weights and their unique labels.

Recall, a quantitative graph with complete information game, referred to as graph game in short, is defined over a structure G = (V = V 0 V 1 , v init , E, γ, f ). It consists of a directed graph (V, E), and a partition (V 0 , V 1 ) of its set of states V . State v init is the initial state of the game. vE designates the set {w ∈ V |(v, w) ∈ E} to indicate the successor states of state v ∈ V . For convenience, we assume that every state has at least one outgoing edge, i.e, vE = ∅ for all v ∈ V . Each transition of the game is associated with a cost determined by the cost function γ : E → Z. Finally,

We illustrate how to solve the satisficing problem over quantitative games when the comparator automata for the aggregate function is ω-regular.

Wlog, let the winning condition for the maximizing player be to ensure that the cost of resulting play is greater than or equal to a given threshold value v. Then the winning condition for the minimizing player is to ensure that the cost of plays is less than threshold v.

For sake of simplicity, let the threshold value be 0. Further, suppose that f (B) = 0

where B is the sequence of all 0s. The we claim that the satisficing problem will reduce to solving a parity game.

Let A be the comparator automata for the inequality ≥. Currently A accepts a sequence (A, B) iff f (A) ≥ f (B). We can convert A to an NBA that accepts sequence A iff f (A) ≥ 0 if we fix sequence B to 0 ω . Let us denote this by A 0 . Note that the size of A 0 is the same as A. Finally, determinize A 0 into a parity automata P . Now, the winning condition of the maximizing agent in this game is that the weight sequence of a play must be accepted by the automaton P . Therefore, we take synchronized product of the quantitative game with the parity automata to obtain a parity game. Then the maximizing player will win the quantitative game iff it wins in the aforementioned parity game. states with |A f | colors, where |D| = |A f | |A f .

Since, D f is the deterministic parity automaton equivalent to

The parity automaton will have |A f | colors. Therefore, the product game will be a parity automaton with states equal to the product of the size of the game and the parity automaton, and colors equal to the size of the initial NBA.

Given an incomplete-information quantitative game G = (S, s I , O, Σ, δ, γ, f ), our objective is to determine if player P 0 has a winning strategy α : O * → Σ for ω-regular aggregate function f . We assume we are given the ω-regular comparator A f for function f . Note that a function A * → B can be treated like a B-labeled A-tree, and vice-versa. Hence, we proceed by finding a Σ-labeled O-tree -the winning strategy tree. Every branch of a winning strategy-tree is an observed play o ρ of G for which every actual play ρ is a winning play for P 0 .

We first consider all game trees of G by interpreting G as a tree-automaton over Σ-labeled S-trees. Nodes n ∈ S * of the game-tree correspond to states in S and labeled by actions in Σ taken by player P 0 . Thus, the root node ε corresponds to s I , and a node s i 0 , . . . , s i k corresponds to the state s i k reached via s I , s i 0 , . . . , s i k−1 .

Consider now a node x corresponding to state s and labeled by an action σ. Then

x has children xs 1 , . . . xs n , for every s i ∈ S. If s i ∈ δ(s, σ), then we call xs i a valid child, otherwise we call it an invalid child. Branches that contain invalid children correspond to invalid plays.

A game-tree τ is a winning tree for player P 0 if every branch of τ is either a winning play for P 0 or an invalid play of G. is then reduced to non-emptiness checking of M.

Given an incomplete-information quantitative game G and ω-regular comparator A f for the aggregate function f , the complexity of determining whether P 0 has a winning strategy is exponential in

Since, D f is the deterministic parity automaton equivalent to

The thinning operation is linear in size of |G × D f |, therefore |M| = |G| · |D f |. Non-emptiness checking of alternating parity tree automata is exponential. Therefore, our procedure is doubly exponential in size of the comparator and exponential in size of the game.

The question of tighter bounds is open.

This chapter identified a novel mode for comparison in quantitative systems: the online comparison of aggregate values of sequences of quantitative weights. This notion is embodied by comparators automata that read two infinite sequences of weights synchronously and relate their aggregate values. We showed that ω-regular comparators not only yield generic algorithms for problems including quantitative inclusion and winning strategies in incomplete-information quantitative games, they also result in algorithmic advances.

We believe comparators, especially ω-regular comparators, can be of significant utility in verification and synthesis of quantitative systems, as demonstrated by the existence of finite-representation of counterexamples of the quantitative inclusion problem. Another potential application is computing equilibria in quantitative games.

Applications of the prefix-average comparator, in general ω-context-free comparators, is open to further investigation. Another direction to pursue is to study aggregate functions in more detail, and attempt to solve the conjecture relating ω-regular aggregate functions and ω-regular comparators.

Chapter 3 laid down the theoretical foundations of comparator automata, and demonstrated that for the class of aggregate functions that permit ω-regular comparators, comparator automata result in algorithms that are generalizable as well as integrated in approach. This chapter asks a concrete question: Which aggregate functions allow ω-regular comparators?

We know from Theorem 3.2 that limsup and liminf posses ω-regular comparators.

This chapter investigates the comparators for commonly appearing aggregate functions, namely discounted-sum and limit-average. We observe that discounted-sum permits ω-regular comparators iff the discount factor is an integer (Section 4.1-4.2), while limit-average does not permit ω-regular comparators (Section 4.3). In an attempt to establish a broader characterization of which aggregate function permit ω-regular comparators, we investigate the class of ω-regular functions. We deliver positive news: All ω-regular functions will have an ω-regular comparator (Section 4.4).

Last but not the least, despite being a generalizable framework, comparator-based algorithms may be better in complexity that existing approaches. In particular, for the case of DS with integer discount factors, we observe that the comparator-based algorithm is PSPACE as opposed to the previously known EXPTIME and EXPSPACE algorithm. This also establishes that DS inclusion for integer discount factor is PSPACE- 

Recall, discounted-sum aggregation accumulates diminishing returns. This section establishes is that a DS comparison language and DS comparator are not ω-regular when the associated discount factor is not an integer.

We begin by formally defining DS comparison languages and DS comparator au- The proof begins with defining a cut-point language [38] : For a weighted ωautomaton A and a real number r ∈ R, the cut-point language of A w.r.t. r is defined as L ≥r = {w ∈ L(A)|wt A (w) ≥ r}. When the discount factor is a rational value 1 < d < 2, it is known that not all deterministic weighted ω-automaton with discounted-sum aggregate function (DS-automaton, in short) have an ω-regular cutpoint language for an r ∈ R [38] . In this section, this prior result is first extended to all non-integer, rational discount factors d > 1. Finally, this extension is utilized to establish that the DS comparison language for a non-integer, rational discount-factors d > 1 is not ω-regular. This proves that the DS aggregation function is not ω-regular when the discount factor is a non-integer, rational number.

Theorem 4.1 Let d > 1 be a non-integer, rational discount factor. There exists a deterministic discounted-sum automata A and a rational value r ∈ Q such that its cut-point language w.r.t. r is not ω-regular.

Proof 10 Since the proof for 1 < d < 2 has been presented in [38] , we skip that case.

The proof presented here extends the earlier result on 1 < d < 2 from [38] to all non-integer, rational discount factors d > 1.

Let d > 2 be a non-integer, rational discount-factor. Define deterministic discounted-sum automata A over the alphabet {0, 1, . . . , d −1} such that the weight of transitions on alphabet n ∈ {0, 1, . . . , d −1} is n. Therefore, weight of word w ∈ A is DS (w, d). Consider its cut-point language L ≥( d −1) . We say a finite-length word w

Intuitively, a finite word w is ambiguous if it can be extended into infinite words in A such that some extensions lie in L ≥( d −1) and some do not. Clearly, the finite word w = d − 2 is ambiguous.

Note that when d > 2 is non-integer, rational valued, then d −1 d−1 > 1, so word d − 2 falls within the range for ambiguity. We claim that if finite word w is ambiguous, then either w · ( d − 2) or w · ( d − 1) is ambiguous. To prove ambiguity, we need

is a is non-integer, rational discount-factor. Therefore, every ambiguous finite word can be extended to another ambiguous word. This means there exists an infinite word w ≥ such that DS (w ≥ , d) = d − 1 and all finite prefixes of w ≥ are ambiguous.

Let us assume that the language L ≥( d −1) is ω-regular and represented by Büchi automaton B.

For n < m, let the n-and m-length prefixes of w ≥ , denoted w ≥ [0, n − 1] and w ≥ [0, m − 1], respectively, be such that they reach the same states in B. Then there exists an infinite length

Eliminating DS (w s , d) from the equations and simplification, we get:

The above is a polynomial over d with degree m − 1 and integer coefficients. Specifically, d = p q > 2 such that integers p, q > 1, and p and q are mutually prime. Since d = p q is a root of the above equation, q must divide co-efficient of the highest degree term, in this case it is m − 1. The co-efficient of the highest degree term in the polynomial above is (w ≥ [0] − ( d − 1)). Recall from construction of w ≥ above,

So the co-efficient of the highest degree term is −1, which is not divisible by integer q > 1. Hence, resulting in a contradiction.

Finally, we use Theorem 4.1 to prove the DS comparison language is not ω-regular when the discount-factor d > 1 is not an integer.

Theorem 4.2 Comparison language for the DS aggregation function is not ω-regular when the discount-factor d > 1 is a non-integer, rational number.

To prove that the DS aggregation function is not ω-regular when the discount factor is not an integer, it is sufficient to prove that DS comparison language for ≥ is not ω-regular.

Let d > 1 be a non-integer, rational discount factor. Let A be the weighted ω-automaton as described in proof of Lemma 4. 1 . Consider its cut-point language L ≥( d −1) . From Lemma 4.1 and [38] , we know that L ≥( d −1) is not an ω-regular language.

Suppose there exists an ω-regular DS comparator A ≤ d for non-integer rational discount factor d > 1 for relation ≥. We define the Büchi automaton P s.t. L(P) = Since the DS comparison language with a non-integer discount-factor for any one inequality relation is not ω-regular, due to closure properties of ω-regular (Theorem 3.1-Item 3) this implies that the DS comparison language for all other inequities is also not ω-regular all inequalities is also not ω-regular. Finally, the cut-point argument can be extended to = and = relation to show that their DS comparison languages with non-integer discount-factors is also not ω-regular.

This section proves that a DS comparison languages is ω-regular when the discountfactor d > 1 is an integer. Hence, DS aggregation function is ω-regular for integer discount-factors. The result is proven by explicitly constructing a Büchi automaton that accepts the DS comparison language for an arbitrary upper bound µ, an inequality relation R and integer discount factor d > 1.

An immediate side-effect of this result is that DS inclusion is PSPACE-complete.

Not only does this improve upon the previously best known algorithm that was EXPTIME and EXPSPACE, it also resolves 15- year long open question of the complexity class of DS inclusion with integer discount factor (Section 4.2.1).

Recall the definitions of DS comparison language and DS comparator automata from Section 4. 1. Let integer µ > 0 be the upper-bound on sequences. The core intuition is that bounded sequences can be converted to their value in an integer base d via a finitestate transducer. Lexicographic comparison of the converted sequences renders the desired DS-comparator. Conversion of sequences to base d requires a certain amount of look-ahead by the transducer. Here we describe a method that directly incorporates the look-ahead with lexicographic comparison to obtain the Büchi automaton corresponding to the DS comparator for integer discount factor d > 1.

We explain the construction in detail now. We complete the construction for the relation <. For sake of simplicity, we assume that sequences the weight sequences are positive integer sequences. Under this assumption, for a weight sequence A and integer discount-factor d > 1, DS (A, d) can be interpreted as a value in base d i.e. [42] . The proofs can be extended to integer sequences easily.

Unlike comparison of numbers in base d, the lexicographically larger sequence may not be larger in value since (i) The elements of weight sequences may be larger in value than base d, and (ii) Every value has multiple infinite-sequence representations.

To overcome the two challenges mentioned above, we resort to arithmetic techniques

and DS (C, d) > 0. Therefore, to compare the discounted-sum of A and B, the objective is to obtain the sequence C. Arithmetic in base d also results in sequence X of carry elements. Then:

Lemma 4.1 Let A, B, C, X be weight sequences, d > 1 be a positive integer such that following equations holds true: If we can prove that X and C are also bounded-sequences and can be constructed from a finite-set of integers, we would be able to further proceed to construct a Büchi automaton for the desired comparator.

We proceed by providing an inductive construction of sequences C and X that satisfy properties in Lemma 4.1, and show that these sequences are bounded when A and B are bounded. In particular, when A and B are bounded integer-sequences, then sequences C and X constructed here are also bounded-integer sequences. Therefore, they are be constructed from a finite-set of integers. Proofs for sequence C are in Lemma 4.3-Lemma 4.5, and proof for sequence X is in Lemma 4. 6 .

We begin with introducing some notation.

We define the residual function Res : N ∪ {0} → R as follows:

Then we define C[i] as follows: 

Then, we define X[i] as follows:

Therefore, we have defined sequences C and X as above. We now prove the desired properties one-by-one.

First, we establish sequences C, X as defined here satisfy Equations 1-2 from Lemma 4. 1 . Therefore, ensuring that C is indeed the difference between sequences B and A, and X is their carry-sequence.

Lemma 4.2 Let A and B be bounded integer sequences and C and X be defined as above. Then,

Proof 13 We prove this by induction on i using definition of function X.

Suppose the invariant holds true for all i ≤ n, we show that it is true for n +

Next, we establish the sequence C is a bounded integer-sequences, therefore it can be represented by a finite-set of integers. 

i = 0, Res(0) = DS − (B, A, d, ·)− DS − (B, A, d, ·) . By definition of C[0],Res(0) = DS − (B, A, d, ·) − C[0] ⇐⇒ Res(0) = DS − (B, A, d, ·) − CSum(0).

Suppose the induction hypothesis is true for all i < n. We prove it is true when

We show this is true even for k + 1.

x ≥ 0, and fractional 0 ≤ f < 1. Then, from definition of Res, we get Res(k 

Therefore, we have established that sequence C is non-negative integer-valued and is bounded by maxC = µ · d d−1 . Finally, we prove that sequence X is also a bounded-integer sequence, thereby proving that it is bounded, and can be represented with a finite-set of integers. Note that for all i ≥ 0, by expanding out the definition of X[i] we get that X[i] is an integer for all i ≥ 0. We are left with proving boundedness of X:

We summarize our results from 4.2-Lemma 4.6 as follows: integer-valued sequences X and C that satisfy the conditions in Lemma 4. 1 . Furthermore, C and X are bounded as follows:

Intuitively, we construct a Büchi automaton A < d with states of the form (x, c) where x and c range over all possible values of X and C, respectively, and a special initial state s. Transitions over alphabet (a, b) replicate the equations in Lemma 4. 1 

transitions from the start state (s, (a, b), (x, c)) satisfy a + c + x = b to replicate Equation 1 (Lemma 4.1) at the 0-th index, and all other transitions ((

The complete construction is as follows:

Full and final construction Let d > 1 be an integer discount factor, µ > 0 be the upper bound, and < be the strict inequality relation. The DS comparator automata with discount factor d > 1, upper bound µ, and relation < is constructed as follows:

• State s is the initial state, and F are accepting states

• δ d ⊆ S × Σ × S is defined as follows: 1 . Transitions from start state s: Furthermore, since the first transition to accepting states (x, c) where c = ⊥ is possible Note that the proof of Theorem 3.1-Item-1 also gives a way to construct the DS comparators for all other equality or inequality relations. However, if those were to be followed, then the resulting DS comparators may have a very large number of states. For instance, the proof of of Theorem 3.1-Item-1 suggests to construct the DS comparator for ≥ by taking the complement of A < d constructed above. Büchi complementation [80] will result in an exponential blow-up in the state space. Hence, this method of constructing DS comparators for the remaining relations is not practical.

The silver lining here is that the construction of A < d can be modified in trivial ways to obtain the DS comparators for all other relations. Hence, For ≤: The DS comparator for ≤ can be constructed from the DS comparator for < by changing the accepting states only. For the DS comparator for ≤, let the accepting states be F ∪ S ⊥ . Intuitively, the states S ⊥ are those where the discountedsum of the difference sequence C is 0, since sequence C = 0 ω . Therefore, by adding S ⊥ to the accepting states of DS comparator for <, we have included all those sequence for which DS (C, d) = 0. Thereby, converting it to the DS comparator for ≤.

For =: In this case, set the accepting states in the DS comparator for < to S ⊥ only. This way, the automaton will accept a sequence (A, B) iff the corresponding C is such that DS (C, d) = 0 since C = 0ω. 

Finally, we utilize the ω-regularity of DS comparators with integer discount factors to establish that DS inclusion is PSPACE-complete when the discount factor is an integer. The prior best known algorithm is EXPTIME and EXPSPACE [31, 39] , which does not match with its known PSAPCE lower bound. complexity is based on an exponential determinization construction (subset construction) combined with arithmetical reasoning [31, 39] . We observe that the determinization construction can be performed on-the-fly in PSPACE. To perform, however, the arithmetical reasoning on-the-fly in PSPACE would require essentially using the same bit-level ((x, c)-state) techniques that we have used to construct DS-comparator.

The limit-average of a sequence refers to the point of convergence of the average of prefixes of the sequence. Unlike discounted-sum, limit average is known to not converge for all sequences. To work around this limitation, most applications simply use limit-average infimum or limit-average supremum of sequences [33, 39, 40, 93] . We argue that the usage of limit-average infimum or limit-average supremum in liu of limit-average for purpose of comparison can be misleading. Such inaccuracies in limit-average comparison may occur when the limit-average of at least one sequence does not exist. However, it is not easy to distinguish sequences for which limit-average exists from those for which it doesn't.

We define prefix-average comparison as a relaxation of limit-average comparison. Prefix-average comparison coincides with limit-average comparison when limitaverage exists for both sequences. Otherwise, it determines whether eventually the average of prefixes of one sequence are greater than those of the other. This comparison does not require the limit-average to exist to return intuitive results. Further, we show that the prefix-average comparator is ω-context-free.

Let Σ = {0, 1, . . . , µ} be a finite alphabet with µ > 0. The limit-average language L LA contains the sequence (word) A ∈ Σ ω iff its limit-average exists. Suppose L LA were ω-regular, then

over finite words. The limit-average of sequences is determined by its behavior in the limit, so limit-average of sequences in V ω i exists. Additionally, the average of all (finite) words in V i must be the same. If this were not the case, then two words in V i with unequal averages l 1 and l 2 , can generate a word w ∈ V ω i s.t the average of its prefixes oscillates between l 1 and l 2 . This cannot occur, since limit-average of w exists. Let the average of sequences in V i be a i , then limit-average of sequences in V ω i and U i · V ω i is also a i . This is contradictory since there are sequences with limit-average different from the a i (see appendix). Similarly, since every ω-CFL is [45] , a similar argument proves that L LA is not ω-context-free.

Theorem 4.6 L LA is neither an ω-regular nor an ω-context-free language.

We first prove that L LA is not ω-regular.

Let us assume that the language L LA is ω-regular. Then there exists a finite

For all i ∈ {0, 1, . . . n}, the limit-average of any word in U i ·V ω i is given by the suffix

Therefore, limit-average of all words in V ω i must exist. Now as discussed above, the average of all words in V i must be the same. Furthermore, the limit-average of all words in V ω i must be the same, say LimAvg(w) = a i for all w ∈ V ω i .

Then the limit-average of all words in L LA is one of a 0 , a 1 . . . a n . Let a = p q s.t p < q, snd a = a i for i ∈ {0, 1, . . . , µ}. Consider the word w = (1 p 0 q−p ) ω . It is easy to see the LimAvg(w) = a. However, this word is not present in L LA since the limit-average of all words in L LA is equal to a 0 or a 1 . . . or a n .

Therefore, our assumption that L LA is an ω-regular language has been contradicted.

Next we prove that L LA is not an ω-CFL.

Every ω-context-free language can be written in the form of n i=0 U i · V ω i where U i and V i are context-free languages over finite words. The rest of this proof is similar to the proof for non-ω-regularity of L LA .

In the next section, we will define prefix-average comparison as a relaxation of limit-average comparison. To show how prefix-average comparison relates to limitaverage comparison, we will require the following two lemmas: Quantifiers ∃ ∞ i and ∃ f i denote the existence of infinitely many and only finitely many indices i, respectively.

Lemma 4.7 Let A and B be sequences s.t. their limit average exists. If

Proof 23 Let the limit average of sequence A, B be a, b respectively. Since the limit average of A and B exists, for every > 0, there exists N s.t. for all n > N , 

Proof 24 Let the limit-average of sequence A, B be L a , L b respectively. Since, the limit average of both A and B exists, for every > 0, there exists N s.t. for all

By arguing as in Lemma 4.7, it must be the case that for all n > Nk

But this is not possible, since we are given that

Hence LimAvg(A) ≥ LimAvg(B).

The previous section relates limit-average comparison with the sums of equal length prefixes of the sequences (Lemma 4.7-4.8). The comparison criteria is based on the number of times sum of prefix of one sequence is greater than the other, which does not rely on the existence of limit-average. Unfortunately, this criteria cannot be used for limit-average comparison since it is incomplete (Lemma 4.8). Specifically, for sequences A and B with equal limit-average it is possible that

. Instead, we use this criteria to define prefix-average comparison. In this section, we define prefix-average comparison and explain how it relaxes limit-average comparison. Lastly, we construct the prefix-average comparator, and prove that it is not ω-regular but is ω-context-free. 

Note that, by definition prefix average comparison is defined on inequality relation ≥ or ≤, and not for the other inequality or equality relations. Intuitively, prefix-average comparison states that PrefixAvg(A) ≥ PrefixAvg(B) if eventually the sum of prefixes of A are always greater than those of B. We use ≥ since the average of prefixes may be equal when the difference between the sum is small. It coincides with limitaverage comparison when the limit-average exists for both sequences. Definition4.1 and Lemma 4.7-4.8 relate limit-average comparison and prefix-average comparison:

When limit-average of A and B exists, then

The first item falls directly from definitions. Lemma 4.9 (Pumping Lemma for ω-regular language [15] ) Let L be an ω-

for all i, w i = x i y i z i , |x i y i | ≤ p and |y i | > 0 and for every sequence of pumping factors

Theorem 4.7 The prefix-average comparator for ≥ is not ω-regular.

Select as factor w i (from Lemma 4.9) the sequence (0, 1) p . Pump each y i enough times so that the resulting word isŵ = (Â,

We discuss key ideas and sketch the construction of the prefix average comparator.

the difference between sum of i-length prefix of A and B.

For sequences A and B to satisfy

While reading a word, the prefix-sum difference is maintained by states and the stack of ω-PDA: states maintain whether it is negative or positive, while number of tokens in the stack equals its absolute value. The automaton non-deterministically guesses the aforementioned index N , beyond which the automaton ensure that prefix-sum difference remains positive.

The push-down comparator A ≥ PA consists of three states: (i) State s P and (ii) State s N that indicate that the prefix-sum difference is greater than zero and or not respectively, State transition between s N and s P occurs only if the stack action is to pop but the stack consists of k < |a − b| tokens. In this case, stack is emptied, state transition is performed and |a − b| − k tokens are pushed into the stack. For an execution of (A, B) to be an accepting run, the automaton non-deterministically transitions into state s F . State s F acts similar to state s P except that execution is terminated if there aren't enough tokens to pop out of the stack. A ≥ PA accepts by accepting state. To see why the construction is correct, it is sufficient to prove that at each index i, the number of tokens in the stack is equal to |Sum( 

We provide a sketch of the construction of the Büchi push-down autoamaton A ≥ PA , and then prove that it corresponds to the prefix average comparator.

Let µ be the bound on sequences. Then Σ = {0, 1, . . . , n} is the alphabet of

• Σ × Σ is the alphabet of the language.

• Γ = {Z 0 , α} is the push down alphabet.

• s 0 = s N is the start state of the push down automata.

• Z 0 is the start symbol of the stack.

• s F is the accepting state of the automaton. Automaton A ≥ PA accepts words by final state.

• Here we give a sketch of the behavior of the transition function δ.

-When A ≥ PA is in configuration (s P , τ ) for τ ∈ Γ, push a number of α-s into the stack.

Next, pop b number of α-s. If after popping k α-s where k < b, the PDA's configuration becomes (s P , Z 0 ), then first move to state (s N , Z 0 ) and then resume with pushing b − k α-s into the stack.

-When A ≥ PA is in configuration (s N , τ ) for τ ∈ Γ, push b number of α-s into the stack Next, pop a number of α-s. If after popping k α-s where k < a, the PDA's configuration becomes (s N , Z 0 ), then first move to state (s P , Z 0 ) and then resume with pushing a − k α-s into the stack.

-When A ≥ PA is in configuration (s P , τ ) for τ = Z 0 , first move to configuration (s F , τ ) and then push a number of α-s and pop b number of α-s. Note that there are no provisions for popping α if the stack hits Z 0 along this transition.

-

Note that there are no provisions for popping α if the stack hits Z 0 along this transition. On A ≥ PA this corresponds to the condition that there being only finitely many times when the PDA is in state N during the run of (A, B). This is ensured by the push down automaton since the word can be accepted only in state F and there is no outgoing edge from F . Therefore, every word that is accepted by

. If a run of (A, B) switches to F at this m, then it will be accepted by the push down automaton. Since A ≥ PA allows for non-deterministic move to (F, τ ) from (P, τ ), the run of (A, B) will always be able to move to F after index m. Hence,

Theorem 4. 8 The prefix-average comparator for relation ≥ is an ω-CFL.

While ω-CFL can be easily expressed, they do not possess closure properties, and problems on ω-CFL are easily undecidable. Hence, the application of ω-context-free comparator will require further investigation. The question we ask is whether a necessary and sufficient condition for an aggregate function to have an ω-regular compartor is that the function should be ω-regular?

We prove one side of the argument. We show that if a function is ω-regular, then its comparator will be ω-regular. A proof/disproof of the other direction is open for investigation.

The argument showing that every ω-regular function will have ω-regular comparators is given below: relations is also ω-regular.

We show that if an aggregate function is ω-regular under base β, then its comparator for relation > is ω-regular. By closure properties of ω-regular comparators, this implies that comparators of the aggregate function are ω-regular for all inequality and equality relations.

But first we prove that for a given integer base β ≥ 2 there exists an automaton

and β > 2 be an integer base. Let rep(a, β) = sign a · (Int(a, β), Frac(a, β)) and

Then, the following statements can be proven using simple evaluation from definitions:

• When sign a = + and sign b = −. Then a > b.

Since Int(a, β) and Int(b, β) eventually only see digit 0 i.e. they are necessarily identical eventually, there exists an index i such that it is the last position where Int(a, β) and Int(b, β) differ. If ) and Int(b, β) eventually only see digit 0 i.e. they are necessarily identical eventually. Therefore, there exists an index i such that it is the last position where Int(a, β) and

-Finally, if Int(a, β) = Int(b, β) and Frac(a, β) = Frac(b, β): Then a = b.

• When sign a = − and sign b = +. Then a < b.

Since the conditions given above are exhaustive and mutually exclusive, we conclude that for all a, b ∈ R and integer base β ≥ 2, let rep(a, β) = sign a · (Int(a, β), Frac(a, β))

. Then a > b iff one of the following conditions occurs:

1. sign a = + and sign b = −.

Note that each of these five condition can be easily expressed by a Büchi automaton over alphabet AlphaRep(β) for an integer β ≥ 2. For an integer β ≥ 2, the union of all these Büchi automata will result in a Büchi automaton A β such that for all a, b ∈ R and A = rep(a, β) and

Now we come to the main part of the proof. Let f : Σ ω → R be an ω-regular aggregate function with aggregate function automata A f . We will construct an ω-regular

above. Since A f and A β are both Büchi automata, the comparator for function f with relation > is also a Büchi auotmaton. Therefore, the comparator for aggregate

The converse direction is still open For all aggregate functions considered in this paper for which the comparator is ω-regular, the function has also been ω-regular.

But that is not a full proof of the converse direction. For now, due to the lack of any counterexample, we present the converse direction as a conjecture: for an aggregate function f is ω-regular for all inequality and equality relations, then its aggregate function is also ω-regular under base β.

This chapter studied comparator automata for well known aggregate functions, namely discounted-sum and limit average. Among these, only discounted-sum with integer discount factors can be represented by ω-regular comparators. In later chapters, we will observe that once can design ω-regular comparators for approximations to discounted-sum with non-integer discount factor (Chapter 7). To obtain a a broader classification of aggregate functions that permit ω-regular comparators, we show that ω-regular aggregate functions exhibit ω-regular comparators. However, we do not whether ω-regular functions is the sufficient condition for the existence of ω-regular comparators. We conjecture that may be the case, but leave that as an open question.

Quantitative inclusion with discounted-sum

Having laid out the theoretical framework of comparator automata in Part I, Part II and Part III demonstrate the efficacy of the integrated approach proposed by comparators.

This Chapter 3 studies Discounted-sum inclusion or DS inclusion in detail. Recall, in Theorem 4.5 we use comparator automata to establish that DS inclusion is PSPACEcomplete when the discount factor is an integer; The decidability of DS inclusion is still unknown when the discount factor is not an integer. This part will delve into solving DS inclusion in practice.

Chapter 5 conducts the first comparative study of the empirical performance of existing algorithms for DS inclusion with integer discount factors. These are a separation-of-techniques algorithm from prior work, and our ω-regular comparator algorithm for DS inclusion. Our analysis shows how the two approaches complement each other. This is a nuanced picture that is much richer than the one obtained from the complexity-theoretic study alone.

Chapter 6 picks up from where Chapter 5 ends. We prove that DS comparison languages are either safety or co-safety langauges. We show that when this property is utilized to solve DS inclusion using comparators, then comparator-based approach outperforms the separation-of-techniques algorithm on all accounts.

Chapter 7 shifts focus to solving DS inclusion with non-integer discount factors. Currently even its decidability is unknown. Therefore, to solve DS inclusion in practice, we design an anytime algorithm, which will either terminate with a crisp True or False answer after a finite amount of time, or continuously generate a tighter approximation. This algorithm makes use of comparator automata for approximations of DS with non-integer discount factor, which are shown to exhibit regularity.

We refer to a weighted ω-automata with the discounted-sum aggregate function by discounted-sum automata. In detail, a discounted-sum automaton with discount factor

is a Büchi automaton, and γ : δ → N is the weight function that assigns a weight to each transition of automaton M. Words and runs in weighted ω-automata are defined as they are in Büchi automata. Note that all states are accepting states in p ∈ S and for all a ∈ Σ, there exists q ∈ S s.t (p, a, q) ∈ δ. We abuse notation, and use w ∈ A to mean w ∈ L(A) for Büchi automaton or DS-automaton A. Given DS automata P and Q and discount-factor d > 1, the discounted-sum inclusion problem,

Chapter 5

Discounted-sum inclusion, or DS inclusion, is when quantitative inclusion is performed with the discounted-sum aggregation function. Prior work has demonstrated the applicability of DS inclusion in computation of rational solutions in multi-agent systems with rational agents [23] . Yet, the focus on DS inclusion has mostly been from a complexity-theoretic perspective, and not on algorithmic performance. Even in this thesis so far, comparator automata have been used to establish that DS inclusion with integer discount factor is PSPACE-complete. But whether these comparator-based algorithms are scalable and efficient has not been evaluated yet.

To this end, this chapter undertakes a thorough theoretical and empirical analysis of two contrasting approaches for DS inclusion with integer discount factors: our comparator-based integrated algorithm, and the prior known separation-of-techniques algorithm. We present the first implementations of these algorithms, and perform extensive experimentation to compare between the two approaches. Our analysis shows how the two approaches complement each other. This is a nuanced picture that is much richer than the one obtained from the complexity-theoretic study alone.

The hardness of quantitative inclusion for nondeterministic DS automata, or DS inclusion, is evident from PSPACE-hardness of language-inclusion (LI) problem for nondeterministic Büchi automata [87] . Decision procedures for DS inclusion were first investigated in [39] , and subsequently through target discounted-sum [32] , DSdeterminization [31] . The comparator-based argument [25] , presented in Chapter 4, finally established its PSPACE-completeness. However, these theoretical advances in DS inclusion have not been accompanied with the development of efficient and scalable tools and algorithms. This is the focus of this chapter; our goal is to develop practical algorithms and tools for DS inclusion.

Theoretical advances have lead to two algorithmic approaches for DS inclusion.

The first approach, referred to as DetLP, combines automata-theoretic reasoning with linear-programming (LP). This method first determinizes the DS automata [31] , and reduces the problem of DS inclusion for deterministic DS automata to LP [17, 18] .

Since determinization of DS automata causes an exponential blow-up, DetLP yields an exponential time algorithm. An essential feature of this approach is the separation of automata-theoretic reasoning-determinization-and numerical reasoning, performed by an LP-solver. Because of this separation, it does not seem easy to apply on-the-fly techniques to this approach and perform it using polynomial space, so this approach uses exponential time and space.

In contrast, the second algorithm for DS inclusion, referred to as BCV (after name of authors) is purely automata-theoretic [25] , was presented in Chapter 4-Section 4. 2 

handled by a special Büchi automaton, called the comparator, that enables an on-line comparison of the discounted-sum of a pair of weight-sequences. Aided by the comparator, BCV reduces DS inclusion to language-equivalence between Büchi automata.

Since language-equivalence is in PSPACE, BCV is a polynomial-space algorithm.

While the complexity-theoretic argument may seem to suggest a clear advantage for the pure automata-theoretic approach of BCV, the perspective from an implementation point of view is more nuanced. BCV relies on LI-solvers as its key algorithmic component. The polynomial-space approach for LI relies on Savitch's Theorem, which proves the equivalence between deterministic and non-deterministic space complexity [81] . This theorem, however, does not yield a practical algorithm. Existing efficient LI-solvers [4, 5] are based on Ramsey-based inclusion testing [7] or rank-based approaches [64] . These tools actually use exponential time and space. In fact, the exponential blow-up of Ramsey-based approach seems to be worse than that of DSdeterminization. Thus, the theoretical advantage BCV seems to evaporate upon close examination. Thus, it is far from clear which algorithmic approach is superior. To resolve this issue, we provide in this paper the first implementations for both algorithms and perform exhaustive empirical analysis to compare their performance.

Our first tool, also called DetLP, implements its namesake algorithm as it is.

We rely on existing LP-solver GLPSOL to perform numerical reasoning. Our second tool, called QuIP, starts from BCV, but improves on it. The key improvement arises from the construction of an improved comparator with fewer states. We revisit the reduction to language inclusion in [25] accordingly. The new reduction reduces the transition-density of the inputs to the LI-solver (Transition density is the ratio of transitions to states), improving the overall performance of QuIP since LI-solvers are known to scale better at lower transition-density inputs [71] Our empirical analysis reveals that theoretical complexity does not provide a full picture. Despite its poorer complexity, QuIP scales significantly better than DetLP,

although DetLP solves more benchmarks. Based on these observations, we propose a method for DS inclusion that leverages the complementary strengths of these tools to offer a scalable tool for DS inclusion. Our evaluation also exposes the limitations of 

As an example of such a problem formulation, consider the system and specification in Figure 5 .1a and Figure 5 .1b, respectively [39] . Here, the specification P depicts the worst-case energy-consumption model for a motor, and the system S is a candidate implementation of the motor. Transitions in S and P are labeled by transition-action and transition-cost. The cost of an execution (a sequence of actions) is given by an aggregate of the costs of transitions along its run (a sequence of automaton states).

In non-deterministic automata, where each execution may have multiple runs, cost of the execution is the cost of the run with maximum cost. A critical question here is to check whether implementation S is more energy-efficient than specification P . This problem can be framed as a problem of quantitative inclusion between S and P .

A purely complexity-theoretic analysis will indicate that DetLP will fare poorer than BCV in practice since the former is exponential in time and space whereas the later is only polynomial in space. However, an empirical evaluation of these algorithms reveals that opposite: DetLP outperforms BCV .

This section undertakes a closer theoretical examination of both algorithms in order to shed light on the seemingly anomalous behavior. Not only does our analysis explain the behavior but also uncovers avenues for improvements to BCV .

DetLP follows a separation-of-techniques approach wherein the first step consists of determinization of the DS automata and the second step reduces DS inclusion to linear programming. As one may notice, the first step is automata-based while the second is based on numerical methods.

Böker and Henzinger studied complexity and decision-procedures for determinization of DS automata in detail [31] . They proved that a DS automata can be deter- The value stored in the i-th place in the |S|-tuple represents the "gap" or extra-cost of reaching state q i over a finite-word w compared to its best value so far. The crux of the argument lies in proving that when the DS automata is complete and the discountfactor is an integer, the "gap" can take only finitely-many values, yielding finiteness of the determinized DS automata, albeit exponentially larger than the original. variables and O(τ P · µ s Q · |Σ|) constraints.

Therefore, the final complexity of DetLP is as follows:

Theorem 5.2 [17, 39] [Complexity of DetLP] Let P and Q be DS automata with s P and s Q number of states respectively, τ P states in P . Let the alphabet be Σ and maximum weight on transitions be µ.

Proof 29 Anderson and Conitzer [17] proved that this system of linear equations can be solved in O(m · n 2 ) for m constraints and n variables.

BCV is based on an integrated approach for DS inclusion wherein the entire algorithm uses automata-based reasoning only. Strictly speaking, BCV is based on a generic algorithm for inclusion under a general class of aggregate functions which permit ωregular comparators, presented in Chapter 4-Section 4.2. 1 . BCV (Algorithm 2) refers to its adaptation to DS. It is described in complete detail, for sake of clarity.

Recall, a run ρ ∈ P of word w ∈ L(P ) is a said to be dominated w.

The key idea behind BCV is that P ⊆ d Q holds iff every run of P is a dominated run w.r.t Q. Procedure AugmentWtAndLabel separates between runs of the same word in DS automata by assigning a unique transition-identity to each transition. It also appends the transition weight, to enable weight comparison afterwards. Specifically, it Algorithm 2 BCV(P, Q, d), Is P ⊆ d Q?

1: Input: Weighted automata P , Q, and discount-factor d 

comparator A µ,d ≤ is constructed with upper-bound µ that equals the maximum weight of transitions in P and Q, and discount-factor d. Intersect matches the alphabet of P ×Q with A µ,d ≤ , and intersects them. The resulting automaton DomWithWitness

DomWithWitness on the first three components ofP returns Dom which contains the word (w, wt P , id P ) iff it is a dominated run in P . Finally, language-equivalence between Dom andP returns the answer.

The proof for PSPACE-complexity of BCV relies on LI to be PSPACE. In practice, though, implementations of LI apply Ramsey-based inclusion testing [7] , rank-based methods [64] etc. All of these algorithms are exponential in time and space in the worst case. Any implementation of BCV will have to rely on an LI-solver. Therefore, in practice BCV is also exponential in time and space. In fact, we show that its worst-case complexity (in practice) is poorer than DetLP.

Another reason that prevents BCV from practical implementations is that it does not optimize the size of intermediate automata. Specifically, we show that the size and transition-density of Dom, which is one of the inputs to LI-solver, is very high (Transition density is the ratio of transitions to states). Both of these parameters are known to be deterrents to the performance of existing LI-solvers [6] , subsequently to BCV as well: 

Proof 30 It is easy to see that the number of states and transitions ofPQ are the same as those of P and Q, respectively. Therefore, the number of states and transitions inP ×Q are O(s P s Q ) and O(τ P τ Q ), respectively. The alphabet ofP ×Q is of the form (a, wt 1 , id 1 , wt 2 , id 2 ) for a ∈ Σ, wt 1 , wt 2 are non-negative weights bounded by µ and id i are unique transition-ids in P and Q respectively. The alphabet of comparator A µ,d ≤ is of the form (wt 1 , wt 2 ). To perform intersection of these two, the alphabet of comparator needs to be matched to that of the product, causing a blow-up in number of transitions in the comparator by a factor of |Σ| · τ P · τ Q . Therefore, the number of states and transitions in DomWithWitness and Dom is given by O(s P s Q s d )

and

The comparator is a non-deterministic Büchi automata with O(µ 2 ) states over an alphabet of size µ 2 [25] . Since transition-density δ = |S| · |Σ| for non-deterministic

Büchi automata, the transition-density of the comparator is O(µ 4 ). Therefore,

and δ P , δ Q and δ d be their transition-densities. Number of states and transition-

The corollary illustrates that the transition-density of Dom is very high even and n Q = wt 2 , where j = (i + 1) mod 2 if either s 1 or t 1 is an accepting state, and j = i otherwise. We call intersection over substring of alphabet Intersect. The following is easy to prove:

Intersect extends alphabet of A µ,d ≤ to match the alphabet ofP ×Q and Intersect selects a substring of the alphabet ofP ×Q as defined above. Then, L(A 1 ) ≡ L(A 2 ).

Intersect prevents the blow-up by |Σ|·τ P ·τ Q , resulting in only O(τ P τ Q τ d ) transitions

in Dom Therefore, Recall that language-inclusion queries areP ⊆ Dom and Dom ⊆P . Since Dom has many more states thanP , the complexity ofP ⊆ Dom dominates.

Theorem 5.2 and Theorem 5.3 demonstrate that the complexity of BCV (in practice) is worse than DetLP. This explains the inferior performance of BCV in practice.

The ealrier investigate explains why BCV does not lend itself to a practical implementation for DS inclusion ( § 5. 2.2) , and also identifies its drawbacks. This section proposes an improved comparator-based algorithm QuIP for DS inclusion, as is described in § 5. 3.2 . QuIP improves upon BCV by means of a new optimized comparator that we describe in §5. 3 .1.

The 2 

Proof 31 This follows proof by expansion of all terms. Specifically, expand out sequence C and X. In fact, the bounds used for the comparator carry over to this case as well. Sequence C is non-negative and is bounded by µ C = µ · d d−1 since −µ C is the minimum value of discounted-sum of V , and integer-sequence X is bounded by

On combining Lemma 5.5 with the bounds on X and C we get: 

Proof 32 Equations 1-2 from Lemma 5. Construction

where ⊥ is a special character, and x ∈ Z.

• Σ = {v : |v| ≤ µ} where v is an integer. 1 . Transitions from start state s: 

The construction of the DS comparator automata in Section 5. 3 .1 leads to an implementation-friendly QuIP from BCV. The core focus of QuIP is to ensure that Finally, the language of Dom is equated with that ofP −wt which is the automaton generated fromP after discarding weights from transitions. However, it is easy to prove that Dom ⊆P −wt . Therefore, instead of language-equivalence between Dom 

We provide implementations of our tools QuIP and DetLP and conduct experiments on a large number of synthetically-generated benchmarks to compare their performance.

We seek to find answers to the following questions:

1. Which tool has better performance, as measured by runtime, and number of benchmarks solved?

2. How does change in transition-density affect performance of the tools? 3 . How dependent are our tools on their underlying solvers?

We implement our tools QuIP and DetLP in C++, with compiler optimization o3 enabled. We implement our own library for all Büchi-automata and DS-automata operations, except for language-inclusion for which we use the state-of-the-art LIsolver RABIT [5] as a black-box. We enable the -fast flag in RABIT, and tune its JAVA-threads with Xss, Xms, Xmx set to 1GB, 1GB and 8GB respectively. We use the large-scale LP-solver GLPSOL provided by GLPK (GNU Linear Programming Kit) [2] inside DetLP. We did not tune GLPSOL since it consumes a very small percentage of total time in DetLP, as we see later in The universal automata is constructed on the restricted alphabet of only those weights that appear in the productP ×Q to include only necessary transitions. We also reduce its size with Büchi minimization tool Reduce [5] .

Since all states ofP ×Q are accepting, we conduct the intersection so that it avoids doubling the number of product states. This can be done, since it is sufficient to keep track of whether words visit accepting states in the universal.

To the best of our knowledge, there are no standardized benchmarks for DS-automata.

We 

Our experiments were designed with the objective to compare DetLP and QuIP. Due to the lack of standardized benchmarks, we conduct our experiments on randomlygenerated benchmarks. Therefore, the parameters for P ⊆ d Q are the number of states s P and s Q , transition density δ, and maximum weight wt. We seek to find answers to the questions described at the beginning of § 5.4.

Each instantiation of the parameter-tuple (s P , s Q , δ, wt) and a choice of tool between QuIP and DetLP corresponds to one experiment. In each experiment, the weighted-automata P and Q are randomly-generated with the parameters (s P , δ, wt)

and (s Q , δ, wt), respectively, and language-inclusion is performed by the chosen tool.

Since all inputs are randomly-generated, each experiment is repeated for 50 times to obtain statistically significant data. Each experiment is run for a total of 1000 sec on for a single node of a high-performance cluster. Each node of the cluster consists of two quad-core Intel-Xeon processor running at 2.83GHz, with 8GB of memory per node. The runtime of experiments that do not terminate within the given time limit is assigned a runtime of ∞. We report the median of the runtime-data collected from all iterations of the experiment. We investigate the above intuition by analyzing the runtime trends for both tools. Fig . 5 .3a plots the runtime for both tools. The plot shows that QuIP fares significantly better than DetLP in runtime at δ = 2. 5 . The plots for both the tools on logscale seem curved (Fig. 5.3a) , suggesting a sub-exponential runtime complexity. These were observed at higher δ as well. However, at higher δ we observe very few outliers on the runtime-trend graphs of QuIP at larger inputs when just a few more than 50%

of the runs are successful. This is expected since effectively, the median reports the runtime of the slower runs in these cases. We also determined how runtime performance of tools changes with increasing discount-factor d. Both tools consume lesser time as d increases.

Finally, we test for scalability of both tools. In Fig. 5 .4a, we plot the median of total runtime as s Q increases at δ = 2.5, 3 (s P = 10, µ = 4) for QuIP. We attempt to best-fit the data-points for each δ with functions that are linear, quadratic and cubic in s Q using squares of residuals method. Fig 5. 4b does the same for DetLP. We observe that QuIP and DetLP are best fit by functions that are linear and quadratic in s Q , respectively.

Our empirical analysis arrives at conclusions that a purely theoretical exploration would not have. First of all, we observe that despite having a the worse theoretical complexity, the median-time complexity of QuIP is better than DetLP by an order of n. In theory, QuIP scales exponentially in s Q , but only linearly in s Q in runtime.

Similarly, runtime of DetLP scales quadratically in s Q . The huge margin of complexity difference emphasizes why solely theoretical analysis of algorithms is not sufficient.

Earlier empirical analysis of LI-solvers had made us aware of their dependence on transition-density δ. As a result, we were able to design QuIP cognizant of parameter δ. Therefore, its runtime dependence on δ is not surprising. However 

In theory, DS inclusion for integer discount factors is PSPACE-complete [25] . Recent algorithmic approaches have tapped into language-theoretic properties of discountedsum aggregate function [25, 42] to design practical algorithms for DS inclusion [24, 25] . These algorithms use DS comparator automata (DS comparator, in short) as their main technique, and are purely automata-theoretic. While these algorithms outperform other existing approaches for DS inclusion in runtime [31, 39] , even these do not scale well on weighted-automata with more than few hundreds of states [24] . Therefore, this chapter meets the scalability challenge of DS inclusion by delving deeper into language-theoretic properties of discounted-sum aggregate functions. By doing so, we obtain algorithms for DS inclusion that render both tighter theoretical complexity and improved scalability in practice. Hence, we can conclude that comparator-based methods are effective in practice as well.

Organization This chapter goes as follows. Section 6.2 reviews classes of Büchi automata for which complementation can be performed without using Safra's complementation method [80] . Section 6.3 introduces safety/co-safety comparison languages and correspondingly their safety/co-safety comparator automata. We prove that quantitative inclusion performed with safety/co-safety comparators is more efficient than quantitative inclusion with ω-regular comparators. Section 6.4 studies DS aggregation in depth, and shows that DS comparion languages are either safety/cosafety languages. We use this newly discovered property to design much compact and deterministic constructions for DS comparators, and finally utilize these comparators to re-design DS inclusion. Finally, Section 6.5 conducts the empirical evaluation, and affirms the superiority of comparator-based methods for DS inclusion with integer discount factors.

We coin the terms Safraless automata to refer to Büchi automata for which complementation can be performed without Safra's Büchi complementation [80] . We review two classes of Safraless automata [63] : (a). Safety and co-safety automata, and (b).

Weak Büchi automata. For both of these classes of Safraless automata, complementation can be performed using simpler methods that use subset-construction. As a result, these automata incur a 2 O(n) blow-up as opposed to a 2 O(n log n) blow-up, where n is the number of states in the original automaton. In addition, subset-construction based methods are more amenable to efficient practical implementations since they can be performed on-the-fly.

Let L ⊆ Σ ω be a language over alphabet Σ. A finite word w ∈ Σ * is a bad prefix for L if for all infinite words y ∈ Σ ω , x · y / ∈ L. A language L is a safety language if every word w / ∈ L has a bad prefix for L. A language L is a co-safety language if its complement language is a safety language [13] . When a safety or co-safety language is an ω-regular language, the Büchi automaton representing it is called a safety or co-safety automaton, respectively [63] . Wlog, safety and co-safety automaton contain a sink state from which every outgoing transitions loops back to the sink state and there is a transition on every alphabet symbol. All states except the sink state are accepting in a safety automaton, while only the sink state is accepting in a co-safety automaton. The complementation of safety automaton is a co-safety automaton, and vice-versa. Safety automata are closed under intersection, and co-safety automata are closed under union.

A Büchi automaton A = (S, Σ, δ, s I , F) is weak [64] if its states S can be partitioned

partitions are either accepting or rejecting, and (b). There exists a partial order ≤ on the collection of the S i such that for every s ∈ S i and t ∈ S j for which t ∈ δ(s, a)

for some a ∈ Σ, S j ≤ S i i.e. The transition function is restricted so that in each transition, the automaton either stays at the same set or moves to a set smaller in the partial order.

It is worth noting that safety and co-safety automata are also weak Büchi automata. In both of these cases, the states of the regular safety/co-safety automata are partitioned into two disjoint sets: The first set contains all but the sink state, and the second set contains the sink state only. Note how in both cases, the sets have either all accepting states or all non-accepting states, and the transitions are unidirectional order w.r.t to the partitions.

We make the observation that the intersection of a safety and co-safety automaton is indeed a weak Büchi automata. We prove it as follows: Based on the transition relation, we assign the order of partitions as follows:

• Partition (S C ,S S ) has highest order

• Parition ({sink C }, {sink S }) has lowest order • Other two partitions take any order in between, since there are no transitions in between these paritions.

Clearly, the partition (S C ,{sink S }) is the only accepting partition.

We introduce comparator automata that are safety and co-safety automata Section 6.3.1. Section 6.3.2 designs an algorithm for quantitative inclusion with regular safety/co-safety comparators. We observe that the complexity of quantitative inclusion with safety/co-safety comparator automata is complexity is lower than that with ω-regular comparators.

We begin with formal definitions of safety/co-safety comparison languages and safety/co-safety comparators: R is said to be a safety comparison language (or a co-safety comparison language) if L is a safety language (or a co-safety language).

Let Σ be a finite set of integers, f : Z ω → R be an aggregate function, and R ∈ {≤, <, ≥, >, =, =} be a relation.

A comparator for aggregate function f and relation R is a safety comparator (or cosafety comparator) is the comparison language for f and R is a safety language (or co-safety language).

A safety comparator is regular if its language is ω-regular (equivalently, if its automaton is a safety automaton). Likewise, a co-safety comparator is regular if its language is ω-regular (equivalently, automaton is a co-safety automaton).

Just like the closure property of ω-regular comparison languages, safety and cosafety comparison languages exhibit closure under safety or co-safety languages as well. By complementation duality of safety and co-safety languages, comparison language for an aggregate function f for non-strict inequality ≤ is safety iff the comparison language for f for strict inequality < is co-safety. Since safety languages and safety automata are closed under intersection, safety comparison languages and regular safety comparator for non-strict inequality renders the same for equality. Similarly, since co-safety languages and co-safety automata are closed under union, co-safety comparison languages and regular co-safety comparators for non-strict inequality render the same for the inequality relation. As a result, if the comparison language for any one relation is either a safety or co-safety language, the comparison languages for all relations will also be safety or co-safety languages. Therefore, it suffices to examine the comparison language for one relation only.

This section covers the first technical contributions of this work. It studies safety and co-safety properties of comparison languages and comparators, and utilizes them to obtain tighter theoretical upper-bound for quantitative inclusion with these comparators that exhibit safety/co-safety and ω-regular properties as opposed to comparators that are only ω-regular.

A run of word w in a weighted-automaton is maximal if its weight is the supremum weight of all runs of w in the weighted-automaton. A run ρ P of w in P is a counterexample for P ⊆ Q (or P ⊂ Q) iff there exists a maximal run sup Q of w in Q such that wt(ρ P ) > wt(sup Q ) (or wt(ρ P ) ≥ wt(sup Q )). Consequently, P ⊆ Q (or P ⊂ Q) iff there are no counterexample runs in P . Therefore, the roadmap to solve quantitative inclusion for regular safety/co-safety comparators is as follows:

1. Use regular safety/co-safety comparators to construct the maximal automaton of Q i.e. an automaton that accepts all maximal runs of Q (Corollary 6.1).

construct a counterexample automaton that accepts all counterexample runs of the inclusion problem P ⊆ Q (or P ⊂ Q) (Lemma 6.3). 3 . Solve quantitative inclusion for safety/co-safety comparator by checking for emptiness of the counterexample (Theorem 6.2).

Let W be a weighted automaton. Then the annotated automaton of W , denoted byŴ , is the Büchi automaton obtained by transforming transition s

Observe thatŴ is a safety automaton since all its states are accepting. A run on word w with weight sequence wt in W corresponds to an annotated word (w, wt) inŴ , and vice-versa.

This section covers the construction of the maximal automaton from a weighted automaton. Let W andŴ be a weighted automaton and its annotated automaton, respectively. We call an annotated word (w, wt 1 ) inŴ maximal if for all other words of the form (w, wt 2 ) inŴ , wt(wt 1 ) ≥ wt(wt 2 ). Clearly, (w, wt 1 ) is a maximal word inŴ iff word w has a run with weight sequence wt 1 in W that is maximal. We define maximal automaton of weighted automaton W , denoted Maximal(W ), to be the automaton that accepts all maximal words of its annotated automataŴ .

We show that when the comparator is regular safety/co-safety, the construction of the maximal automata incurs a 2 O(n) blow-up. This section exposes the construction for maximal automaton when comparator for non-strict inequality is regular safety.

The other case when the comparator for strict inequality is regular co-safety is similar, hence skipped. Proof 34 Intuitively, an annotated word (w, wt 1 ) is not maximal inŴ for one of the following two reasons: Either (w, wt 1 ) is not a word inŴ , or there exists another word (w, wt 2 ) inŴ s.t. wt(wt 1 ) < wt(wt 2 ) (equivalently (wt 1 , wt 2 ) is not in the comparator non-strict inequality). BothŴ and comparator for non-strict inequality are safety languages, so the language of maximal words must also be a safety language.

In detail, we show that every annotated word that is not a maximal word inŴ must have a bad-prefix. An annotated word (w, wt 1 ) is not maximal inŴ for one of two reasons: Either (w, wt 1 ) is not a word inŴ , or there exists another word (w, wt 2 ) inŴ s.t. wt(wt 1 ) < wt(wt 2 ).

If (w, wt 1 ) is not a word inŴ , sinceŴ is a safety automata as well, (w, wt 1 ) must have a bad-prefix w.r.tŴ . The same bad-prefix serves as a bad-prefix w.r.t.

the maximal language since none of its extensions are inŴ either.

Otherwise since the comparator for < is ω-regular co-safety, there exists an nlength prefix of (wt 1 , wt 2 ) s.t. all extensions of (wt 1 , wt 2 )[n] are present in the ωregular co-safety automaton. We claim the prefix of (w, wt 1 ) of same length n is a bad-prefix for the maximal language. Consider the n-length prefix (w, wt 1 )[n] of (w, wt 1 ). Let (w ext , wt ext ) be an infinite extension of (w, wt)[n]. If (w ext , wt ext ) is not a word inŴ , it is also not a maximal word inŴ .

If (w ext , wt ext ) is a word inŴ , Let (w ext , wt ext ) be and extensions of (w, wt 2 )[n].

Note that the first component of extension of (w, wt 1 )[n] and (w, wt 2 )[n] are the same.

This is possible since the underlying Büchi automata of W is a complete automaton.

Since (wt ext , wt ext ) is an extension of (wt 1 , wt 2 )[n], wt(wt ext ) < wt(wt ext ). Therefore, (w ext , wt ext ) is not a maximal word.

We now proceed to construct the safety automata for Maximal ( The challenge here is to construct an automaton for condition (b). Intuitively, this automaton simulates the following action: As the automaton reads word (w, wt 1 ), it must spawn all words of the form (w, wt 2 ) inŴ , while also ensuring that wt(wt 1 ) ≥ wt(wt 2 ) holds for every word (w, wt 2 ) inŴ . SinceŴ is a safety automaton, for a word (w, wt 1 ) ∈Σ ω , all words of the form (w, wt 2 ) ∈Ŵ can be traced by subsetconstruction. Similarly since the comparator C for non-strict inequality (≥) is a safety automaton, all words of the form (wt 1 , wt 2 ) ∈ C can be traced by subset-construction as well. The construction needs to carefully align the word (w, wt 1 ) with the all possible (w, wt 2 ) ∈Ŵ and their respective weight sequences (wt 1 , wt 2 ) ∈ C.

Construction and detailed proof Let W be a weighted automaton, with annotated automatonŴ and C denote its regular safety comparator for non-strict A similar construction proves that the maximal automata of weighted automata W with regular safety comparator C for strict inequality contains |W | · 2 O(|W |·|C|) states. The difference is that in this case the maximal automaton may not be a safety automaton. But it will still be a weak Büchi automaton. Therefore, Lemma 6.2 generalizes to: 

This section covers the construction of the counterexample automaton. Given weighted automata P and Q, an annotated word (w, wt P ) in annotated automataP is a counterexample word of P ⊆ Q (or P ⊂ Q) if there exists (w, wt Q ) in Maximal(Q) s.t.

wt(wt P ) > wt(wt Q ) (or wt(wt P ) ≥ wt(wt Q )). Clearly, annotated word (w, wt P ) is a counterexample word iff there exists a counterexample run of w with weight-sequence wt P in P .

For this section, we abbreviate strict and non-strict to strct and nstrct, respectively.

For inc ∈ {strct, nstrct}, the counterexample automaton for inc-quantitative inclusion, denoted by Counterexample(inc), is the automaton that contains all counterexample words of the problem instance. We construct the counterexample automaton as follows: 

In this section, we give the final algorithm for quantitative inclusion with regular safety/co-safety comparators, and contrast the worst-case complexity of quantitative inclusion with ω-regular and regular safety/co-safety comparators.

Theorem 6.2 Let P , Q be weighted-automata with regular safety/co-safety comparators. Let C ≤ and C < be the comparators for ≤ and <, respectively. Then

• Strict quantitative inclusion P ⊂ Q is reduced to emptiness checking of a Büchi automaton of size |P ||C ≤ ||Q| · 2 O(|Q|·|C<|) .

• Non-strict quantitative inclusion P ⊆ Q is reduced to emptiness checking of a Büchi automaton of size |P ||C < ||Q| · 2 O(|Q|·|C<|) .

Proof 37 Strict and non-strict are abbreviated to strct and nstrct, respectively. For inc ∈ {strct, nstrct}, inc-quantitative inclusion holds iff Counterexample(inc) is empty.

Size of Counterexample(inc) is the product of size of P , Maximal(Q) (Corollary 6.1), and the appropriate comparator as described in Lemma 6. 3 .

In contrast, quantitative inclusion with ω-regular comparators reduces to emptiness of a Büchi automaton with |P | · 2 O(|P ||Q||C|·log(|P ||Q||C|)) states [25] . The 2 O(n log n) blow-up is unavoidable due to Büchi complementation. Clearly, quantitative inclusion with regular safety/co-safety has lower worst-case complexity.

This section covers the second technical contributions of this chapter. It uncovers that DS comparison languages are safety/co-safety, and utilizes this information to obtain tighter theoretical upper-bound for DS inclusion when the discount factor is an integer. Unless mentioned otherwise, the discount-factor is an integer.

In We adopt these definitions in the Section 6.4.

A note on notation: Throughout this section, the concatenation of finite sequence

x with finite or infinite sequence y is denoted by x · y in the following.

The central result of this section is that DS comparison languages are safety or cosafety languages for all (integer and non-integer) discount-factors (Theorem 6.3).

In particular, since DS comparison languages are ω-regular for integer discountfactors [25] , this implies that DS comparators for integer discount-factors form safety or co-safety automata (Corollary 6.2). The argument for safety/co-safety of DS comparison languages depends on the property that the discounted-sum aggregate of all bounded weight-sequences exists for all discount-factors d > 1 [79] . Proof 38 Due to duality of safety/co-safety languages, it is sufficient to show that DS-comparison language with ≤ is a safety language.

Let us assume that DS-comparison language with ≤ is not a safety language. Let W be a weight-sequence in the complement of DS-comparison language with ≤ such that it does not have a bad prefix.

Since W is in the complement of DS-comparison language with ≤, DS (W, d) > 0.

By assumption, every i-length prefix W [i] of W can be extended to a bounded weight-

The contribution of tail sequences W [i . . . ] and Y i to the discounted-sum of W and W [i] · Y i , respectively diminishes exponentially as the value of i increases. In addition, since and W and W [i] · Y i share a common i-length prefix W [i], their discounted-sum values must converge to each other. The discounted sum of W is fixed and greater than 0, due to convergence there must be a k ≥ 0 such that DS (W [k] · Y k , d) > 0. Contradiction. Therefore, DS-comparison language with ≤ is a safety language.

The above intuition is formalized as follows: Since DS (W, d) > 0 and 

Therefore, DS-comparator with ≤ is a safety comparator.

Semantically this result implies that for a bounded-weight sequence C and rational discount-factor d > 1, if DS (C, d) > 0 then C must have a finite prefix C pre such that the discounted-sum of the finite prefix is so large that no infinite extension by bounded weight-sequence Y can reduce the discounted-sum of C pre · Y with the same discount-factor d to zero or below.

Prior work shows that DS-comparison languages are expressed by Büchi automata iff the discount-factor is an integer [26] . Therefore: Lastly, it is worth mentioning that for the same reason [26] DS comparators for non-integer rational discount-factors do not form safety or co-safety automata.

This section issues deterministic safety/co-safety constructions for DS comparators with integer discount-factors. This is different from prior works since they supply non-deterministic Büchi constructions only [24, 25] . An outcome of DS comparators being regular safety/co-safety (Corollary 6.2) is a proof that DS comparators permit deterministic Büchi constructions, since non-deterministic and deterministic safety automata (and co-safety automata) have equal expressiveness [63] . Therefore, one way to obtain deterministic Büchi construction for DS comparators is to determinize the non-deterministic constructions using standard procedures [63, 80] . However, this will result in exponentially larger deterministic constructions. To this end, this section offers direct deterministic safety/co-safety automata constructions for DS comparator that not only avoid an exponential blow-up but also match their non-deterministic counterparts in number of states (Theorem 6.5).

Due to duality and closure properties of safety/co-safety automata, we only present the construction of deterministic safety automata for DS comparator with upper bound µ, integer discount-factor d > 1 and relation ≤, denoted by A µ,d ≤ . We proceed by obtaining a deterministic finite automaton, (DFA), denoted by bad(µ, d, ≤), for the language of bad-prefixes of A µ,d ≤ (Theorem 6.4). Trivial modifications to bad(µ, d, ≤)

will furnish the coveted deterministic safety automata for A µ,d ≤ (Theorem 6.5).

We begin with some definitions. Let W be a finite weight-sequence. By abuse of notation, the discounted-sum of finite-sequence W with discount-factor d is defined There are infinitely many possibilities for the values of recoverable gap. We need to limit the recoverable gap values to finitely many values of interest. The core aspect of this construction is to identify these values.

First, we obtain a lower bound on recoverable gap for bad-prefixes of A µ,d ≤ :

Lemma 6.4 Let µ and d > 1 be the bound and discount-factor, resp. Let T = µ d−1 be the threshold value. Let W be a non-empty, bounded, finite weight-sequence. Weight

Proof 40 Let a finite weight-sequence W be a bad-prefix of A µ,d ≤ .

Then, DS (W · Y , d) > 0 for all infinite and bounded weight-sequences Y . Since

Since all finite and bounded extensions of bad-prefixes are also bad-prefixes, Lemma 6.4 implies that if the recoverable-gap of a finite sequence is strinctly lower that threshold T, then recoverable gap of all of its extensions also exceed T. Since recoverable gap exceeding threshold T is the precise condition for bad-prefixes, all states with recoverable gap exceeding T can be merged into a single state. Note, this state forms an accepting sink in bad(µ, d, ≤).

Next, we attempt to merge very low recoverable gap value into a single state. Also, its run in the DFA bad(µ, d, ≤) is the single-state sequence 0, where 0 is the initial state of the DFA.Therefore, the hypothesis is true for the base case.

Let the hypothesis be true for all words of length n. We prove that the hypothesis can be extended to words of length n + 1.

Let W = V · a be a word of length n + 1, where V is a word of length n and last V be its last state . By induction hypothesis,

If gap(W, d) > T , we show that last = bad.

If last V = bad, then by construction bad is a sink state, in which case last = bad. 

Since veryGood is a sink state, none of its transitions will go to state bad.

Conversely, let last = bad. We show that gap(W, d) > T . Sink bad is the only accepting state, a words visits bad iff its recoverable gap is strictly above T , the DFA accepts the language of bad-prefixes.

Finally, the DS comparators are constructed as follows:

Theorem 6.5 Let µ be the upper bound, and d > 1 be the integer discount-factor.

DS comparator for all inequalities and equality are either deterministic safety or deterministic co-safety automata with O(µ) states.

This proof should also be treated as an exercise that illustrates that DFA bad(µ, d, ≤) is the basis of all our constructions. In fact, the analysis of bad-prefixes and very-good prefixes for the one case of the ≤ relation is sufficient.

For >: Syntactically, DFA bad(µ, d, ≤) is also a deterministic co-safety automaton.

We claim that this deterministic co-safety automaton accepts the DS comparison language for the relation >. This is true due to the complementation duality of the definitions of safety and co-safety languages. A sequence A is in a co-safety language L iff it has a finite length prefix B that is a bad prefix for the complement safety language L. The missing link here is that DS comparison languages for > and DS comparison language for ≤ are complements of each other. Finally, the safety automata for DS comparison language with = simply flips the accepting and non-accepting states of the co-safety automata for =.

As a matter of fact, the most compact non-deterministic DS comparator constructions with parameters µ, d and R also contain O(µ) states [24] . In fact, we can go a step further and prove that the deterministic Büchi automaton constructed in Thereom 6.5 is a minimal DFA, and hence it must be the case that the safety/co-safety automata for DS comparison language for ≤ or > are minimal DBA as well.

In order to prove that DFA bad(µ, d, ≤) is minimal, we prove that no two states are equivalent as per Myhill-Nerode equivalence relation [74] , and proving minimal form [59] . As per this relation, two states s and t are said to be equivalent, denoted Hence, in this section, not only did we construct safety/co-safety forms for DS comparators, our constructions are also the minimal deterministic representations.

These are the most efficient deterministic constructions possible for DS comparators.

Finally, we solve DS inclusion with integer discount factors with using the safety and co-safety comparators presented in the previous Section 6. 4 . 2 . Here we will apply the generic algorithm for quantitative inclusion with regular safety/co-safety comparators (Section 6. 3.2) to the case of discounted-sum with integer discount factors. As seen before in Theorem 6.2, the expectation is that the resulting algorithm for DS inclusion with integer discount factor will have an improved complexity. Lastly, we make use of the subset construction based methods in order to present an improved on-the-fly algorithm for DS inclusion, called QuIPFly.

The following statement instantiates Theorem 6.2 for discounted-sum aggregation.

Note that the difference in the formal statement is that here we are able to completely characterize the type of the Büchi automaton as well:

Theorem 6.7 Let P and Q be weighted-automata, and C be a regular safety/cosafety comparator for an inequality with integer discount factor d > 1.

• Strict DS-inclusion P ⊂ Q is reduced to emptiness checking of a safety automaton of size |P ||C||Q| · 2 O(|Q|·|C|) .

• Non-strict DS-inclusion P ⊆ Q is reduced to emptiness checking of a weak-Büchi automaton [64] of size |P ||C||Q| · 2 O(|Q|·|C|) .

Proof 46 Unlike Theorem 6.2, the Büchi automaton for emptiness checking has a specific form: safety or weak-Büchi [64] . We can characterize these forms since we know that the DS comparator for strict inequality and equality are co-safety and safety automata respectively (Corollary 6.2).

Recall, that the Büchi automata that is being checked for emptiness is the counterexample automaton. In essence, the counterexample automaton is formed by an intersection of the maximal automaton with an appropriate comparator (Lemma 6.3).

Furthermore, we know that when the comparator for non-strict inequality is safety, as for discounted sum, the maximal automaton will be a safety automata (Lemma 6.1).

Consequently, in case of strict DS inclusion, the counterexample automata is constructed by the intersection of a safety maximal automaton with the regular safety DS-comparator for non-strict inequality. Due to closure of safety automata under intersection, the resulting counterexample automata will be a safety automata.

In case of non-strict DS-inclusion, the counterexample automata is constructed by the intersection of the safety maximal automaton with the regular co-safety DScomparator for strict inequality. As a consequence of Theorem 6.1, the counterexample automaton is weak-Büchi.

Lastly, the size of the resulting safety/weak Büchi automaton is given by the product of the minimal automata (Corollary 6.1) and the comparator. Recall, that the size of all comparators for discounted sum are identical (Theorem 6.5 and Theorem 6.6).

Hence, all are given a size of |C|.

Therefore, the final word is: Clearly, this is an improvement over the worst-case complexity of QuIP.

The algorithm for DS inclusion with integer discount factor d > 1 proposed in Theorem 6.7 checks for emptiness of the counterexample automata. A naive algorithm will construct the counterexample automata fully, and then check if they are empty by ensuring the absence of an accepting lasso.

In QuIPFly, we implement a more efficient algorithm. In our implementation, we make use of the fact that counterexample automata is a safety/weak Büchi that is constructed from the intersection of a safety maximal automata and an appropriate determinisitc safety/co-safety DS comparator. Recall that maximal automata can be constructed in an on-the-fly fashion since it involves subset construction like techniques. Since, the comparators are also deterministic, they can also be designed in Note that the on-the-fly mechanism does not alter the worst-case performance of DS inclusion, but it contribute to making the algorithm (a) more efficient as it can terminate as soon as an accepting lasso is found, and (b) less memory exhaustive, since it usually doesn't require to store the entire counterexample automata in memory at any given time.

The goal of the empirical analysis is to examine performance of DS-inclusion with integer discount-factor with safety/co-safety comparators against existing tools to investigate the practical merit of our algorithm. Our optimized on-the-fly algorithm for safety/co-safety comparator-based approach for DS inclusion with integer discount factors is implemented in our prototype called QuIPFly. QuIPFly is written in Python We compare these tools along two axes: runtime and number of benchmarks solved.

Due to lack of standardized benchmarks for weighted automata, we follow a standard approach to performance evaluation of automata-theoretic tools [6, 71, 86] by experimenting with randomly generated benchmarks, using random benchmark generation procedure described in [24] .

The parameters for each experiment are number of states s P and s Q of weighted automata, transition density δ, maximum weight wt, integer discount-factor d, and

inc ∈ {strct, nstrct}. In each experiment, weighted automata P and Q are randomly 

For clarity of exposition, we present the observations for only one parameter-tuple.

Trends and observations for other parameters were similar. 

The goal of this chapter was to improve comparator-based solutions for DS inclusion.

To this end, here we further the understanding of language-theoretic properties of discounted-sum aggregate function by demonstrating that DS-comparison languages form safety and co-safety languages. These properties are utilized to obtain a decision procedure for DS-inclusion that offers both tighter theoretical complexity, improved scalability, and better performance than separation-of-techniques. To the best of our knowledge, this is the first work that applies language-theoretic properties such as safety/co-safety in the context of quantitative reasoning. In doing so, we demonstrate that the close integration of structural analysis and numerical analysis using comparator automata can enhance algorithms in quantitative reasoning since they can benefit from the abundance of work in qualitative reasoning.

The theoretical and practical progress in DS inclusion so far have been made for the case when the discount factor is an integer. The problem of complexity of DS inclusion with non-integer discount factors is still very much open -so much so that even its decidability is a mystery so far. In wake of the unknown decidability of DS inclusion with non-integer discount factor, this chapter takes the approach of developing pragmatic solutions to the problem. In this case, we develop an anytime algorithm [49] for DS inclusion for non-integer discount factors.

In its most generality, the decidability of DS inclusion is still unknown [39] . The goal of this work is to build a pragmatic solution for DS inclusion despite its unknown decidability. Prior results have shown that the parameter that determines the rate at which weights loose their significance, called the discount factor, primarily governs the hardness of DS inclusion: When the discount factor is an integer, DS inclusion is PSPACE-complete [25] ; On the contrary, when the discount factor is a non-integer, decidability of DS inclusion is unknown, and has been an open problem for more than a decade [39] . This work strives to address DS inclusion for non-integer discount factors, since most practical applications of DS aggregate function, such as reinforcement learning [85] , planning under uncertainty [78] , and game theory [75] , make use of non-integer discount factors. In particular, these applications require that the discount factor d > 1 to be very close to 1, i.e. 1 < d < 2. Therefore, this work focuses on DS inclusion when 1 < d < 2.

This work addresses DS inclusion for non-integer discount factors. In particular, we investigate the case when the discount factor d > 1 is very close to 1, i.e. 1 < d < 2, since that is the value used in practice. Our focus is not on resolving whether DS inclusion is decidable for 1 < d < 2 but to design pragmatic solutions for DS inclusion despite its decidability being unknown.

A natural step in this direction is to solve approximations of DS inclusion. However, we discover that approximations of DS inclusion is at least as hard as DS inclusion itself (Theorem 7.1), since DS inclusion reduces to its approximations in polynomial time. Hence, the decidability of approximations of DS inclusion is also currently unknown. As a result, we are faced with the challenge to design a solution for DS inclusion with 1 < d < 2 without the ability to develop algorithms for DS inclusion or its approximations.

To address this challenge, we turn to anytime algorithms that have been used extensively to design pragmatic solution in AI for problems with high computational complexity or even undecidability [49] . Anytime algorithms are a class of algorithms that generate approximate answers quickly and proceed to construct progressively better approximate solutions over time [49] . In the process, they may either generate an exact solution and terminate, or continuously generate a better approximation.

In the context of designing a pragmatic solution for DS inclusion with 1 < d < 2 , we design an anytime algorithm for the same. In addition, our anytime algorithm is co-computational enumerable, i.e., it is guaranteed to terminate on input instances on which DS inclusion does not hold. The algorithm gives only approximate answers otherwise.

The formalism is detailed here. Let P and Q be weighted automata, and let 1 < d < 2 be the discount factor. We say P is DS included in Q, denoted by P ⊆ Q, if weight of all executions in P is less than or equal to that in Q. So, our anytime algorithm, called DSInclusion, either terminates and returns a crisp True or False answer to P ⊆ Q, or it continuously generates an approximation that P is d · ε-close to Q, in a precise sense defined below. If an execution of DSInclusion is interrupted due to external factors such as manual interference or resource overflow, then the algorithm will return the most recently computed d · ε-close approximation.

Additionally, if P ⊆ Q does not hold, then the algorithm is guaranteed to terminate after a finite amount of time. Note that we do not prove termination when P ⊆ Q holds. But this is not surprising, because if we could then we would have proven decidability of DS inclusion for 1 < d < 2.

The core of our algorithm is anytimeInclusion (Overview: Another advantage of our anytime algorithm is that it can be run upto a desired precision 0 < ε c < 1. For this the recursive procedure DSInclusion will be run till it is invoked with approximation factor ε c in the worst case, i.e., in the worst-case DSInclusion will be invoked for O(log 1 εc ) times, each time with a lower approximation factor. In this case, the complexity of running DSInclusion is linear in 1 εc , exponential in size of the input DS automata and exponential in 1 (d−1) 2 . This shows that as d and ε c become smaller solving upto a desired precision explodes rapidly. These observations are expected, as they corroborate with the known result on undecidability of suminclusion (Loosely speaking, DS inclusion with d = 1, and ε = 0) [11, 62] .

In conclusion, this work does not resolve the decidability debate on DS inclusion with 1 < d < 2. Instead, it designs an anytime algorithm that renders approximations with theoretical guarantees, and time-bounds when the precision value is fixed. Thus, our algorithm is suitable for pragmatic purposes.

This section defines terminology and notation used in the rest of this chapter. A key difference in this chapter is that we solve DS inclusion over finite words. A similar proof can be adapted for infinite words.

A finite sequence of weights is said to be bounded by µ > 0 if the absolute value of all weights in the sequence are less than or equal to µ.

A finite-state automaton [87] is a tuple A = (S , Σ, δ, Init, F), where S is a finite set of states, Σ is a finite input alphabet, δ ⊆ (S × Σ × S ) is the transition relation, Init ⊆ S is the set of initial states, and F ⊆ S is the set of accepting states. A finitestate automaton is deterministic if for all states s and inputs a, |{s |(s, a, s ) ∈ δ}| ≤ 1 and |Init| = 1; otherwise, it is nondeterministic. Deterministic and non-determinsitic finite-state automata are denoted by DFA and NFA, respectively. An NFA is complete if for all s ∈ S and a ∈ Σ, there exists a transition (s, a, t) ∈ δ for some state t ∈ S . For a word w = w 0 w 1 . . . w m ∈ Σ * , a run ρ of w is a sequence of states s 0 s 1 . . . s m+1 , such that s 0 ∈ Init, and τ i = (s i , w i , s i+1 ) ∈ δ for all i. A run ρ is accepting if its last state s m+1 ∈ F. A word w is accepted by NFA/DFA if it has an accepting run. The weight of word w ∈ Σ * in weighted automata is defined as wt(w, A) = max{f (ρ)|ρ is a run of w in A}. The problem of f -inclusion compares the weight of words in two weighted automata with the same aggregate function. Given weighted automata P and Q with aggregate function f : N * → R, P is said to be f -included in Q if for all words w ∈ Σ * , wt(w, P ) ≤ wt(w, Q).

This work studies the discounted-sum inclusion problem. The discounted sum (DS) of a finite sequence A = a 0 . . . a n for discount factor d > 1, denoted DS (A, d) = a 0 + a 1 d · · · + an d n . DS automata with discount factor d > 1 are weighted automata with discounted sum aggregation with discount factor d. Therefore, given two DS automata P and Q with the same discount factor, P is DS-included in Q if weight of every word in P is less than or equal to that in Q. We reserve the notation P ⊆ Q to denote P is DS-included in Q. While DS inclusion is PSPACE-complete when the discount factor is an integer, its decidability is still open for non-integer discount factors [32, 39] . For sake of clearer exposition, this paper assumes that the weight along transitions in a DS automata are non-negative integers. There are no technical differences in extending the result to integer weights.

In a similar vein to recent progess on DS inlcusion for integer discount factors [24, 28] , this work makes use of comparator automata to design the anytime algorithm.

Comparator automata and comparison languages are defined as follows: For a finite set of integers Σ, an aggregate function f : Z * → R, and equality or inequality relation R ∈ {<, >, ≤, ≥, =, =}, the comparison language for f and R is a language over the alphabet Σ that accepts a word A ∈ Σ * iff f (A) R 0 holds. A comparator automaton (comparator, in short) for f and R is an automaton that accepts the comparison language for f and R [25] . A comparator is regular if its automaton is an NFA. Prior work has shown that comparator for discounted sum (DS comparator, in short) is regular for all R iff the discount factor is an integer [26] . Regularity of the comparator has been crucial to the progress in DS inclusion for integer discount factors. Therefore, another contribution in this work is to define an approximation for discounted-sum with non-integer discount factor so that its comparator is regular.

We will use these regular comparators to design the anytime algorithm.

Before delving into the technical details of the anytime algorithm, we give a roadmap of the approach. Notions of approximations are central to anytime algorithms, so we begin with defining approximations to DS inclusion: Definition 7.1 (d · ε-close approximation) Given DS automata P and Q with discount factor d > 1, and an approximation parameter ε > 0, P is said to be d · εclose to Q, denoted by P ⊆ Q+d·ε, if for all words w ∈ Σ ω , wt(w, P ) ≤ wt(w, Q)+d·ε.

Definition 7.2 (d · ε-far approximation) Given DS automata P and Q with dis-count factor d > 1, and an approximation parameter ε > 0, P is said to be d · ε-far

Given the open decidability status of DS inclusion, the next alternative is to develop algorithms for the aforementioned approximations. That is not possible either, however, for the following reason: 1. There exists a polynomial time reduction from DS inclusion to d · ε-close approximation for an approximation factor ε > 0.

2. There exists a polynomial time reduction from DS inclusion to d · ε-far approximation for an approximation factor ε > 0.

The intuitive argument behind showing that P ⊆ Q reduces to P ⊆ Q + d · ε for an ε > 0 is to show that one can transform Q into a weighted automaton

Similarly, to show that P ⊆ Q reduces to P ⊆ Q − d · ε, we show that one can transform Q into Q such that Q = Q+d·ε. An direct corollary of Theorem 7.1 is that the decidability of approximations of DS inclusion is currently unknown as well.

DS inclusion is decidable iff over approximation of DS is decidable: First we prove that every instance of DS inclusion can be reduced to an instance of over approximation of DS inclusion. Let d > 1 be a discount factor and P and Q be DS automata. Then we will show that there exists alternate DS automata R and S, and an approximation factor ε > 0 such that P ⊆ Q ⇐⇒ R ⊆ S + d · ε. Let # be an character that is not present in the alphabet of P and Q First of all, we generate a new DS automaton P # from P such that every word in P is prefixed with the character # and its weight is multiplied by 1 d . This can be done by a simple automata-theoretic transformation of P : Include all states and transitions from P in P # . Retain all accepting states of P in P # . Add a new state s # . Add a transition from s # to state s v init of P , and assign it a weight of 0. Make the new state s # the accepting state. This is P # . Similarly, construct DS automaton Q # from Q. It is easy to see that P ⊆ Q iff P # ⊆ Q # . Finally, construct Q from Q # by assigning the transition from s # to s init a weight of −1. Then it is easy to see that P # ⊆ Q # iff

Next, we prove that every instance of over-approximation of DS inclusion can be reduced to an instance of DS inclusion. Let P and Q be DS automata with discount factor d > 1, and let ε > 0 be its approximation factor. Suppose P ⊆ Q + d · ε holds.

Let d · ε = r s . Generate P # and Q # as earlier.

Since P ⊆ Q + d · ε holds, we get that P # ⊆ Q # + ε holds. Modify Q # to Q so that the weight of the transition from its inital state is now ε. Suppose ε = m n for natural numbers m, n > 0. Then, Multiply the weight of all edges in P # and Q with n to obtain new DS automata R and S, respectively. Then, it is easy to see that P ⊆ Q + d · ε holds then R ⊆ S holds.

the previous proof except that the "+" will be replace by "-".

Given these challenges, we propose an anytime algorithm for DS inclusion. The anytime algorithm either terminates after a finite amount of time with a crisp True or False answer to DS inclusion, or it continuously generates d · ε-close approximations, where the approximation factor 0 < ε < 1 decreases in time. If an execution of the algorithm is interrupted at anytime before a natural termination, then it returns the most recently computed d · ε-close approximation.

Inputs: DS automata P , Q, discount factor 1 < d < 2, and approximation factor 0 < ε < 1 1: if lowApproxDSInc(P, Q, d, ε) returns P ⊆ Q = False then Algorithm 4 outlines our anytime procedure. On receiving DS automata P and Q with discount factor 1 < d < 2, the algorithm invokes anytimeInclusion with an initial approximation factor 0 < ε init < 1. As is clear from Algorithm 4, anytimeInclusion is a tail recursive procedure in which the approximation factor is halved in each new invocation. In the invocation with approximation factor 0 < ε < 1, anytimeInclusion calls two functions lowApproxDSInc and upperApproxDSInc. Ideally, these procedures would over-and under-approximate DS inclusion using d · ε-close and d · ε-far, respectively.

Unfortunately, that is not possible due to Theorem 7.1. Therefore, a challenge here is the design of these two subprocedures. At this point it is sufficient to know that in response to the challenge we design lowApproxDSInc so that it combines partial solutions of DS inclusion and d · ε-close approximation. Specifically, given approximation factor ε > 0, its outcomes are either P ⊆ Q = False or P ⊆ Q + d · ε = True. Similarly, the algorithm upperApproxDSInc is designed to return either P ⊆ Q = True or Consider the invocation with approximation factor 0 < ε < 1. First, lowApproxDSInc will be called. If it returns P ⊆ Q = False, then anytimeInclusion is terminated, and it returns the crisp outcome that P ⊆ Q = False. Otherwise, lowApproxDSInc must have returned P ⊆ Q+d·ε = True, i.e. the d·ε-close approximation holds. Therefore, if anytimeInclusion is interrupted here onward, it can return this approximate result.

But if anytimeInclusion is not interrupted, it proceeds to solve upperApproxDSInc.

If upperApproxDSInc returns P ⊆ Q = True, once again anytimeInclusion is terminated, and the crisp P ⊆ Q = Ture solution is returned. If anytimeInclusion is interrupted at this point, the algorithm returns the d · ε-close approximation result obtained from lowApproxDSInc. Finally, if the algorithm has not been interrupted yet, anytimeInclusion is invoked with lower approximation factor ε 2 . That completes the description of our anytime algorithm.

Finally, to see why this algorithm is co-recursively enumerable, observe that if P ⊆ Q = False, then there must be an approximation factor 0 < γ < 1 such that for all 0 < δ < γ, P ⊆ Q + d · δ = False. Therefore, as the approximation factor is halved in each invocation of anytimeInclusion, it will eventually be smaller than the aforementioned γ. When this happens, then subprocedure lowApproxDSInc will be forced to return P ⊆ Q = False. Hence, if P ⊆ Q = False, then the anytime algorithm will necessarily terminate.

This section describes Algorithm lowApproxDSInc. Recall, given inputs DS automata P and Q, discount factor 1 < d < 2 and approximation factor 0 < ε < 1, lowApproxDSInc(P, Q, d, ε) either returns P ⊆ Q does not hold or P ⊆ Q + d · ε holds. Note that these outcomes are not mutually exclusive, i.e., there exist input instances for which both of the outcomes may hold. In these cases, the algorithm may return either of the outcomes; the procedure will still be sound.

Intuitively, lowApproxDSInc solves whether P is f -included in Q, where f is an aggregate function that approximates the discounted-sum from below. Let this aggregate function, denoted DSLow, be defined such that given a weightsequence W , discount factor 1 < d < 2 and approximation factor 0 < ε < 1, 0 ≤ DS (W, d) − DSLow(W, k, p) < d · ε holds. Then, we argue that if P is DSLowincluded in Q, then P ⊆ Q + d · ε holds. Otherwise, if P is not DSLow-included in Q, then P ⊆ Q will not hold. Therefore, lowApproxDSInc solves DSLow-inclusion. We use techniques from regular comparators [25] to solve DSLow-inclusion.

DSLow is formally defined in § 7.4. 1 . Second, a regular comparator for DSLow is constructed in § 7.4.2. Finally, we use the regular comparator to design lowApproxDSInc § 7.4. 3 . Let k, p > 0 be positive rationals such that the discount factor 1 < d < 2 and approximation factor 0 < ε < 1 are expressed as d = 1 + 2 −k and ε = 2 −p , respectively.

This section defines lower approximation of discounted sum when 1 < d < 2. A consideration while defining the aggregate function is that its comparator should be regular. Therefore, our definition of lower approximation of discounted sum is motivated from the notion of recoverable gap [31] which is known to play an important role in guaranteeing regularity of comparators [26] .

The recoverable gap of a weight sequence W w.r.t discount factor d > 1 is d |W |−1 · DS (W, d). In other words, it is the normalized DS of a weight sequence. Intuitively, the recoverable gap of a weight sequence gives a measure of how far its discountedsum is from 0, and hence is a building block for designing the comparator automata.

A property of recoverable gap that results in the regularity of comparator for DS with integer discount factor is that the minimum non-zero difference between the recoverable gap of sequences is fixed. Specifically, this difference is 1 when the discount factor is an integer. But for non-integer discount factors, this difference can become arbitrarily small [10] . This explains why DS comparator are not regular for noninteger discount factors.

To this effect, we begin by defining an approximation of the recoverable gap such that the aforementioned difference is fixed under the new definition. This is guaranteed by rounding-off the recoverable gap to a fixed resolution r = (d − 1) · ε = 2 −(p+k) , where d = 1 + 2 −k is the discount factor and ε = 2 −p is the approximation factor.

Formally, let roundLow(x, k, p) denote the largest integer multiple of the resolution that is less than or equal to x, for x ∈ R. Then, roundLow(x, k, p) = i · 2 −(p+k) for an integer i ∈ Z such that for all integers j ∈ Z for which j · 2 −(p+k) ≤ x, we get that j ≤ i. Then, Lemma 7.1 Let k, p > 0 be rational-valued parameters. Then, for all real values

Proof 48 There exists a unique integer i ∈ Z and 0 ≤ b < 2 −(p+k) such that x = i · 2 −(p+k) + b. Then, roundLow(x, k, p) = i · 2 −(p+k) . Therefore, we get that 0 ≤

x − roundLow(x, k, p) < 2 −(p+k) . x ≤ y, then roundLow(x, k, p) ≤ roundLow(y, k, p).

There exist unique integers i, j, ∈ Z, and positive values 0 ≤ a, b < 2 −(p+k)

such that x = i · 2 −(p+k) + a and y = j · 2 −(p+k) + b. By definition of roundLow, roundLow(x, k, p) = i · 2 −(p+k) and roundLow(y, k, p) = j · 2 −(p+k) . Then, if x ≤ y then one of the two must have occurred:

• i < j. In this case, roundLow(x, k, p) < roundLow(y, k, p).

• i = j and a ≤ b. In this case, roundLow(x, k, p) = roundLow(y, k, p) Therefore, if x ≤ y then roundLow(x, k, p) ≤ roundLow(y, k, p).

Then, for all real values x ∈ R, 0 ≤ x − roundLow(x, k, p) < 2 −(p+k) . Then, lower gap is defined as follows: To complete the definition, we prove that the value computed by DSLow in Definition 7.4 corresponds to a value close to the discounted-sum. To prove that we first establish the following" Given discount factor d = 1 + 2 −k and approximation factor ε = 2 −p , a resolution sequence of length n > 0, denoted R n , is the n-length sequence in which all elements are r = 2 −(p+k) . Lemma 7.3 Let k, p > 0 be rational-valued parameters. Let d = 1 + 2 −k be the non-integer, rational discount factor and ε = 2 −p be the approximation factor. Let W be a finite non-empty weight sequence. Then 0 ≤ gap(W, d) − gapLower(W, k, p) < gap(R |W | , d).

Proof 50 We prove the above by induction on the length sequence W .

Base case: When |W | = 1. Let W = w 0 . In this case, gap(W, d) = w 0 and gapLower(W, k, p) = roundLow(w 0 , k, p). Then, from Lemma 7.1 we get that 0 ≤ gap(W, d) − gapLower(W, k, p) < r, which in turn is the same as 0 ≤ gap(W, d) − gapLower(W, k, p) < gap(R 1 , d)d.

Inductive hypothesis: For all weight-sequences W of length n ≥ 1, it is true that 0 ≤ gap(W, d) − gapLower(W, k, p) < gap(R n , d).

Induction step: We extend this result to weight-sequences of length n + 1. Let W be an n + 1-length weight-sequence. Then W = W [n] · w n , where W [n] is the n-length prefix of W and w n is n + 1-th element.

We first show that gap(W, d) − gapLower(W, k, p) ≥ 0:

Using monotonicity of roundLow and the inductive hypothesis, we get

From Lemma 7.1, we get the desired result.

Next, we show that gap(W, d) − gapLower(W, k, p) < gap(R n+1 , d).

From the inductive hypothesis, we get <d · gap(R n , d) + 2 −(p+k) where R n is the n-length resolution sequence =gap(R n+1 , d) where R is the n + 1-length resolution sequence This completes both sides of the proof. Theorem 7.2 Let d = 1 + 2 −k be the discount factor and ε = 2 −p be the approximation factor, for rationals p, k > 0. Then for all weight sequences W , 

Therefore, the lower approximation of DS is well defined in this section.

This section covers the construction of the comparator automaton for the lower approximation of discounted-sum from Defintion 7.4. We show that the comparator is regular by explicitly constructing its DFA. The construction will utilize the fixed non-zero minimum property of the lower gap.

We begin with formal definitions of the comparison language and comparator automata for lower DS. 

Let µ > 0 be an integer bound, and k, p be positive integers. The comparator automata for lower approximation of discounted sum with discount factor d = 1 + 2 −k , approximation factor ε = 2 −p , upper bound µ and inequality relation R ∈ {≤, ≥} is an automaton that accepts the corresponding comparison language.

Next, we construct a DFA for the comparator for lower DS.

The first observation towards the construction is that DSLow(W, k, p) R 0 iff gapLower(W, k, p) R 0, for all finite weight sequences W . Therefore, it is sufficient to construct a DFA that accepts weight sequence W iff gapLower(W, k, p) R 0. We achieve this by (a). Creating one state of the DFA for every possible value of lower gap. (b). Note that the definition of lower gap (Definition 7.3) is inductive on the length of the weight sequence. So, transitions between states is defined so that they obey the inductive definition. For (a), note that lower gap are of the form i · r where i ∈ Z. Therefore, for all i ∈ Z, we introduce a state i to represent the lower gap i · r.

For (b). we include a transition from state i to state j on symbol a ∈ Σ = {−µ, . . . , µ} iff j · r = gapLower(i · r + a, k, p), thereby following Definition 7. 3 . Note that this equation makes the transition relation deterministic as for all i ∈ Z and a ∈ Σ there is a unique j ∈ Z that satisfies it. Since gapLower(W, k, p) = 0 when |W | = 0, state 0 is made the initial state. Finally, a state i is an accepting state iff i R 0 holds.

The automaton created above has infinitely many states as every possible value of lower gap corresponds to a state. To obtain finitely many states, we show that it is sufficient to consider finitely many values of the lower gap: Lemma 7.4 (Bounds on lower gap-value) Let µ > 0 be an integer bound. Let k, p be positive integers s.t. d = 1 + 2 −k is the discount factor, and ε = 2 −p is the approximation factor. Let W be a finite and bounded weight sequence. 1 . If gapLower(W, k, p) ≤ −µ · 2 k then for all u ∈ {−µ, . . . , µ}, gapLower(W · u, k, p) ≤ −µ · 2 k . 2 . If gapLower(W, k, p) ≥ µ · 2 k + 2 −p then for all u ∈ {−µ, . . . , µ}, gapLower(W · u, k, p) ≥ µ · 2 k + 2 −p .

Recall, gapLower(W · u, k, p) = roundLow(d · gapLower(W, k, p)+u, k, p). From the definition of roundLow, we get that gapLower(W · u, k, p) ≤ d · gapLower(W, k, p) + u. Since gapLower(W, k, p) ≤ −µ · 2 k holds, we get

Since u is at most µ, we get that gapLower(W · u, k, p) ≤ −µ · 2 k .

Proof of (2.) follows similarly, and hence has been omitted.

The outcome of Lemma 7.4 is that it is sufficient to track the lower gap value only when it is between −µ · 2 k and µ · 2 k + 2 −p in the construction.

gap value is less than or equal to T l · r, and (c). if T u ≤ s f then the lower gap value is greater than or equal to T u · r.

The proof is very similar to Theorem 6.4, and hence its details have been left for the reader to fill in.

This section describes lowApproxDSInc. Recall, given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor 2 −p , lowApproxDSInc returns either P ⊆ Q = False or P ⊆ Q + d · ε = True. In cases where both outcomes are possible, lowApproxDSInc may return either of the outcomes. In our design of lowApproxDSInc, it performs DSLow-inclusion between the DS automaton, as justified in Lemma 7.5-7. 6 . Subsequently, the algorithm uses the regular comparator for DSLow to design its inclusion procedure (Algorithm 5). Lastly, we illustrate a property of lowApproxDSInc that is necessary in our anytime procedure for DS inclusion to become co-recursively enumerable (Theorem 7.6).

Terminology: A run ρ P of word w in P is said to be dominated by Q if there exists a run ρ Q in Q on the same word such that DSLow(ρ P − ρ Q , k, p) ≤ 0.

Lemma 7.5 Given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0.

1. If all runs in P are dominated by Q, then P ⊆ Q + d · ε holds.

If there exists a run in P that is not dominated by Q, then P ⊆ Q does not hold.

Proof 54 Proof of (1.): Let for all words w ∈ Σ * , for all runs of w ρ P ∈ P , there exists a run of w ρ Q ∈ Q such that DSLow(ρ P − ρ Q , k, p) ≤ 0 be true. Then DSLow(ρ P −

Since weight of a word is given by the maximum weight of its all runs, we get that for all word w ∈ Σ * , wt P (w) < wt Q (w) + d · ε. Therefore, P ⊆ Q + d · ε holds.

Proof of (2.): Let w ∈ Σ * be the word for which there exists a run of w ρ P ∈ P such that for all runs of wρ Q ∈ Q, DSLow(ρ P −ρ Q , k, p) > 0 holds.

Since weight of a word is given by the maximum weight of its all runs, we get that there exists word w ∈ Σ * , wt P (w) > wt Q (w). So, P ⊆ Q does not hold.

Therefore Lemma 7.6 Given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0, lowApproxDSInc(P, Q, d, ε) returns

True iff all runs in P are dominated by Q.

Proof 55 From the algorithm, it is clear that there is a one-one correspondence between words (w, L) ∈P −wt and runs ρ P of word w in P for all words w ∈ Σ * .

From the algorithm, it is also clear that (w, L) ∈ dominated iff w ∈ Σ * , with a run ρ P ∈ P which has been labelled by L such that there exists a run ρ Q ∈ Q of w such that DSLow(ρ P − ρ Q , k, p) ≤ 0.

Then aux == True iffP −wt ⊆ dominated. By definition of language inclusion, this holds iff for all (w, L) ∈P −wt we get that (w, L) ∈ dominated. From the one-one Algorithm 5 lowApproxDSInc(P, Q, d, ε)

Inputs: DS automata P , Q, discount factor 1 < d < 2, approximation factor 0 < ε < 1 1:P ← AugmentWtAndLabel(P ) return P ⊆ Q = False 10: end if correspondence between words inP −wt and runs in P , we get that for all w ∈ Σ * , for all runs ρ P ∈ P of w, let (w, L) be its correspondence inP −wt then (w, L) ∈ dominated.

By the condition under which a word is a member of dominated we get that w ∈ Σ * , for all runs ρ P ∈ P of w, such that there exists a run ρ Q ∈ Q of w such that

For all inputs DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0, algorithm lowApproxDSInc is sound.

Proof 56 This follows directly from Lemma 7.5 and Lemma 7.6.

Given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0. Let µ be the absolute value of the largest weight in P and Q. Then the worst case complexity of Lastly, we show that if P ⊆ Q does not hold, there exists a sufficiently small approximation factor ε > 0 such that when lowApproxDSInc is invoked with ε, it returns that P ⊆ Q does not hold. This property will be crucial in proving corecursive enumerability of our anytime procedure for DS inlcusion.

Theorem 7.6 (Bias) Given DS automata P , Q, and discount factor 1 < d < 2. If P ⊆ Q = False, there exists an approximation factor 0 < ε < 1 such that for all 0 < γ < ε, lowApproxDSInc(P, Q, d, γ) returns P ⊆ Q = False.

The core idea is that when P ⊆ Q does not hold, then there must exist a word w ∈ Σ * such that wt(w, P ) > wt(w, Q) + d · δ. Therefore, for a sufficiently low value of ε, P ⊆ Q + d · ε = False. Then, for these values of ε, lowApproxDSInc will necessarily return P ⊆ Q = False.

Since P ⊆ Q = False there exists a word w ∈ Σ * such that wt(w, P ) = wt(w, Q) + d · γ for a rational value γ > 0. Since weight of words is computed as the maximum of weight of its runs, there must exist a run ρ P of w in P such that for all runs ρ Q of w in Q, we get that such that

Let k, p > 0 be rational values such that d = 1+2 −k and γ = 2 −p . From Theorem 7.2, we get that for all q ≥ p + 1 DSLow(ρ P − ρ Q , k, q) > d · γ 2 . Therefore, from Lemma 7.5 we get that for all q ≥ p + 1, lowApproxDSInc(P, Q, d, q) = False.

This section describes Algorithm upperApproxDSInc -the second sub-procedures in our anytime algorithm for DS inclusion. Given inputs DS automata P and Q, discount factor 1 < d < 2 and approximation factor 0 < ε < 1, upperApproxDSInc (P, Q, d, ε) either returns P ⊆ Q holds or P ⊆ Q − d · ε does not hold. As earlier, these outcomes are not mutually exclusive. In these cases, the algorithm may return either of the outcomes as they are both sound. upperApproxDSInc is that if P ⊆ Q = True holds, then upperApproxDSInc cannot guarantee that for a small enough value of the approximation factor upperApproxDSInc with return P ⊆ Q = True. The core idea here is that if P ⊆ Q = True then the difference between words in P and in Q could be arbitrarily small. In particular, for every possible value of the approximation factor, there may be a word for which the difference in its weight in P and Q is smaller than the approximation factor. As a result, the option of P ⊆ Q − d · ε = False may get triggered, never returning the outcome that P ⊆ Q = True holds.

The rest of this section gives all details of upperApproxDSInc.

In the first stage we define the upper approximation of discounted-sum so that its recoverable gap obeys the bounded non-zero minimal difference property.

For a rational number x ∈ Q, let roundUpper(x, k, p) denote the smallest integer multiple of resolution that is more than or equal to x. Formally, roundUpper(x, k, p) = i · 2 −(p+k) for an integer i ∈ Z such that for all j ∈ Z, j · 2 −(p+k) ≥ x implies i ≤ j. The upper gap value and upper approximation of discounted sum are defined as follows:

Lemma 7.7 Let k, p > 0 be rational-valued parameters. Then, for all real values

x ∈ R, 0 ≤ roundUpper(x, k, p) − x < 2 −(p+k) .

Proof 59 There exists a unique integer i ∈ Z and 0 ≤ b < 2 −(p+k) such that x = i · 2 −(p+k) − b. Then, roundUpper(x, k, p) = i · 2 −(p+k) . Therefore, we get that 0 ≤ roundUpper(x, k, p) − x < 2 −(p+k) .

Lemma 7.8 (Monotonicity) Let k, p > 0 be rational-valued parameters. Then, if

x ≥ y, then roundLow(x, k, p) ≥ roundLow(y, k, p).

The proof of this is very similar to that of Lemma 7.2. Theorem 7.7 Let d = 1 + 2 −k be the discount factor and ε = 2 −p be the approximation factor, for rationals p, k > 0. Then for all weight sequences W ,

Proof 62 The proof argument makes use of Lemma 7.9 and closely follows that of Theorem 7.2.

This section constructs a regular comparator for the upper DS defined above. The d = 1 + 2 −k is the discount factor, and ε = 2 −p is the approximation factor. Let W be a finite and bounded weight sequence.

2. If gapUpper(W, k, p) ≥ µ · 2 k , then for all u ∈ {−µ, . . . , µ}, gapUpper(W · u, k, p) ≥ µ · 2 k .

Proof 63 Part 1. Let W and u be as defined above. Then gapUpper(W · u, k, p) = roundUpper(d·gapUpper(W, k, p)+u, k, p). From Lemma 7.7, we get that gapUpper(W · u, k, p) ≤ d·gapUpper(W, k, p)+u+2 −(p+k) . From our assumption, we further get that

Part 2. Let W and u be as defined above. Then gapUpper(W · u, k, p) = roundUpper(d·gapUpper(W, k, p)+u, k, p). From Lemma 7.7, we get that gapUpper(W · u, k, p) ≥ d · gapUpper(W, k, p) + u. Further, from our assumptions we get that

and approximation factor 0 < ε < 1, upperApproxDSInc(P, Q, d, ε) returns P ⊆ Q holds or P ⊆ Q − d · ε does not hold.

Once again, the intuition and algorithm design fo upperApproxDSInc resembles that of lowApproxDSInc. Intuitively, upperApproxDSInc solves whether P is f -included in Q where aggregate function f is the upper approximation of discounted-sum. Lemma 7.11 precisely states the intuition, and the algorithm is given in Algorithm 6.

We begin with formalizing the intuition. We say, a run ρ P of word w in P is dominated by Q if there exists a run ρ Q in Q on the same word such that DSUpper(ρ P − ρ Q , k, p) ≤ 0. Then, Lemma 7.11 Given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0.

1. If all runs in P are dominated by Q, then P ⊆ Q holds.

If there exists a run in P that is not dominated by Q, then P ⊆ Q − d · ε does not hold.

Proof of (1.) Let for all words w ∈ Σ * , for all runs of w in ρ P ∈ P , there exists a run of word w ρ Q ∈ Q such that DSUpper(ρ P − ρ Q , k, p) ≤ 0 implies that DS (ρ P − ρ Q , d) ≤ 0. By arguing as in Lemma 7.5, we get that P ⊆ Q.

Proof of (2.) Let w ∈ Σ * be a word for which there exists a run of wρ P ∈ P such that for all runs of w ρ Q ∈ Q, DSUpper(ρ P − ρ Q , k, p) > 0 implies DS (ρ P − ρ Q , d) > −d · ε. Therefore, by arguing as done in Lemma 7.5, we get that P ⊆ Q − d · ε does not hold.

We design Algorithm 6 so that it resembles Algorithm 5 except that in this case the comparator refers to that of the upper approximation of discounted-sum:

Algorithm 6 upperApproxDSInc(P, Q, d, ε)

Inputs: DS automata P , Q, discount factor 1 < d < 2, approximation factor 0 < ε < 1 1:P ← AugmentWtAndLabel(P ) returns True iff all runs in P are dominated by Q.

The proof argument is similar to that in Lemma 7.6. Theorem 7.9 (Soundness) For all inputs DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0, algorithm upperApproxDSInc is sound.

The proof is similar to proof that Theorem 7.4. Theorem 7.10 (Complexity) Given DS automata P and Q, discount factor d = 1 + 2 −k and approximation factor ε = 2 −p for rational values k, p > 0. Let µ be the absolute value of the largest weight in P and Q. Then the worst case complexity of

The proof is similar to that of Theorem 7.5.

This section describes the core contribution of this work. We design an anytime algorithm for discounted-sum inclusion. On inputs DS automata P and Q, and discount factor 1 < d < 2, our algorithm DSInclusion(P, Q, d) either terminates and returns a crisp True or False answer to P ⊆ Q, or it establishes a d · ε-close approximation, where the approximation factor ε > 0 decreases with time. In addition, algorithm DSInclusion is co-computational enumerable i.e. if P ⊆ Q does not hold then DSInclusion(P, Q, d) is guaranteed to terminate with that outcome after a finite amount of time.

This section proves soundness, and co-computational enumerability of DSInclusion.

Finally, we evaluate the complexity of running DSInclusion upto a desired approximation (Theorem 7.13). The analysis reveals that as ε tends to 0, the worst-case complexity grows rapidly.

DSInclusion invokes anytimeInclusion with an initial approximation factor 0 < ε init < 1, Proof of (3.): anytimeInclusion does not terminate in the ε-th round only if lowApproxDSInc and upperApproxDSInc return P ⊆ Q+d·ε = True and P ⊆ Q−d·ε = False, respectively. Therefore, by soundness of lowApproxDSInc (Theorem 7.4), it holds that P is d · ε-close to Q.

Next, we prove that DSInclusion is co-computational enumerable. Proof 69 By soundness of upperApproxDSInc, we know that upperApproxDSInc will return P ⊆ Q − d · ε = False in every invocation. Hence, DSInclusion cannot terminate due to upperApproxDSInc. Therefore, if DSInclusion terminates, it must be because lowApproxDSInc returns P ⊆ Q = False for some ε > 0. It remains to show that such an approximation factor exists. But that is exactly the result in Theorem 7.6.

Therefore, if P ⊆ Q = False, then DSInclusion is guaranteed to terminate with the outcome P ⊆ Q = False.

Note that we cannot determine, apriori, the number of recursive invocations that will be conducted for a given input instance. If we could, then we could have proved the decidability of discounted sum inclusion for discount factor 1 < d < 2. As a result, one cannot determine the worst-case complexity of DSInclusion in general.

However, the worst-case complexity can be computed upto a certain precision. More precisely, if a user decides that it will run DSInclusion until either it terminates with a crisp solution or d · ε c -close approximation is established, where approximation factor ε c is pre-determined. Note that it is not sufficient to recursively only invoke lowApproxDSInc till the approximation factor is ε c since then we would never be able to generate the outcome that DS inclusion holds. The worst-case complexity upto ε c is computed as follows: 

This chapter investigates DS inclusion when the discount factor 1 < d < 2 is not an integer. 

Quantitative games with discounted-sum

On the analysis of quantitative games

This part continues the investigation of automata-based quantitative reasoning by studying their impact on quantitative games. We show that even here, comparator automata delivers the benefits of scalability, efficiency, and broader applicability.

Quantitative properties of systems are increasingly being explored in automated reasoning across diverse application areas, including software verification [19, 66, 68] , security [43, 56] , computer-aided education [53] , and even verification of deep neural networks [22, 82] . In decision making domains such as planning and reactive synthesis, quantitative properties have been deployed to obtain high-quality solutions [29] , describe hard and soft constraints [69] , constraints on cost and resource [58] , and the like. In most cases, planning and synthesis with quantitative properties is formulated into an analysis problem over a two player, finite-state game arena with a cost model that encodes the quantitative property [37] .

Optimization over such games is a popular choice of analysis in literature. Typically, optimization over these games has polynomial time algorithms. However, the degree of polynomial may be too high to scale in practice; resulting in limited applicability of the synthesis task at hand [30] . Furthermore, often times these synthesis tasks are accompanied with temporal goals. Under this extension, the games are required to generate an optimal solution while also satisfying the temporal objective.

However, prior works have proven that optimal strategies may not exist under extension with temporal goals, rendering analysis by optimization incompatible with temporal goals [41] .

To this end, we propose an alternate form of analysis in which the objective is to search for a solution that adheres to a given threshold constraint as opposed to generating an optimal solution. We call this analysis the satisficing problem, a well-established notion that we borrow from economics [1] . Our argument is that in several cases optimal solutions may be replaced with satisficing ones. For instance, a solution with minimal battery consumption can be substituted by one that operates within the battery life. This work contrasts between the optimization problem and the satisficng problem w.r.t. theoretical complexity, empirical performance, and temporal extensibility.

More specifically, this work studies the aforementioned contrast on two player, finite-state games with the discounted-sum aggregate function as the cost model, which is a staple cost-model in decision making domains [75, 78, 85] . In these games, players take turns to pass a token along the transition relation between the states. As the token is pushed around, the play accumulates weights along the transitions using the discounted-sum cost model. The players are assumed to have opposing objectives:

one player maximizes the cost while the other minimizes it. The optimization problem is to find the optimal cost of all possible plays in the game [83] . We define the satisficing problem as follows: Given a threshold value v ∈ Q, does there exist a strategy for the minimizing player such that the cost of all resulting plays is less than (or ≤) to the threshold v?

From prior work, one can infer that the optimization problem is pseudo-polynomial [57, 93] . Zwick In addition, we observe that VI can take as many as Θ(|V | 2 ) iterations to compute the optimal value. Note that this bound is tight, indicating that the scalability of VI will be limited only to games with a small number of states and transitions. We confirm this though an empirical analysis. We show that despite heuristics, VI-based algorithms cannot escape their worst-case behavior, adversely impacting its empirical performance ( § 8.4 ). Finally, it is known that these games may not have optimal solutions when extended with temporal goals [41] .

In contrast, our examination of the satisficing problem illustrates its advantages over optimization. We solve satisficing via an automata-based approach as opposed to an arithmetic approach ( § 8.3). Our approach is motivated by recent advances in automata-based reasoning of quantitative properties using comparator automata [25, 28] . We show that when the discount factor is an integer, satisficing problem can be solved in O(|V | + |E|) via an efficient reduction to safety/reachability games [87] .

Observe that there is a fundamental separation between the complexity of satisficing and the number of iterations required in VI, indicating a computational gain of our comparator-based solution for satisficing. As before, an empirical evaluation confirms this as well. Last but not the least, we show that unlike optimization, satisficing naturally integrates with temporal goals. The reason is that since both, saisficing and temporal goals, adopt automata-based solutions, the two can be seamlessly combined with one another ( § 8.6). Currently, our method works for integer discount factors only. We believe, these results can be extended to the non-integer case with some approximation guarantee.

The optimization problem can be solved by a VI algorithm ( § 8 

The VI algorithm plays a min-max game between the two players [93] . Let wt k (v) denote the optimal cost of a k-length game that begins in state v ∈ V . Then wt k (v)

can be computed using the following equations: The optimal cost of a 1-length game beginning in state v ∈ V is as follows:

Given the optimal-cost of a k-length game, the optimal cost of a (k + 1)-length game is computed as follows:

Then, it has been shown that the optimal cost of a k-length game beginning at state v ∈ V converges to the optimal cost of an infinite-length game beginning in state v ∈ V [83, 93] as k → ∞. In particular, let W be the optimal value (beginning in state v init ), then W = lim →k→∞ wt k (v init ).

The VI algorithm, as defined above, waits for convergence to terminate. Clearly, that is not sufficient for a worst-case analysis. To this end, in this section we establish a crisp bound on the number of iterations required to compute the optimal value.

Furthermore, we show that the bound we calculate is tight.

Upper bound on number of iterations.

We prove the upper bound in several steps, as described below:

Step 1 We show that the optimal value W must fall between an interval such that as the number of iterations k of the VI algorithm increases, the interval length converges to 0.

Step 2 We show that the optimal value W is a rational number with denominator at most bound W , where bound W is parameterized by numerator and denominator of the discount factor and the number of states in the graph game.

Step 3 Next, we show that the denominator of the minimum non-zero difference between possible values of the optimal cost is at most bound diff , where bound diff is also parameterized by numerator and denominator of the discount factor and the number of states in the graph game.

Step 4 Finally, we use the previous three Steps to prove the pseudo-polynomial bound.

Since the interval in Step 1 converges to 0, we can choose a k such that the interval is less than than 1/bound diff . In our computations below, we will see that 1 bound diff < 1 boundv . Therefore, there can be only one rational number with denominator bound W or less in the interval identified by the chosen k. Since this interval must also contain the optimal value W , the unique rational number with denominator less than or equal to bound W must be the optimal value W .

Our task is to compute the value of k.

We prove all of these steps one-by-one. We begin with proof of Step 1. We show that there is a finite-horizon approximation of the optimal value, i.e., for all k ∈ N the optimal value can be bounded using wt k (v init ) as follows:

Lemma 8.1 Let W be the optimal value of a graph game G. Let µ > 0 be the maximum of absolute value of cost on all transitions in G. Then, for all k ∈ N,

Proof 71 This holds because since W is the limit of wt k (v init ) as k → ∞, its value must lie in between the minimum and maximum cost possible if the k-length game is extended to an infinite-length game. The minimum possible extension would be when the k-length game is extended by iterations in which the cost incurred in each round is −µ. Therefore, the resulting lowest value is

Clearly, as k → ∞, the interval around the optimal value W converges to 0. This way we can find an arbitrarily small interval around the optimal value W .

To prove Step 2, we know that there exist memoryless optimal strategies for both players [83] . Therefore, there must exists an optimal play that is in the form of a memoryless strategies that result in the optimal value. The length of the simple lasso is at most |V |. Therefore, the length of the head and loop are at most |V | each.

Hence, the expression from Lemma 8.2 simplifies to (p |V | − q |V | ) · (p |V | ).

Similarly, Step 3 is resolved as follows:

q be the discount factor. Then the minimal non-zero difference between the cost on simple lassos is a rational number with denominator at most (p (|V | 2 ) − q (|V | 2 ) ) · (p (|V | 2 ) ).

The difference of two lassos l 1 and l 2 can be represented by another lasso l = l 1 × l 2 constructed from taking their product, and assigning the difference of their costs on each transition. If the maximum length of the lassos is |V |, then the maximum length of the difference lasso will be |V | 2 . Then, from Lemma 8.2 we immediately obtain that the upper bound of the denominator of the minimum non-zero difference of optimal plays is (p (|V | 2 ) − q (|V | 2 ) ) · (p (|V | 2 ) ).

Therefore, bound W = (p |V | − q |V | ) · (p |V | ) and bound diff = (p (|V | 2 ) − q (|V | 2 ) ) · (p (|V | 2 ) ).

Recall, in [93] Zwick-Paterson informally claim that a pseudo-polynomial algorithm can be devised to compute the optimal value of the game. We formalize their statement in the final step as follows:

be a graph game. The number of iterations required by the value-iteration algorithm or the length of the finite-length game to compute the optimal value W is 1 . O(|V | 2 ) when discount factor d ≥ 2,

d−1 + |V | 2 when discount factor 1 < d < 2.

Proof 75 Recall, the task is to find a k such that the interval identified by Step 1 is less than 1 bound diff . Note that bound W < bound diff . Therefore, 1 bound diff < 1 boundv . Hence, there can be only one rational value with denominator bound W or less in the small interval identified by the chosen k. Since the optimal value must also lie in this interval, the unique rational number with denominator bound W or less must be the optimal value. Let k be such that the interval from Step 1 is less than

The following cases occur depending how large or small the values are: Case 1. When d ≥ 2: In this case, both d and d |V | 2 are large. Then, 

This bound is tight. We show this by constructing a quantitative graph game for which it is necessary that the VI algorithm takes Ω(|V | 2 ) iterations. We give a sketch of this input instance in Fig 8. 1. Let all states in Fig 8. 1 belong to the maximizing player. Hence, the optimization problem reduces to searching for a path with optimal cost. The idea is to show that the cost of finite-length game with lesser than Ω(|V | 2 )length will result in an incorrect optimal cost. Therefore, Fig 8. 1 is designed in a way so that the path for optimal cost of a k-length game is along the right hand side (RHS) loop when k is small, but along the left hand side (LHS) loop when k is large. This way, the correct maximal value can be obtained only at a large value for k. Hence the VI algorithm runs for at least k iterations. when k is small, the optimal path is along the RHS loop since w > 0. However, since the RHS is larger, it accumulates cost slower than the LHS loop. As a result, as k becomes larger, for an appropriate assignment to w, the cost on the LHS will eventually be larger than that on the PHS. By meticulous reverse engineering of the size of both loops and the value of w, one can guarantee that k = Ω(|V | 2 ).

A concrete instance is as follows: Let the left hand side loop have 4n edges, the right hand side of the loop have 2n edges, and w = 1 d 3n + 1 d 7n + · · · + 1 d m·n−1 such that m · n − 1 = c · n 2 for a positive integer c > 0. One can show for a finite games of length (m · n − 1) or less, the optimal path arises from the loop to the right. But for games of length greater than (m · n − 1), the optimal path will be to due to the left hand side loop.

Finally, we present the complete worst-case complexity analysis. Since, VI algorithm is dominated by arithmetic operations, we take into account their cost as well. We work with the unit-cost and bit-cost models of arithmetic. We observe that this choice has a drastic impact to the worst-case complexities.

Under the unit-cost model of arithmetic, all arithmetic operations are assumed to take constant time. 

Proof 76 The cost of updating the cost of all vertices from iteration j to j + 1 is linear in |E| since every transition is traversed exactly once. Therefore, the worstcase complexity of computing the optimal values of a k-length game is O(k · |E|). In particular, the worst-case complexity of computing the optimal value is O(|V | 2 ·|E|) or O log(µ)·|E| d−1 + |V | 2 · |E| depending on whether the discount factor d ≥ 2 or 1 < d < 2, respectively (Theorem 8.1). To compute the cost of arithmetic in each iteration of the value-iteration algorithm, we define the cost of a transition (v, w) ∈ E in the k-th iteration as

Then, clearly, wt k (v) = µ{cost k (v, w)|w ∈ vE} if v ∈ V 0 and wt k (v) = min{cost k (v, w)|w ∈ vE} if v ∈ V 1 . Since, we compute the cost of every transition in each iteration, it is crucial to analyze the size and cost of computing cost.

Let G be a quantiative graph game. Let µ > 0 be the maximum of absolute value of all costs along transitions. Let d = p q be the discount factor. Then for all (v, w) ∈ E, for all k > 0

where n i ∈ Z such that |n i | ≤ µ for all i ∈ {1, . . . , k}.

Proof 77 Lemma 8.3 can be proven by induction on k.

Lemma 8.4 Let G be a quantiative graph game. Let µ > 0 be the maximum of absolute value of all costs along transitions. Let d = p q be the discount factor. For all (v, w) ∈ E, for all k > 0 the cost of computing cost k (v, w) in the k-th iteration is O(k · log p · µ{log µ, log p}).

We compute the cost of computing cost k (v, w) given that optimal costs have been computed for the (k − 1)-th iteration. Recall,

= γ(v, w) + q p · q k−2 · n 1 + q k−3 p · n 2 + · · · + p k−2 n k−1 p k−2 for some n i ∈ Z such that |n i | ≤ µ. Therefore, computation of cost k (v, w) involves four operations: 4. Addition of γ(v, w) · p k−1 with q · (q k−2 · n 1 + q k−3 p · n 2 + · · · + p k−2 n k−1 ). The cost is linear in their representations.

Therefore, the cost of computing cost k (v, w) is O(k · log p · µ{log µ, log p}).

Now, we can compute the cost of computing optimal costs in the k-th iteration from the k − 1-th iteration.

Lemma 8.5 Let G be a quantiative graph game. Let µ > 0 be the maximum of absolute value of all costs along transitions. Let d = p q be the discount factor. The worst-case complexity of computing optimal costs in the k-th iteration from the (k−1)th iteration is O(|E| · k · log µ · log p).

The update requires us to first compute the transition cost in the k-th iteration for every transition in the game. Lemma 8.4 gives the cost of computing the transition cost of one transition. Therefore, the worst-case complexity of computing transition cost for all transitions is O(|E| · k · log p · µ{log µ, log p}).

To compute the optimal cost for each state, we are required to compute the maximum transition cost of all outgoing transitions from the state. Since the denominator is same, the maximum value can be computed via lexicographic comparison of the numerators on all transitions. Therefore, the cost of computing maximum for all states is O(|E| · k · log µ · log p).

Therefore, total cost of computing optimal costs in the k-th iteration from the (k − 1)-th iteration is O(|E| · k · log p · µ{log µ, log p}).

Finally, the worst-case complexity of computing the optimal value of the quantitative game under bit-cost model for arithmetic operations is as follows: Theorem 8.3 Let G = (V, v init , E, γ) be a quantitative graph game. Let µ > 0 be the maximum of absolute value of all costs along transitions. Let d = p q > 1 be the discount factor. The worst-case complexity of computing the optimal value under bit-cost model for arithmetic operations is 1 . O(|V | 4 · |E| · log p · µ{log µ, log p}) when d ≥ 2,

Proof 80 This is the sum of computing the optimal costs for all iterations.

When d ≥ 2, it is sufficient to perform value iteration for O(|V | 2 ) times (Theorem 8.1). So, the cost is O((1 + 2 + 3 · +|V | 2 ) · |E| · log p · µ{log µ, log p}). This expression simplifies to O(|V | 4 · |E| · log p · µ{log µ, log p}).

A similar computation solves the case for 1 < d < 2.

Our analysis from Theorem 8.1 till Theorem 8.3 show that VI will not scale well despite being polynomial in size of the graph game. Even when the discount factor d > 1 is an integer (d ≥ 2), the algorithm requires Θ(|V | 2 ) iterations as Theorem 8.1 is tight. In this case, VI will be O(|V | 2 · |E|) and O(|V | 4 · |E|) under the unit-cost and bit-cost models for arithmetic operation, respectively (Theorem 8.2 and Theorem 8.3, respectively).

From a practical point of view, implementations of VI will use the bit-cost model as they may rely on multi-precision libraries in order to to avoid floating-point errors.

Therefore, VI for optimization will be expensive in practice. One may argue that the upper bounds from Theorem 8.3 may be tightened. But it must be noted that since VI requires Ω(|V | 2 ) iterations, even a tighter analysis will not significantly improve its performance in practice.

This section formally defines and investigates the satisficing problem. The intuition captured by satisficing is to determine whether a player can guarantee that the cost of all plays will never exceeds a given threshold value. Hence, satisficing can be perceived as a decision variant of optimization. In this section, we prove that when the discount factor is an integer, the satisficing problem can be solved in linear time in size of the game graph. Therefore, showing that satisficing is more scalable and efficient than optimization. A key feature of our solution for satisficing is that our solution relies on purely automata-based methods and avoids numerical computations.

We begin by formally defining the satisficing problem as follows:

Definition 8.1 (Satisficing problem) Given a quantitative graph game G and a threshold value v ∈ Q, the satisficing problem is to determine whether player P 1 has a strategy such that for all possible resulting plays of the game, the cost of all plays is less than (or ≤) to the threshold v, assuming that the objectives of P 0 and P 1 are to maximize and minimize the cost of plays, respectively.

This section is divided into two parts. § 8. 3 We begin with formal definitions: For sake of succinctness, we refer to the DS comparison language and DS comparator automata by comparison language and comparator, respectively.

Our finding here is all DS comparison languages are either safety langauges or cosafety langagues. More interestingly, the only parameter that decides whether a comparison langauge is safety/co-safety is the equality or inequality relation inc. The values of the discount factor and threshold have no implication on this property.

These observations are formally proven next: Recall definitions of safety/co-safety languages and bad-prefixes from Chapter 6. ≤ is a safety language.

The above intuition is formalized below:

By expansion of each term, we get

. Since the maximum value of discounted-sum of sequences bounded by µ is µ·d d−1 , we also get that

Putting it all together, for all i ≥ 0 we get

By definition of convergence, there exists an index k ≥ 0 such that 

Therefore, DS-comparator with ≤ is a safety comparator.

Next, we determine ω-regularity of comparison languages with threshold value v ∈ Q.

The critical parameter in this case is the discount factor d > 1. We observe that a comparison language is ω-regular iff the discount factor is an integer. Finally, while the threshold value does not affect the ω-regularity of a comparison language, it impacts the size of the ω-regular comparator automata (when discount factor d > 1 is an integer). First, we introduce notation. Since v ∈ Q, w. l Our first result is that a comparison language with v ∈ Q is not ω-regular if the discount factor is not an integer. The result follows immediately from prior work as it is known that comparison language with v = 0 is not ω-regular if the discount factor is not an integer [26] . Therefore: Theorem 8.5 Let µ > 0 be the integer upper bound, v ∈ Q be the threshold value, and inc ∈ {<, >, ≤, ≥, =, =} be an equality or inequality relation. Let the discount factor d > 1 be a non-integer. Then, the DS comparison language with µ, d, inc, and v is not ω-regular.

Next, we prove that DS comparison languages are ω-regular if the discount factor d > 1 is an integer. To prove this, we construct the Büchi automaton corresponding to the language. Since Büchi automata are closed under complementation, intersection and union, it is sufficient to construct the (Büchi) comparator automata for one equality or inequality relation inc. We do so for the relation ≤. The construction technique resembles the construction of comparator automata for threshold v = 0 for inequality relation ≤ presented in [28] . Due to space constraints, we present critical lemma statements, theorem statement and high-level intuition. Please refer to [28] or the supplemental material for details.

Since we know that the comparison language for ≤ is a safety language, we begin by characterizing the bad-prefixes of the comparison language with threshold v ∈ Q and relation ≤. For this, we introduce notation. Let W be a finite weight-sequence.

By abuse of notation, the discounted-sum of finite-sequence W with discount-factor d is defined as DS (W, d) = DS (W · 0 ω , d). The recoverable-gap of a finite weightsequences W with discount factor d, denoted gap(W, d), is its normalized discountedsum: If W = ε (the empty sequence), gap(ε, d) = 0, and gap(W, d) = d |W |−1 ·DS (W, d)

otherwise [31] . Lemma 8.6 Let µ > 0 be the integer upper bound, d > 1 be an integer discount factor, and the relation inc be the inequality ≤. Let v ∈ Q be the threshold value such 

. Next, we prove that if a finite weight sequence W is such that, the W is a bad prefix. Let Y be an arbitrary infinite but bounded weight sequence. Then

. By re-arrangement of terms we get that DS (W · Y , d)

Intuitively, Lemma 8.6 says that if an infinite length weight sequence A has a finite prefix for which its recoverable gap is too large, then the sequence A will not be present in the coveted language. This way, if we track the recoverable gap value of finite-prefixes of A, Lemma 8.6 gives a handle for when to reject A from the coveted language. It also says that if the recoverable gap of a finite-prefix exceeds the bounds in the lemma statement, then there is no need to track the recoverable gap of other finite-prefixes any further.

Similarly, we define very-good prefixes for the comparison language with µ, d, ≤

and v as follows: A finite and bounded weight-sequence W is a very good prefix for the aformentioned language if for all infinite, bounded extensions of W by Y , 

. By re-arrangement of terms we get that

Since maximal value of DS (Y, d) is µ d−1 , we get that DS (W · Y , d) ≤ v. Therefore, W is a very good prefix.

Intuitively, Lemma 8.7 says that if an infinite-length weight sequence A has a finite-prefix for which the recoverable gap is too small, A is present in the coveted language. This condition gives a handle on when to accept the weight sequence A by tracking on the recoverable gap of its finite prefixes. It also says that if the recoverable gap of a finite-prefix falls below the bound in Lemma 8.7 , there is no need to track recoverable gaps of finite-prefixes of A any further.

The question we need to answer to obtain the targeted Büchi automata is how to track the recoverable gaps of finite-prefixes using a finite amount of memory (or states). We already know that there are upper and lower bounds on which recoverable gaps are interesting: If a recoverable gap is too large (Lemma 8.6 ) or too small (Lemma 8.7), the recoverable gap value does not need to be tracked any more. Now note that since the discount factor is an integer, in our case the recoverable gap value is always an integer (from definition of recoverable gap). Therefore, between the upper and lower bounds for recoverable established by Lemma 8.6 and Lemma 8.7, there are only finitely many candidate values of recoverable gaps. These finitely many values will form the states of the automata.

The final question is how to determine the transition function of the automata. For this, observe that the recoverable-gap has an inductive definition i.e. gap(ε, d) = 0, where ε is the empty weight-sequence, and gap(W · w, d) = d · gap(W, d) + w, where w ∈ {−µ, . . . , µ}. Therefore, transitions between states are established by mimicking the inductive definition.

Therefore, the formal construction of the Büchi automata for comparison language with threshold v with relation ≤ is given as follows: Note that the (deterministic) Büchi automaton constructed in Theorem 8.6 is a safety automaton [63] . Safety/Co-safety automata for all other relations can be constructed by simple modifications to the one constructed above. Further, note that the number of states depends on the value v (number n). Lastly, this construction is tight, since prior work shows that when v = 0, the automaton constructed above is the minimal automata for its language [28] .

We arrive at the core result of this section. We show that when the discount factor is an integer, satisficing is reduced to solving a safety or reachability game. This method is linear in size of the game graph, as opposed to the higher polynomial solutions of optimization via VI. Hence, satisficing is more efficient and scalable alternative to analyze quantitative graph games for integer discount factors.

The key idea behind this reduction is as follows: Recall, the satisficing problem is to determine whether player P 1 has a strategy that can guarantee that all resulting plays will have cost less than (or less than or equal to) a given threshold value v ∈ Q.

When the discount factor is an integer, the criteria for satisficing to hold is the same as ensuring that every resulting play is accepted by the safety/co-safety comparator automata with the same discount factor inequality relation, and threshold value.

This lets us show that when the discount factor is an integer, the satisficing problem reduces to solving a new game obtained by taking the product of the quantitative graph game with the appropriate comparator. Finally, the resulting new game will be safety or reachability depending on whether the comparator is safety or co-safety, respectively.

The formal reduction is as follows: Let G = (V = V 0 V 1 , v init , E, γ) be a quantitative graph game. Let the maximum weight on the graph game be the integer µ > 0.

Let d > 1 be the integer discount factor. Suppose, v = v 0 v 1 · v m (v m+1 · · · v n ) ω is the rational threshold value for the satisficing problem, and suppose the inequality in the problem is inc ∈ {<, ≤}.

Then, first construct the ω-regular comparator automaton for µ, d, inc and v.

Let us denote it by A = (S, s I , Σ, δ, F). From Corollary 8.3 we know that A is a deterministic safety or co-safety automaton. Next, construct structure GA = (W = W 0 ∪ W 1 , s 0 × init, δ W , F W ) from G and A as follows:

• W = V × S. Specifically W 0 = V 0 × S and W 1 = V 1 × S.

Since V 0 and V 1 are disjoint, W 0 and W 1 are disjoint too. Let sets of states W 0 and W 1 belong to players P 0 and P 1 in GA.

• Let s 0 × init be the initial state of GA.

• Transition relation δ W = W × W is defined such that transition (w, w ) ∈ δ W where w = (v, s) and w = (v , s ) if transition (v, v ) ∈ δ, n = γ((v, v )), i.e., n is the cost of the transition in G, and -(s, n, s ) ∈ δ C is a transition in the comparator automata.

• F W = V × F Finally, let GA be a reachability game if the comparator A is a co-safety automaton, and let GA be a safety game otherwise. Correspondingly, F W will be referred to as accepting states or rejecting states.

Let us call a play in the quantitative graph game G to be winning play for P 1 if its cost relates to threshold v by inc. Recall, the definition of winning plays in reachability and safety games (Chapter 2). Then, it is easy to show the following correspondence by relating plays in G and those in GA, since A is deterministic:

There is a one-one correspondence between plays in G and plays in GA.

This correspondence preserves the winning condition. Next, let us call a strategy of player P 1 a winning strategy for P 1 in G if the strategy guarantees that all resulting plays in GA are winning for P 1 . Then, it is easy to see that the winning play preserving, one-one correspondence between plays in GA and G obtained in Lemma 8.8 can be lifted to winning strategies between both games. As a result, we get the following: Lemma 8.9 Let G be a quantitative graph game, and d > 1 be an integer discount factor. Then player P 1 has a winning strategy for the satisficing problem with threshold v ∈ Q and relation inc ∈ {<, ≤} iff player P 1 has a winning strategy in the reachability or safety game GA constructed as above.

The proof lifts Lemma 8.8 from plays to strategies. The same idea of winning runs-preserving one-one correspondence is lifted to complete this proof. Due to the similarities, we skip the proof here.

Then, the final statement for our reduction is: Proof 87 From Lemma 8.9 we know satisficing G is equivalent to solving the reachability/safety game GA In order to determine its complexity of solving GA, we need to know its number of states and edges. Clearly, there are O(|V | · µ · n) states in GA, since the comparator A has O(µ · n) states. We claim that GA has O(|E| · µ · v) edges.

For this, we observe that a state (v, s) in GA has the same number of outgoing edges as state v in G, as comparator A is deterministic. Since there are O(µ · n) copies of each state v in GA, there are O(|E| · µ · n) edges in GA. Therefore, the solving the reachability/safety game is O((|V | + |E|) · µ · n).

Therefore, we observe that when the discount factor is an integer, our comparatorbased solution for satisficing is more efficient than optimization via VI by degrees of magnitude in size of the game graph.

So far, we have observed that for integer discount factors, comparator-based satisficing is more efficient than VI-based optimization. However, a crucial question still remains unanswered: Does the complexity improvement between VI for optimization to comparators for satisficing arise from (a) solving the decision problem of satisficing instead of optimization, or (b) adopting the comparator approach instead of numerical methods? To answer this, we design and analyze a VI based algorithm for satisficing: If this algorithm shows improvement over VI for optimization, then the complexity gain would have occurred from solving satisficing. Otherwise, if this algorithm does not reflect any improvement, then the gain should have come from adhering to comparator based methods.

The VI based algorithm for satisficing is described as follows: Perform VI as was done for optimization in §3. 1 . Terminate the algorithm after whichever of the following two occurs first: (a) VI has performed the number of iterations as shown in Theorem 8.1, or (b) : In the k-th iteration, the threshold value v falls outside of the interval defined in Lemma 8. 1 . In either case, one can determine how the threshold value relates to the optimal value, and hence determines satisficing.

Clearly, termination by condition (a). results in the same number of iterations as optimization itself (Theorem 8.1). We show that condition (b) does not reduce the number of iterations either. On the contrary, it introduces non-robustness to the algorithm's performance (See § 8.1). The reason is that the number of iterations is based on the distance between the threshold value and the optimal value. So, if this distance is large, few iterations will be required since the interval length in Lemma 8.1 declines exponentially. But if the distance is small (say close to 0), then it will take as many iterations as taken by optimization itself. Formally, Theorem 8.8 Let d > 1 be an integer discount factor. Let G be a quantitative graph game with state set V . Let W be the optimal cost, and v ∈ Q be the threshold value.

Then number of iterations taken by VI for satisfaction is min{O(|V | 2 ), log µ |W |−v }.

Hence, VI for satisficing does not improve upon VI for optimization. This indicates comparator-based methods may be responsible for the improvement.

The goal of the empirical analysis is to determine whether the practical performance of these algorithms resonate with our theoretical discoveries.

CompSatisfice outperforms VIOptimal * in runtime and consequently in number of benchmarks solved The cactus plot in Fig 8. 2 clearly indicates that CompSatisfice is more efficient than VIOptimal. In fact, a closer look reveals that all benchmarks solved by VIOptimal have fewer than 200 states. In contrast, CompSatisfice solves all but two benchmarks. Thus solving benchmarks with many thousands of states.

To test scalability, we plot runtime of both tools on a set of scalable benchmarks.

For integer parameter i > 0, the i-th scalable benchmark has 3 · 2 i states. Fig 8. 3

plots the observations in log-log scale. Thus, the slope of the straight line indicates the degree of polynomial (in practice). We see that in practice CompSatisfice exhibits linear behavior (slope ∼1), whereas VIOptimal is much more expensive (slope >> 1) even for small values of weights and threshold.

CompSatisfice is more robust than VISatisfice. We compare CompSatisfice and

VISatisfice as the threshold value changes. This experiment is chosen due to Theorem 8.8 which proves that VISatisfice is non-robust. The goal is to see how VISatisfice fluctuates in practice, while CompSatisfice maintains steady performance owning to its low complexity. These are shown in Fig 8. 4.

In several synthesis tasks, the objective is to solve a quantitative game while also satisfying a given temporal goal. So, the goal is to ensure that the system has a strategy that also ensures that the temporal objective is met along all possible plays * Figures are best viewed online and in color resulting from the game. However, prior works have proven that optimal strategies may not exist under extension with temporal goals, rendering analysis by optimization incompatible with temporal goals [41] . In this section, we show that our comparatorbased solution for satisficing can extend to all kinds of temporal goals.

Intuitively, a quantitative game with temporal goals appends a quantitative game with a labeling function that assigns states to atomic propositions. Therefore, each play of the game is associated with the sequence of atomic propositions. It is over this sequence of atomic propositions that the given temporal goal is evaluated. In this section, we show that our comparator-based approach to solve satisficing on quantitative games can be extended to all temporal goals expressed by linear temporal logic (LTL) [77] . More generally, our approach extends to all ω-regular goals. This is in sharp contrast to existing work based on VI which extends to a limited subclass of temporal goals only, namely safety objectives [91] , since optimal solutions are not known to exist with ω-regular objectives [41] .

Formally, a quantitative game with temporal goals, denoted by G T , is the tuple (G, AP , L, ϕ) where G is a quantitative game, AP is a finite set of atomic propositions, L : V → 2 AP is a labeling function, and ϕ is an LTL formula over propositions AP . Plays, cost-sequences, cost and strategies are defined as earlier for quantitative games. The difference is that plays in a quantitative game with temporal goals are also accompanied with a sequence of propositions. If ρ = v 0 v 1 v 2 . . . is a play in game G T , then ρ AP = A 0 A 1 A 2 . . . is the proposition sequence where A i = L(v i ) ⊆ AP for all i ≥ 0. Given an LTL formula ϕ over AP , we say that a play ρ satisfies ϕ if its corresponding proposition sequence ρ AP satisfies ϕ.

Let us assume that the objectives of P 0 and P 1 are to maximize and minimize the cost of plays, respectively. Given threshold value v ∈ Q and inequality relation inc ∈ {≤, <}, a strategy for P 1 is said to be winning in a quantitative game with temporal goals G T , threshold v and relation inc if the strategy is winning for P 1 in quantitative game G with threshold v and relation inc, and every resulting play in the game satisfies ϕ.

The satisficing problem with temporal goals is defined as follows:

Definition 8.4 (Satisficing problem with temporal goals) Given a quantitative game with temporal goals G T = (G, AP , L, ϕ), a threshold value v ∈ Q, and an inequality relation R ∈ {≤, <}, the satisficing problem with temporal goals is to determine whether P 1 has a winning strategy in G T with threshold v and relation inc, assuming that the objectives of P 0 and P 1 are to maximize and minimize the cost of plays, respectively.

We show that solving the satisficing problem with temporal goals reduces to solving a parity game. The key insight behind this is that when the discount factor is an integer then both the satisficing criteria and the temporal goal are represented by NBAs. As a result of which the criteria for the player to win a satisficing problem with temporal goals can be represented by a single NBA that refers to the combination of both conditions. Thus, leading to a parity game. • If ϕ can be represented by a (deterministic) safety/co-safety automata A, then solving the satisficing problem with temporal goals is O((|V | + |E|) · µ · n · |A|),

where |A| = 2 2 O(|ϕ|) .

Proof 88 This reduction can be carried out in two steps. In the first step, we reduce the quantitative graph game G to a reachability/safety game GA by means of the comparator automata for DS with integer discount factor, as done in Section 8. 3 

There is one key difference in constructing GA. The labelling function of G T is extended to states in GA so that the label of state (v, s) in GA is identical to the label of state v in G T . In the second step, we incorporate the temporal goal into the safety/reachability game GA. For this, first the temporal goal ϕ is converted into its equivalent deterministic parity automaton (DPA) [87] . Next, take the product of the game GA with the DFA by synchronizing the labeling function of GA with transitions in the DPA. This follows standard operations for product constructions. The end result will be a parity game that is linear in the size of GA and the DPA for ϕ. Recall from Section 8. 3.2 , size of GA is linear in size of G, µ and n. Further, note that the DPA is double exponential in size of ϕ. Therefore, the final product game has size linear in |V | and µ, and double exponential in |ϕ|.

Finally, in the special case where the DPA for ϕ is actually a deterministic safety/co-safety automaton, the resulting product game can be solved in time linear in the size of the final product game. The reason behind this is that the final DPA will be a weak Büchi automata generated from the union of safety/co-safety automaton from the comparator, and a safety/co-safety autoamton from the temporal property (Theorem 6.1). Games with weak Büchi winning conditions can be solved in size linear to the underlying graph game.

Finally, the proof of correctness is identical to that of Lemma 8.9.

This work introduced the notion of satisficing for quantitative games with discountedsum aggregation function. This is proposed as a decision variant of the optimization problem. Following a thorough analysis of both problems, we show that satisficing theoretical, empirical and practical advantages over the optimization problem. In particular, we show that when the discount factor is an integer, then the satisficing problem can be solved by automata-based solutions. Not only is our automata-based solution for satisficing more efficient than existing solutions for optimization in size of the game graph, our solutions perform more scalablely and robustly in empirical evaluations. Furthermore, the practical applicability of quantitative games with temporal goals can be handled naturally with our automata-based solution for satisficing as opposed to the non-existence of optimal solutions with temporal goals. This works yet again presents the benefits of automata-based approaches in quantitative reasoning over traditional numerical approaches.

While this chapter explored the automata-based solution for an integer discount factor only, in theory these results could be extended to non-integer discount factors as well as done in Chapter 7. This may come at the cost of a small approximation factor, but we can envision solving approximate satisficing using the approximate comparators for non-integer discount factors. Of course, the critical question here is to investigate the impact of the approximation in practical usage. 

This thesis introduces comparator automata, a novel technique based on automata for quantitative reasoning. The use of automata for quantitative reasoning is unique in its own, and infact counterintuitive as far as earlier work is concerned.

The challenges of lack of genralizability and separation-of-techniques adversely affect the practical viability of quantitative reasoning, and also prevent a clear understanding of similarities and dissimilarities across aggregate functions. The motivation behind the development of comparator automata is to addresses these challenges.

Our investigations on quantitative inclusion and quantitative games, especially an in-depth examination of the discounted-sum aggregation function, have established key theoretical results, improved upon the state-of-the-art in empirical performance, and indicated promise of approach in applications.

An undercurrent in the progress of comparator-based approaches is that they have bridged quantitative reasoning to qualitative reasoning. This is in stark contrast to separation-of-techniques in prior work where the two are segregated from one another.

As demonstrated in this thesis, this digression will prove to be of immense importance in the long run, since it allows quantitative reasoning to leverage the advances in the theory, practice and applications of qualitative reasoning. To the best of our knowledge, this thesis is the first work to do so.

Theory: Building on the foundations.

ω-regular comparators advocate for the treatment of quantitative properties as any other ω-regular property. As a result, it is likely that problems that have found success with an ω-regular property, such as a temporal logic [77] , could do so with ω-regular comparators as well. The study of quantitative games with imperfect information gives a glimpse of the hypothesis. One avenue of interest are probabilistic systems. So far, probabilistic verification and synthesis involves techniques from linear programming, optimization, solving sets of equalities and inequalities, and so on. Symbolic methods have been gaining in popularity in qualitative reasoning due to their scalability and efficiency advantages over explicit methods [35] . In symbol methods, the state space is typically represented with a logarithmic number of bits. Following this success, the symbolic reasoning has been adopted in quantitative reasoning. In most cases, however, the symbolic structures extend their qualitative counterparts.

For example, Binary Decision Diagram (BDDs) [34] are a popular symbolic datastructure in qualitative domains. These have been extended to MTBDDs and ADDs to reason about quantitative properties. Since the development of quantitative symbolic data-structures and their algorithms are still in nascent stages, relative to their qualitative counterparts, their full impact is yet to be discovered.

While investigations on quantitative symbolic methods are underway, comparators offer an alternate perspective. Since comparators are automata, existing symbolic methods such as BDD, SAT and SMT can be applied. Albeit promising, only a comparative analysis can reveal the strengths and weakness of quantitative and comparator-based qualitative symbolic methods, and expose areas for improvement.

Application: White-boxing RL engines.

Owing to the unprecedented rise of machine learning algorithms, reinforcement learning (RL) has come to the forefront in the design of controllers for robotics, autonomous vehicles, and several control-related domestic appliances. Due to their safety-critical nature of these applications, it has become crucial to white-box the underlying RL algorithms. In response, the formal methods community has begun looking into the quantitative and qualitative aspects of RL algorithms.

This direction is particularly exciting for comparator automata. First of all, several RL engines make use of discounted-sum. Secondly, with comparators we can combine quantitative and qualitative properties with rigorous guarantees, as demonstrated in this thesis. Having said that, the capabilities of comparators will have to be extended further to be applicable to the domain. For example, RL algorithms have to work under the assumption that the reward function is known partially only. To address this, comparators will have to be effective at verification and synthesis under sum. Let n > 0 be the length of A, and ε = 2 −k .

By Cauchy-Schwarz inequality, we get

Since (1 + x) 1

x < e x , for all 0 < x ≤ 1 < n · µ 2 · (Σ n−1 i=0 (1 − 1 e ε·i ) 2 ) ≤ n · µ 2 · (Σ n−1 i=0 (1 − e −ε·i ) 2 ) ≤ n · µ 2 · n · (1 − e −ε·n ) 2 ) as i < n Since 1 − x < e −x holds for all x > 0 < n · µ 2 · n · (ε · n) 2 ) = ε · µ · n 2

The bound above is simply an upper bound. It will be good to tighten the bound to O(ε · µ · n).

111 6.2.1 Safety and co-safety automata

114 6.3.1 Safety and co-safety comparison languages and comparators

123 6.4.1 DS comparison languages and their safety/co-safety properties 124 6.4.2 Deterministic DS comparator for integer discount-factor . . . 126 6.4.3 QuIPFly: DS inclusion with integer discount factor

154 7.4.2 Comparator for lower approximation of DS

168 7.5.2 Comparator automata for upper approximation of DS

Anytime algorithm for DS inclusion

µ − 1, µ} • Transition function δ ⊆ S × Σ → S where (s, a, t) ∈ δ then: 1. If s = T l or s = T u , then t = s for all a ∈ Σ 2. Else, let roundUpper(d · s · 2 −(p+k) + a, k, p) = i · 2 −(p+k) for an integer i (a) If T l ≤ i ≤ T u , then t = i (b) If i > T h , then t = T u (c) If i < T l , then t = T l compUpper(µ, k, p, R) accepts a finite weight sequence W ∈ Σ * iff DSUpper

Recall, given inputs P , Q, discount factor 1 < d < 2 1. If P is not DS-included in Q and DSInclusion(P, Q, d) terminates, then

Q holds and DSInclusion(P, Q, d) terminates, then DSInclusion(P, Q, d) returns P ⊆ Q = True

If DSInclusion(P, Q, d) does not terminate in the ε-th round in which the approximation factor is ε > 0, then P ⊆ Q + d · ε holds

Proof 68 We reason about anytimeInclusion as that is sufficient. Proof of (1.) and (2.): anytimeInclusion terminates only if either lowApproxDSInc

Multiplication of q with (q k−2 · n 1 + q k−3 p · n 2 + · · · + p k−2 n k−1 ). The later is bounded by (k − 1) · µ · p k−1 since |n i | ≤ µ and p > q. The cost of this operation is O(log((k −1)·µ·p k−1 )·log(p)) = O(((k −1)

Multiplication of p with p k−2 . Its cost is O((k − 2) · (log p) 2 )

DS comparison languages are safety languages for relations

Proof 81 Due to duality of safety/co-safety languages, it is sufficient to show that DS-comparison language with ≤ is a safety language

DS (W, d) > v. By assumption, every i-length prefix W [i] of W can be extended to a bounded weightsequence W [i] · Y i such that DS (W [i] · Y i , d) ≤ v

The contribution of tail sequences W [i . . . ] and Y i to the discounted-sum of W and W [i] · Y i , respectively diminishes exponentially as the value of i increases. In addition, since and W and W [i] · Y i share a common i-length DS (Y, d)

All tools have been implemented in C++. To overcome floating-point errors in VIOptimal and VISatisfice, the tools invoke open-source arbitrary precision arithmetic library GMP (GNU Multi-Precision) , and hence are a better fit than randomly generated graph games. In all, we create 291 benchmarks. We use temporal specifications used in prior literature in synthesis from temporal specifications

Since all elements of B are bounded by d − 1, DS (B, d) can be represented as an ω-word as follows: Let B = B[0], B[1] . . . , then its ω-word representation in base d is given by + · (Int(DS (B, d), d), Frac(DS (B, d), d)) where Int(DS (B, d)

Therefore, application of transducer T to Büchi automaton B will result in a Büchi automaton over the alphabet Σ×AlphaRep(d) such that for all A ∈ Σ ω the automaton accepts (A, rep(DS (A, d), d)). This is exactly the DS-function automaton over input alphabet Σ and integer base d > 1. Therefore, the discounted-sum aggregate function with integer discount

Recall, this proof works only for the discounted-sum aggregate function with inte-A be a non-empty, bounded, finite, weightsequence

GLPK

Simulation subsumption in ramsey-based büchi automata universality and inclusion testing

Advanced ramsey-based büchi automata inclusion testing

On the theory of infinitely repeated games with discounting. Econometrica

Automatically generating problems and solutions for natural deduction

On the representation of numbers in a rational base

Whatś decidable about weighted automata?

Formalizing and reasoning about quality

Recognizing safety and liveness

An introduction to the streamqre language

On omega-languages defined by mean-payoff conditions

Reasoning about online algorithms with weighted automata

Fast equilibrium computation for infinitely repeated games

An improved algorithm for discounted payoff games

Probabilistic model checking

Model checking probabilistic systems

Principles of model checking

Quantitative verification of neural networks and its security applications

Algorithmic analysis of regular repeated games

Automata vs linear-programming discounted-sum inclusion

Comparator automata in quantitative verification

Comparator automata in quantitative verification (full version)

Hybrid compositional reasoning for reactive synthesis from finite-horizon specifications

Safety and co-safety comparator automata for discounted-sum inclusion

Better quality in synthesis through quantitative objectives

Graph games and reactive synthesis

Exact and approximate determinization of discounted-sum automata

The target discounted-sum problem

Faster algorithms for mean-payoff games. Formal methods in system design

Graph-based algorithms for boolean function manipulation. Computers

Symbolic model checking: 1020 states and beyond. Information and computation

Finite LTL synthesis as planning

Verifying quantitative properties using bound functions

Expressiveness and closure properties for quantitative languages

Quantitative languages

Meanpayoff parity games

Quantitative fair simulation games. Information and Computation

Regular real analysis

A static analysis for quantifying information flow in a simple imperative language

Automatic verification of finite-state concurrent systems using temporal logic specifications

Theory of ω-languages: Characterizations of ω-context-free languages

Model checking discounted temporal properties

Linear and branching metrics for quantitative transition systems

Discounting the future in systems theory

An analysis of time-dependent planning

Jean-François Raskin, and Szymon Toruńczyk. Energy and mean-payoff games with imperfect information

Antichain algorithms for finite automata

Handbook of weighted automata

Qlose: Program repair with quantitative objectives

Minimising deterministic büchi automata precisely using sat solving

Quantitative languages defined by functional automata

Model checking quantitative hyperproperties

Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor

Reactive synthesis for finite tasks under resource constraints

IEEE/RSJ International Conference on

Formal languages and their relation to automata

Syntax-guided synthesis with quantitative syntactic objectives

A characterization of the minimum cycle mean in a digraph

The equality problem for rational series with multiplicities in the tropical semiring is undecidable

Model checking of safety properties

Weak alternating automata are not that weak

Synthesis with incomplete informatio

Quantitative verification: Models, techniques and tools

Probabilistic symbolic model checking with PRISM: A hybrid approach

Advances and challenges of probabilistic model checking

This time the robot settles for a cost: A quantitative approach to temporal logic planning with partial satisfaction

Efficient minimization of deterministic weak ω-automata

Advanced automata minimization

Weighted automata algorithms

Weighted finite-state transducers in speech recognition

Linear automaton transformations

A course in game theory

Implementation of Symbolic Model Checking for Probabilistic Systems

The temporal logic of programs

Markov decision processes. Handbooks in operations research and management science

Principles of mathematical analysis

On the complexity of ω-automata

Relationships between nondeterministic and deterministic tape complexities

Formal specification for deep neural networks

Stochastic games

Automated feedback generation for introductory programming assignments

Introduction to reinforcement learning

Experimental evaluation of classical automata constructions

Automata, logics, and infinite games: A guide to current research

The büchi complementation saga

An automata-theoretic approach to automatic program verification

Dcsynth: Guided reactive synthesis with soft requirements

Correct-by-synthesis reinforcement learning with temporal logic constraints

A symbolic approach to safety ltl synthesis

The complexity of mean payoff games on graphs

Let µ > 0, d = 1 + 2 −k , ε = 2 −p be the upper bound, discount factor and approximation factor, respectively. Let T l be the largest integer such that T l · 2 −(p+k) ≤ −µ · 2 k (Lemma 7.4-Part 1). Let T u be the smallest integer such that T u ·2 −(p+k) ≥ µ·2 k +2 −p (Lemma 7.4 -Part 2). For relation R ∈ {≤, ≥}, construct DFA compLow(µ, k, p, R) = (S, s I , Σ, δ, F) as follows: 

The proof shows that the final state of a the run of a word represents its lower gap value. For this we show three things: Let s f be the final state of the run.(a). if T l < s f < T u then its lower gap value is s f · r, (b). if T l ≥ s f then the lower Proof 72 The result can proven by unrolling the expression of DS of a cost-sequence on a lasso path. The discounted sum of l is given as follows:Taking closed form expression of the term in the parenthesis, we getThe essence of Lemma 8.2 is that the DS of the cost-sequence of a lasso, simple or non-simple, is a rational number. From here, we can immediately derive Step 2:q be the discount factor. Then the optimal value of the game is a rational number with denominator at most (p |V | − q |V | ) · (p |V | )

The optimal value is obtained on a simple lasso as both players have Appendix A

We use the ω-regular comparator for DS-aggregate function for integer discount-factor to prove that discounted-sum with integer discount-factors is an ω-regular aggregate function.Theorem A.1 Let d > 1 be a non-integer, rational discount-factor. The discountedsum aggregate function with discount factor d is not ω-regular.Proof 89 Immediate from Lemma 4.1 and Theorem 4. 9 .Theorem A.2 Let d > 1 be an integer discount-factor. The discounted-sum aggregate function with discount-factor d is ω-regular under base d.