key: cord-0046129-qcpy9ycw
authors: Le Roux, Stéphane
title: Time-Aware Uniformization of Winning Strategies
date: 2020-06-24
journal: Beyond the Horizon of Computability
DOI: 10.1007/978-3-030-51466-2_17
sha: 4c5fc32338653a8751c7c33e0aa2703682e90b32
doc_id: 46129
cord_uid: qcpy9ycw

Two-player win/lose games of infinite duration are involved in several disciplines including computer science and logic. If such a game has deterministic winning strategies, one may ask how simple such strategies can get. The answer may help with actual implementation, or to win despite imperfect information, or to conceal sensitive information especially if the game is repeated. Given a concurrent two-player win/lose game of infinite duration, this article considers equivalence relations over histories of played actions. A classical restriction used here is that equivalent histories have equal length, hence time awareness. A sufficient condition is given such that if a player has winning strategies, she has one that prescribes the same action at equivalent histories, hence uniformization. The proof is fairly constructive and preserves finiteness of strategy memory, and counterexamples show relative tightness of the result. Several corollaries follow for games with states and colors.

In this article, two-player win/lose games of infinite duration are games where two players concurrently and deterministically choose one action each at every of infinitely many rounds, and "in the end" exactly one player wins. Such games (especially their simpler, turn-based variant) have been used in various fields ranging from social sciences to computer science and logic, e.g. in automata theory [5, 15] and in descriptive set theory [12] .

Given such a game and a player, a fundamental question is whether she has a winning strategy, i.e. a way to win regardless of her opponent's actions. If the answer is positive, a second fundamental question is whether she has a simple winning strategy. More specifically, this article investigates the following strategy uniformization problem: consider an equivalence relation ∼ over histories, i.e. over sequences of played actions; if a player has a winning strategy, has she a winning ∼-strategy, i.e. a strategy prescribing the same action after equivalent histories? This problem is relevant to imperfect information games and beyond.

This article provides a sufficient condition on ∼ and on the winning condition of a player such that, if she has a winning strategy, she has a winning ∼-strategy. The sufficient condition involves time awareness of the player, but perfect recall (rephrased in Sect. 2) is not needed. On the one hand, examples show the tightness of the sufficient condition in several directions; on the other hand, further examples show that the sufficient condition is not strictly necessary.

The proof of the sufficient condition has several features. First, from any winning strategy s, it derives a winning ∼-strategy s • f . The map f takes as input the true history of actions, and outputs a well-chosen virtual history of equal length. Second, the derivation s → s • f is 1-Lipschitz continuous, i.e., reactive, as in reactive systems. (Not only the way of playing is reactive, but also the synthesis of the ∼-strategy.) Third, computability of ∼ and finiteness of the opponent action set make the derivation computable. As a consequence in this restricted context, if the input strategy is computable, so is the uniformized output strategy. Fourth, finite-memory implementability of the strategies is preserved, if the opponent action set is finite. Fifth, strengthening the sufficient condition by assuming perfect recall makes the virtual-history map f definable incrementally (i.e. by mere extension) as the history grows. This simplifies the proofs and improves the memory bounds.

The weaker sufficient condition, i.e. when not assuming perfect recall, has an important corollary about concurrent games with states and colors: if any winning condition (e.g. not necessarily Borel) is defined via the colors, a player who can win can do so by only taking the history of colors and the current state into account, instead of the full history of actions. Finiteness of the memory is also preserved, if the opponent action set is finite. Two additional corollaries involve the energy winning condition or a class of winning conditions laying between Büchi and Muller.

Both the weaker and the stronger sufficient conditions behave rather well algebraically. In particular, they are closed under arbitrary intersections. This yields a corollary involving the conjunction of the two aforementioned winning conditions, i.e., energy and (sub)Muller.

Finding sufficient conditions for strategy uniformization may help reduce the winning strategy search space; or help simplify the notion of strategy: instead of expecting a precise history as an input, it may just expect an equivalence class, e.g. expressed as a simpler trace.

The strategy uniformization problem is also relevant to protagonist-imperfectinformation games, where the protagonist cannot distinguish between equivalent histories; and also to antagonist-imperfect-information games, where the protagonist wants to behave the same after as many histories as possible to conceal information from her opponent or anyone (partially) observing her actions: indeed the opponent, though losing, could try to lose in as many ways as possible over repeated plays of the game, to learn the full strategy of Player 1, i.e. her capabilities. In connection with the latter, the longer version [11] of this article studies the strategy maximal uniformization problem: if there is a winning strategy, is there a maximal ∼ such that there is a winning ∼-strategy? A basic result is proved (there but not here) and examples show its relative tightness.

Related Works. The distinction between perfect and imperfect information was already studied in [16] for finite games. Related concepts were clarified in [9] by using terms such as information partition and perfect recall: this article is meant for a slightly more general setting and thus may use different terminologies.

I am not aware of results similar to my sufficient condition for universal existence, but there is an extensive literature, starting around [18] , that studies related decision problems of existence: in some class of games, is the existence of a uniform winning strategy decidable and how quickly? Some classes of games come from strategy logic, introduced in [4] and connected to information imperfectness, e.g., in [2] . Some other classes come from dynamic epistemic logic, introduced in [8] and connected to games, e.g., in [20] and to decision procedures, e.g., in [14] . Among these works, some [13] have expressed the need for general frameworks and results about uniform strategies; others [3] have studied subtle differences between types of information imperfectness.

Imperfect information games have been also widely used in the field of security, see e.g. the survey [19, Section 4.2] . The aforementioned strategy maximal uniformization problem could be especially relevant in this context.

Structure of the Article. Section 2 presents the main results on the strategy uniformization problem; Sect. 3 presents various corollaries about games with states and colors; and Sect. 4 presents the tightness of the sufficient condition in several directions. Proofs and additional sections can be found in [11] .

The end of this section discusses many aspects of the forthcoming definitions and results.

Definitions on Game Theory. In this article, a two-player win/lose game is a tuple A, B, W where A and B are non-empty sets and W is a subset of infinite sequences over A × B, i.e. W ⊆ (A × B) ω . Informally, Player 1 and Player 2 concurrently choose one action in A and B, respectively, and repeat this ω times. If the produced sequence is in W , Player 1 wins and Player 2 loses, otherwise Player 2 wins and Player 1 loses. So W is called the winning condition (of Player 1).

The histories are the finite sequences over A × B, denoted by (A × B) * . The opponent-histories are B * . The runs and opponent-runs are their infinite versions.

A Player 1 strategy is a function from B * to A. Informally, it tells Player 1 which action to choose depending on the opponent-histories, i.e. on how Player 2 has played so far.

The induced history function h : ((B * → A) × B * ) → (A × B) * expects a Player 1 strategy and an opponent-history as inputs, and outputs a history.

It is defined inductively: h(s, ) := and h(s, β · b) := h(s, β) · (s(β), b) for all (β, b) ∈ B * × B. Informally, h outputs the very sequence of pairs of actions that are chosen if Player 1 follows the given strategy while Player 2 plays the given opponent-history. Note that β → h(s, β) preserves the length and the prefix relation, i.e. ∀β,

The function h is extended to accept opponent-runs (in B ω ) and then to output runs: h(s, β) is the only run whose prefixes are the h(s, β ≤n ) for n ∈ N, where β ≤n is the prefix of β of length n. A Player 1 strategy s is a winning

Definitions on Equivalence Relations over Histories. Given a game A, B, W , a strategy constraint (constraint for short) is an equivalence relation over histories. Given a constraint ∼, a strategy s is said to be a ∼-

Informally, a ∼-strategy behaves the same after equivalent histories that are compatible with s.

Useful predicates on constraints, denoted ∼, are defined below.

Note that the first three predicates above constrain only (the information available to) the strategies, while the last two constrain also the winning condition. 

If M is finite, s is said to be a finite-memory strategy.

Word Pairing: for all n ∈ N, for all u, v ∈ Σ n , let u v :

If moreover Q is finite, the constraint is said to be 2-tape-regular. Recognition of relations by several tapes was studied in, e.g., [17] . Note that 2-tape regularity of ∼ was called indistinguishability-based in [3] .

Main Results. Let us recall additional notions first. Two functions of domain Σ * that coincide on inputs of length less than n but differ for some input of length n are said to be at distance 1 2 n . In this context, a map from strategies to strategies (or to B * → B * ) is said to be 1-Lipschitz continuous if from any input strategy that is partially defined for opponent-histories of length up to n, one can partially infer the output strategy for opponent-histories of length up to n. Theorem 1. Consider a game A, B, W and a constraint ∼ that is time-aware and closed by adding a suffix. The two results below are independent.

If ∼ is also perfectly recalling and weakly W -closed, there exists a map f : Lemma 1 below shows that the five constraint predicates behave rather well algebraically. It will be especially useful when handling Boolean combinations of winning conditions. 1. If ∼ i is time-aware (resp. closed by adding a suffix, resp. perfectly recalling)

Comments on the Definitions and Results. In the literature, Player 1 strategies sometimes have type (A × B) * → A. In this article, they have type B * → A instead. Both options would work here, but the latter is simpler.

In the literature, Player 1 winning strategies are often defined as strategies winning against all Player 2 strategies. In this article, they win against all opponent runs instead. Both options would work here, but the latter is simpler.

Consider a game A, B, W and its sequentialized version where Player 1 plays first at each round. It is well-known and easy to show that a Player 1 strategy wins the concurrent version iff she wins the sequential version. I have two reasons to use concurrent games here, though. First, the notation is nicer for the purpose at hand. Second, concurrency does not rule out (semi-)deterministic determinacy of interesting classes of games 1 as in [1] and [10] , and using a sequentialized version of the main result to handle these concurrent games would require cumbersome back-and-forth game sequentialization that would depend on the winner. That being said, many examples in this article are, morally, sequential/turn-based games.

Strong W -closedness is indeed stronger than weak W -closedness, as proved in [11] . Besides these two properties, which relate ∼ and W , the other predicates on ∼ alone are classical when dealing with information imperfectness, possibly known under various names. For example, Closedness by adding a suffix is sometimes called the no-learning property.

However strong the strong W -closedness may seem, it is strictly weaker than the conjunction of perfect recall and weak W -closedness, as proved in [11] . This justifies the attributes stronger/weaker assumptions in Theorem 1. Note that the definition of strong W -closedness involves only the implication ρ ∈ W ⇒ ρ ∈ W , as opposed to an equivalence.

The update functions of memory-aware implementations have type M ×B → M , so, informally, they observe only the memory internal state and the opponent's action. In particular they do not observe for free any additional state of some system.

The notion of 2-tape recognizability of equivalence relations is natural indeed, but so is the following. An equivalence relation ∼ over Σ * is said to be 1-tape recognizable using memory Q if there exists an automaton (Σ, Q, q 0 , δ) such that u ∼ v iff δ + (u) = δ + (v). In this case there are at most |Q| equivalence classes. If Q is finite, ∼ is said to be 1-tape regular. When considering time-aware constraints, 2-tape recognizability is strictly more general, as proved in [11] , and it yields more general results. A detailed account can be read in [3] .

Here, the two notions of recognizability require nothing about the cardinality of the state space: what matters is the (least) cardinality that suffices. The intention is primarily to invoke the results with finite automata, but allowing for infinite ones is done at no extra cost.

In the memory part of Theorem 1.1, the Cartesian product involves only the accepting states F , but it only spares us one state: indeed, in a automaton that is 2-tape recognizing a perfectly recalling ∼, the non-final states can be safely merged into a trash state. In Theorem 1, however, the full M ∼ is used and followed by a powerset construction. So by Cantor's theorem, the memory bound in Theorem 1.2 increases strictly, despite finiteness assumption for B.

Generally speaking, 1-Lipschitz continuous functions from infinite words to infinite words correspond to (real-time) reactive systems; continuous functions correspond to reactive systems with unbounded delay; and computable functions to reactive systems with unbounded delay that can be implemented via Turing machines. Therefore both 1-Lipschitz continuity and computabilty are desirable over continuity (and both imply continuity).

In Theorem 1.1, the derived ∼-strategy is of the form s • f s , i.e. it is essentially the original s fed with modified inputs, which are called virtual opponenthistories. Theorem 1.1b means that it suffices to know s for opponent-history inputs up to some length to infer the corresponding virtual history map f s for inputs up to the same length. Theorem 1.1c means that for each fixed s, the virtual opponent-history is extended incrementally as the opponent-history grows. The assertations 1b and 1c do not imply one another a priori, but that they both hold implies Theorem 1.1d indeed; and Theorem 1.1d means that one can start synthesizing a ∼-strategy and playing accordingly on inputs up to length n already when knowing s on inputs up to length n. This process is even computable in the setting of Theorem 1.1e.

In Theorem 1.2, the derived ∼-strategy has a very similar form, but the f s no longer preserves the prefix relation since the perfect recall assumption is dropped. As a consequence, the virtual opponent-history can no longer be extended incrementally: backtracking is necessary. Thus there is no results that correspond to Theorems 1.1b and 1.1c, yet one retains both 1-Lipschitz continuity of the selfmap and its computability under suitable assumptions: Theorems 1.2b and 1.2c correspond to Theorems 1.1d and 1.1e, respectively.

In Lemma 1, constraints intersection makes sense since the intersection of equivalence relations is again an equivalence relation. This is false for unions; furthermore, taking the equivalence relation generated by a union of equivalence relations would not preserve weak or strong W -closedness.

It is sometimes convenient, for intuition and succinctness, to define a winning condition not as a subset of the runs, but in several steps via states and colors.

Given the current state, a pair of actions chosen by the players produces a color and determines the next state, and so on. The winning condition is then defined in terms of infinite sequences of colors. A, B, Q, q 0 , δ, C, col such that -A and B are non-empty sets (of actions of Player 1 and Player 2), -Q is a non-empty set (of states), (is a coloring function) .

Providing an arena with some W ⊆ C ω (a winning condition for Player 1) defines a game.

In such a game, a triple in Q × A × B is informally called an edge because it leads to a(nother) state via the udpate function δ. Between two states there are |A × B| edges. Note that the colors are on the edges rather than on the states. This is generally more succinct and it is strictly more expressive in the following sense: in an arena with finite Q, infinite A or B, and colors on the edges, infinite runs may involve infinitely many colors. However, it would never be the case if colors were on the states. The coloring function col is naturally extended to finite and infinite sequences over A × B. By induction, col ++ ( ) := , and col ++ (a, b) := col (q 0 , a, b) , and col ++ (ρ(a, b) ) := col ++ (ρ)col(δ + (ρ), a, b) . Then col ∞ (ρ) is the unique sequence in C ω such that col ++ (ρ ≤n ) is a prefix of col ∞ (ρ) for all n ∈ N. Note that |col ++ (ρ)| = |ρ| for all ρ ∈ (A × B) * .

The histories, strategies, and winning strategies of the game with states and colors are then defined as these of A, B, (col ∞ ) −1 [W ] , which is a game as defined in Sect. 2. Conversely, a game A, B, W may be seen as a game with states and colors A, B, Q, q 0 , δ, C, col, W where C = A × B, and Q = {q 0 }, and col(q 0 , a, b) = (a, b) for all (a, b) ∈ A × B.

Recall that the update functions of memory-aware implementations have type M × B → M , so they do not observe the states in Q for free. This difference with what is customary in some communities is harmless in terms of finiteness of the strategy memory, though.

A Universal Result for Concurrent Games. Corollary 1 below considers games with states and colors. Corollary 1.1 (resp. 1.2) is a corollary of Theorem 1.1 (resp. Theorem 1.2). It says that if there is a winning strategy, there is also one that behaves the same after histories of pairs of actions that yield the same sequence of states (resp. the same current state) and the same sequence of colors. Note that no assumption is made on the winning condition in Corollary 1: it need not be even Borel. Consider a game with states and colors G = A, B, Q, q 0 , δ, C, col, W where Player 1 has a winning strategy s. The two results below are independent.

s) that satisfies the following for all β, β ∈ B * .

Furthermore, if s can be implemented via memory space M , so can s ; and if B is finite and ∼ is computable, s is obtained in a computable manner from s. 2. Then Player 1 has a winning strategy s (obtained in a Lipschitz manner from s) that satisfies the following for all β, β ∈ B * .

Furthermore, if s can be implemented via memory M , then s can be implemented via memory size 2 |M |(|Q| 2 +1) ; and if B is finite and ∼ is computable, s is obtained in a computable manner from s.

On the one hand, Corollary 1 exemplifies the benefit of dropping the perfect recall assumption to obtain winning strategies that are significantly more uniform. On the other hand, it exemplifies the memory cost of doing so, which corresponds to the proof-theoretic complexification from Theorem 1.1 to Theorem 1.2, as is discussed in [11] .

To prove Corollary 1.2 directly, a natural idea is to "copy-paste", i.e., rewrite the strategy at equivalent histories. If done finitely many times, it is easy to prove that the derived strategy is still winning, but things become tricky if done infinitely many times, as it should.

Note that in Corollary 1, assumptions and conclusions apply to both players: indeed, since no assumption is made on W , its complement satisfies all assumptions, too.

A consequence of Corollary 1 is that one could define state-color strategies as functions in (Q * × C * ) → A or even (Q × C * ) → A, while preserving existence of winning strategies. How much one would benefit from doing so depends on the context.

In the remainder of this section, only the weaker sufficient condition, i.e., Theorem 1.2 is invoked instead of Theorem 1.1.

Between Büchi and Muller. In Corollary 1 the exact sequence of colors mattered, but in some cases from formal methods, the winning condition is invariant under shuffling of the color sequence. Corollary 2 below provides an example where Theorem 1 applies (but only to Player 1). G = A, B, Q, q 0 , δ, C, col, W with finite Q and C, and where W is defined as follows: let (C i ) i∈I be subsets of C, and let γ ∈ W if there exists i ∈ I such that all colors in C i occur infinitely often in γ.

If Player 1 has a winning strategy, she has a finite-memory one that behaves the same if the current state and the multiset of seen colors are the same.

Note that the games defined in Corollary 2 constitute a subclass of the concurrent Muller games, where finite-memory strategies suffice [7] , and a superclass of the concurrent Büchi games, where positional (aka memoryless) strategies suffice. In this intermediate class from Corollary 2, however, positional strategies are not sufficient: indeed, consider the three-state one-player game in Fig. 1 where q 1 and q 2 must be visited infinitely often. As far as I know, Corollary 2 is not a corollary of well-known results, although the complement of the winning condition therein can be expressed by a generalised Büchi automaton. Energy Games. The energy winning condition relates to real-valued colors. It requires that at every finite prefix of a run, the sum of the colors seen so far is nonnegative. More formally, ∀ρ ∈ (A × B) ω , ρ ∈ W ⇔ ∀n ∈ N, 0 ≤ col ++ (ρ ≤n ). Corollary 3 is weaker than the well-known positional determinacy of turnbased energy games, but its proof will be reused in that of Corollary 4. Conjunction of Winning Conditions. Corollary 4 below strengthens Corollary 2 (finite-memory aside) by considering the conjunction of the original Muller condition and the energy condition, which works out by Lemma 1.2.

Consider a game with states and colors G = A, B, Q, q 0 , δ, C × R, col, W with finite Q and C and where W ⊆ (C × R) ω is defined as follows: let (C i ) i∈I be subsets of C, and let γ ∈ W if there exists i ∈ I such that all colors in C i occur infinitely often in π 1 (γ) (the sequence of the first components) and if the energy level (on the second component) remains non-negative throughout the run, i.e. π 2 • col ++ (γ ≤n ) for all n ∈ N. If Player 1 has a winning strategy, she has one that behaves the same if the current state, the multiset of seen colors, and the energy level are the same.

The diversity of the above corollaries which follow rather easily from Theorem 1, especially Theorem 1.2, should suggest its potential range of application. Limitations and Opportunity for Meaningful Generalizations. Despite its relative tightness, Theorem 1 does not imply all known results that can be seen as instances of strategy uniformization problems, so there is room for meaningful generalizations. E.g., due to time awareness requirement, Theorem 1 does not imply positional determinacy of parity games [5, 15] , where two histories are equivalent if they lead to the same state. Nor does it imply countable compactness of first-order logic [6] , which is also an instance of a uniformization problem: Let (ϕ n ) n∈N be first-order formulas, and define a turn-based game: Spoiler plays only at the first round by choosing m ∈ N. Then Verifier gradually builds a countable structure over the signature of (ϕ n ) n∈N . More specifically, at every round she either chooses the value of a variable, or the output value of a function at a given input value, or the Boolean value of a relation for a given pair of values. Only countably many pieces of information are needed to define the structure, and one can fix an order (independent of m) in which they are provided. Verifier wins if the structure she has defined is a model of ∧ 0≤k≤m ϕ k . Let all histories of equal length be ∼-equivalent. Compactness says that if each ∧ 0≤k≤m ϕ k has a model, so does ∧ 0≤k ϕ k . Said otherwise, if Verifier has a winning strategy, she has a winning ∼-strategy, i.e. independent of Spoiler's first move. This ∼ satisfies all the conditions of Theorem 1.1 but weak W -closedness: the premise (∀n ∈ N, ρ ≤n ∼ ρ ≤n ) holds by universality, but the conclusion (ρ ∈ W ⇔ ρ ∈ W ) is false since a model for ∧ 0≤k≤m ϕ k need not be a model for ∧ 0≤k≤m+1 ϕ k .

Determinacy in discrete-bidding infinite-duration games

Strategy logic with imperfect information

Observation and distinction. representing information in infinite games

Strategy logic

Tree automata, mu-calculus and determinacy

Die vollständigkeit der axiome des logischen funktionenkalküls

Trees, automata, and games

Knowledge and belief: an introduction to the logic of the two notions

Extensive games. Proc. Nat. Acad. Sci

Concurrent games and semi-random determinacy. In: 43rd International Symposium on Mathematical Foundations of Computer Science, MFCS 2018

Time-aware uniformization of winning strategies

Borel determinacy

A general notion of uniform strategies

Reachability games in dynamic epistemic logic

Games with forbidden positions

Theory of Games and Economic Behavior

Finite automata and their decision problems

The complexity of two-player games of incomplete information

A survey of game theory as applied to network security

Games in dynamic-epistemic logic

Acknowledgment. This article benefited from a useful conversation with François Schwarzentruber and Stéphane Demri, and from careful reading and comments by several anonymous referees.