key: cord-0054929-eo0yr1nq
authors: Löding, Christof; Pirogov, Anton
title: Ambiguity, Weakness, and Regularity in Probabilistic Büchi Automata
date: 2020-04-17
journal: Foundations of Software Science and Computation Structures
DOI: 10.1007/978-3-030-45231-5_27
sha: 73d50341d2b787fad06412d9f093d7c545d3ecc8
doc_id: 54929
cord_uid: eo0yr1nq

Probabilistic Büchi automata are a natural generalization of PFA to infinite words, but have been studied in-depth only rather recently and many interesting questions are still open. PBA are known to accept, in general, a class of languages that goes beyond the regular languages. In this work we extend the known classes of restricted PBA which are still regular, strongly relying on notions concerning ambiguity in classical [Formula: see text] -automata. Furthermore, we investigate the expressivity of the not yet considered but natural class of weak PBA, and we also show that the regularity problem for weak PBA is undecidable.

Probabilistic finite automata (PFA) are defined similarly to nondeterministic finite automata (NFA) with the difference that each transition is equipped with a probability (a value between 0 and 1), such that for each pair of state and letter, the probabilities of the corresponding outgoing transitions sum up to 1. PFA have been investigated already in the 1960ies in the seminal paper of Rabin [18] . But while the development of the theory of automata on infinite words also started around the same time [7] , the model of probabilistic automata on infinite words has first been studied systematically in [3] . The central model in this theory is the one of probabilistic Büchi automata (PBA), which are syntactically the same as PFA. The acceptance condition for runs is defined as for standard nondeterministic Büchi automata (NBA): a run on an infinite word is accepting if it visits an accepting state infinitely often (see [23, 24] for an introduction to the theory of automata on infinite words). In general, for probabilistic automata one distinguishes different criteria of when a word is accepted. In the positive semantics, it is required that the probability of the set of accepting runs is greater than 0, in the almost-sure semantics it has to be 1, and in the threshold semantics it has to be greater than a given value λ between 0 and 1. It is easy to see that PFA with positive or almost-sure semantics can only accept regular languages, because these conditions correspond to the fact that there is an accepting run or This work is supported by the German research council (DFG) Research Training Group 2236 UnRAVeL that all runs are accepting. For infinite words the situation is different, because single runs on infinite words can have probability 0. Therefore, the existence of an accepting run is not the same as the set of accepting runs having probability greater than 0 (similarly, almost-sure semantics is not equivalent to all runs being accepting). And in fact, it turns out that PBA with positive (or almostsure) semantics can accept non-regular languages [3] . This naturally raises the question under which conditions a PBA accepts a regular language.

In [3] a subclass of PBA that accept only regular languages (under positive semantics) is introduced, called uniform PBA. The definition uses a semantic condition on the acceptance probabilities in end components of the PBA. A syntactic class of PBA that accepts only regular languages (under positive and almost-sure semantics) are the hierarchical PBA (HPBA) introduced in [8] . The state space of HPBA is partitioned into a sequence of layers such that for each pair of state and letter there is at most one transition that does not increase the layer. Decidability and expressiveness questions for HPBA have been studied in more detail in [11, 10] . While HPBA accept only regular languages for positive and almost-sure semantics, it is not very hard to come up with HPBA that accept non-regular languages under the threshold semantics [8, 11] (see also the example in Figure 2 (a) on page 531). Restricting HPBA further such that there are only two layers and all accepting states are on the first layer leads to a class of PBA (called simple PBA, SPBA) that accept only regular languages even under threshold semantics [9] .

In this paper, we are also interested in the question under which conditions PBA accept only regular languages. We identify syntactical patterns in the transition structure of PBA whose absence guarantees regularity of the accepted language. These patterns have been used before for the classification of the degree of ambiguity of NFA and NBA [25, 19, 16] . The degree of ambiguity of a nondeterministic automaton corresponds to the maximal number of accepting runs that a single input word can have. For NBA, the ambiguity can (roughly) be uncountable, countable, or finite. For positive semantics, we show that PBA whose transition structure corresponds to at most countably ambiguous NBA, accept only regular languages. For almost-sure semantics, we need a slightly stronger condition for ensuring regularity. But both classes that we identify are easily seen to strictly subsume the class of HPBA. For the emptiness and universality problems for these classes we obtain the same complexities as the ones for HPBA. In the case of threshold semantics, we show that finite ambiguity is a sufficient condition for regularity of the accepted language, generalizing a corresponding result for PFA from [12] . The class of finitely ambiguous PBA strictly subsumes the class of SPBA.

Besides the relation between regularity and ambiguity in PBA, we also investigate the class of weak PBA (abbreviated PWA). In weak Büchi automata, the set of accepting states is a union of strongly connected components of the automaton. We show that PWA with almost-sure semantics define the same class of languages as PBA with almost-sure semantics (which implies that with positive semantics PWA define the same class as probabilistic co-Büchi automata). This is in correspondence to results for non-probabilistic automata: weak automata with universal semantics (a word is accepted if all runs are accepting) define the same class as Büchi automata with universal semantics, and nondeterministic weak automata correspond to nondeterministic co-Büchi automata (see, e.g., [17] , where weak automata are called weak parity automata). Furthermore, it is known that universal Büchi automata, respectively nondeterministic co-Büchi automata, can be transformed into equivalent deterministic automata (with the same acceptance condition). An analogue of deterministic automata in the probabilistic setting are the so-called 0/1 automata, in which each word is either accepted with probability 0 or with probability 1. It is known that almost-sure PBA can be transformed into equivalent 0/1 PBA (see the proof of Theorem 4.13 in [4] ). Concerning weak automata, a language can be accepted by a deterministic weak automaton (DWA) if, and only if, it can be accepted by a deterministic Büchi and by a deterministic co-Büchi automaton (this follows from results in [14] , see [6] for a more direct construction). We show an analogous result in the probabilistic setting: The class of languages defined by 0/1 PWA corresponds to the intersection of the two classes defined by PWA with almostsure semantics and with positive semantics, respectively. It turns out that this class contains only regular languages, that is, 0/1 PWA define the same class as DWA.

We also show that the regularity problem for PBA is undecidable (the problem of deciding for a given PBA whether its language is regular). For PBA with positive semantics this is not surprising, as for those already the emptiness problem is undecidable [4] . However, for PBA with almost-sure semantics the emptiness and universality problems are decidable [1, 2, 8] . We show that regularity is undecidable already for PWA with almost-sure or with positive semantics. The proof also yields that it is undecidable for a fixed regular language whether a given PWA accepts this language.

This work is organized as follows. After introducing basic notations in Section 2 we first characterize various regular subclasses of PBA that we derive from ambiguity patterns in Section 3 and then we derive some related complexity results in Section 4. In Section 5 we present our results concerning weak probabilistic automata and in Section 6 we conclude.

First we briefly review some basic definitions.

If Σ is a finite alphabet, then Σ * is the set of all finite and Σ ω is the set of all infinite words w = w 0 w 1 . . . with w i ∈ Σ. For a word w we denote by w(i) the i-th symbol w i .

Classical automata used in this work have usually the shape (Q, Σ, ∆, Q 0 , F ), where Q is a finite set of states, Σ a finite alphabet, ∆ ⊆ Q × Σ × Q is the transition relation and Q 0 , F ⊆ Q are the sets of initial and final states, respectively.

We write ∆(p, a) := {q ∈ Q | (p, a, q) ∈ ∆} to denote the set of successors of p ∈ Q on symbol a ∈ Σ, and ∆(P, w) for P ⊆ Q, w ∈ Σ * with the usual meaning, i.e., states reachable on word w from any state in P .

A run of an automaton on a word w ∈ Σ ω is an infinite sequence of states q 0 , q 1 , . . . starting in some q 0 ∈ Q 0 such that (q i , w(i), q i+1 ) ∈ ∆ for all i ≥ 0. We say that a set of runs is separated (at time i) when the prefixes of length i of those runs are pairwise different.

As usual, an automaton is deterministic if |Q 0 | = 1 and |∆(p, a)| ≤ 1 for all p ∈ Q, a ∈ Σ, and nondeterministic otherwise. For deterministic automata we may use a transition function δ : Q × Σ → Q instead of a relation.

Probabilistic automata we consider have the shape (Q, Σ, δ, µ 0 , F ), i.e., the transition relation is replaced by a function δ : Q × Σ × Q → [0, 1] which for each state and symbol assigns a probability distribution on successor states (i.e. q∈Q δ(p, a, q) = 1 for all p ∈ Q, a ∈ Σ), and µ 0 : Q → [0, 1] with q∈Q µ 0 (q) = 1 is the initial probability distribution on states. The support of a distribution µ is the set supp(µ) := {x | µ(x) > 0}. Similarly as above, we may write δ(µ, w) and mean the resulting probability distribution after reading w ∈ Σ * , when starting with probability distribution µ.

For a probabilistic automaton A the underlying automaton A is given by recovering the transition relation ∆ := {(p, x, q) | δ(p, x, q) > 0} of positively reachable states and the initial state set Q 0 := supp(µ 0 ).

As usual, a run of an automaton for finite words is accepting if it ends in a final state. For automata on infinite words, run acceptance is determined by the Büchi (run visits infinitely many final states) or Co-Büchi (run visits finitely many final states) conditions. We write p x → q if there exists a path from p to q labelled by x ∈ Σ + and p → q if there exists some x such that p x → q. The strongly connected component (SCC) of p ∈ Q is scc(p) := {q ∈ Q | p = q or p → q and q → p}. The set SCCs(A) := {scc(q) | q ∈ Q} is the set of all SCCs and partitions Q. An SCC is accepting (rejecting) if all (no) runs that stay there forever are accepting. An SCC is useless if no accepting run can continue from there. An automaton is weak, if the set of final states is a union of its SCCs. In this case, Büchi and Co-Büchi acceptance are equivalent and we treat weak automata as Büchi automata.

A classical automaton is trim if it has no useless SCCs, whereas a probabilistic automaton is trim if it has at most one useless SCC, which is a rejecting sink that we canonically call q rej . We assume w.l.o.g. that all considered automata are trim, which also means that in an underlying automaton the sink q rej is removed.

We call transitions of probabilistic automata that have probability 1 deterministic and otherwise branching. If there are transitions p a → q and p a → q with q = q , we call this pattern a fork. Every branching transition clearly has at least one fork. We call a (p, q, q ) fork intra-SCC, if p, q, q are all in the same SCC, otherwise it is an inter-SCC fork. A run of an automaton is deterministic if it never goes through forks, and limit-deterministic if it goes only through finitely many forks. We say that two deterministic runs merge when they reach the same state simultaneously. For a finite run prefix ρ, we call all valid runs with this prefix continuations of ρ.

A classical automaton A accepts w ∈ Σ ω if there exists an accepting run on w, and the language L(A) recognized by A is the set of all accepted words. If P is a set of states of an automaton, we write L(P ) for the language accepted by this automaton with initial state set P . For sets consisting of one state q, we write L(q) instead of L({q}).

For a probabilistic automaton A and an input word w (finite or infinite), the transition structure of A induces a probability space on the set of runs of A on w in the usual way. We do not provide the details here but rather refer the reader not familiar with these concepts to [4] . In general, we write Pr(E) for the probability of a measurable event E in a probability space. For probabilistic automata, we consider positive, almost-sure and threshold semantics, i.e., an automaton accepts w if the probability of the set of accepting runs on w is > 0, =1 or >λ (for some fixed λ ∈]0, 1[), respectively. For an automaton A these languages are denoted by L >0 (A), L =1 (A) and L >λ (A), respectively, whereas L(A) := L(A ) is the language of the underlying automaton. A probabilistic automaton is 0/1 if all words are accepted with either probability 0 or 1 (in this case the languages with the different probabilistic semantics coincide).

To denote the type of an automaton, we use abbreviations of the form XYA (γ) where the type of transition structure is denoted by X ∈ { D (det.), N (nondet.), P (prob.) }, the acceptance condition is specified by Y ∈ { F (finite word), B (Büchi), C (Co-Büchi), W (Weak) }, and for probabilistic transitions the semantics for acceptance is given by γ ∈ {>0,=1,>λ, 0/1}.

By L (γ) (XYA) we denote the whole class of languages accepted by the corresponding type of automaton. If L is a set of languages, then L denotes the set of all complement languages (similarly, for a language L, we denote by L its complement), and BCl(L) the set of all finite boolean combinations of languages in L. We use the notion of regular language for finite words and for infinite words (the type of words is always clear from the context).

Ambiguity of automata refers to the number of different accepting runs on a word or on all words. An automaton is finitely ambiguous (on w) if there are at most k different accepting runs (on w) for some fixed k ∈ N, and in case of at most one accepting run it is called unambiguous. If on each word there are only finitely many accepting runs, but no constant upper bound over all words, then it is polynomially ambiguous if the number of different run prefixes that are possible for any word prefix of length n can be bounded by a polynomial in n, and otherwise exponentially ambiguous. Finally, if if there exist words that have infinitely many runs, but no word on which there are uncountably many accepting runs, then it is countably ambiguous, and otherwise it is uncountably ambiguous.

In [16] (see also [19] ), a syntactic characterization of those classes is presented for NBA by simple patterns of states and transitions. We define those patterns here and refer to [16] for further details. An automaton A has an IDA pattern if there exist two states p = q and a word v ∈ Σ * such that p v → p, p v → q and q v → q. If additionally q ∈ F , then this is also an IDA F pattern. Finally, A has an EDA pattern if there exists a state p and v ∈ Σ * such that there are two different paths p v → p, and if additionally p ∈ F , this is also an EDA F pattern. If a PBA has no EDA pattern, we call it flat, reflecting the naming of a similar concept in other kinds of transition systems (e.g. [15] ). The names IDA and EDA abbreviate "infinite/exponential degree of ambiguity", which they indicated in the original NFA setting, and we keep those names for consistency.

By k-NBA, n k -NBA, 2 n -NBA, ℵ 0 -NBA we denote the subsets of at most finitely, polynomially, exponentially and countably ambiguous NBA (and similarly for other types of automata). When speaking about ambiguity of some PBA A, we mean the ambiguity of the trimmed underlying NBA A .

In [8] , hierarchical PBA (HPBA) were identified as a syntactic restriction on PBA which ensures regularity under positive and almost-sure semantics. A PBA with a unique initial state is hierarchical, if it admits a ranking on the states such that at most one successor on a symbol has the same rank, and no successor has a smaller rank. A HPBA has k levels if it can be ranked with only k different values. Simple PBA (SPBA) were introduced in [9] and are restricted HPBA with two levels such that all accepting states are on level 0. Fig. 1 : Illustration of the automata classes with restricted ambiguity as presented for NBA in [16] , which are characterized by the absence of the state patterns IDA, IDA F , EDA, and EDA F and their relation to the restricted classes called "Hierarchical PBA" (HPBA) [8] and "Simple PBA" (SPBA) [9] . We identify classes in this hierarchy which can be seen as extensions "in spirit" of respectively SPBA and HPBA, subsuming them while also preserving their good properties, as e.g. definition by syntactic means, regularity under different semantics and several complexity results.

First, we show how HPBA relate to the ambiguity hierarchy, which can easily be derived by inspection of the definitions. A visual illustration is given in Figure 1 .

Proposition 1 (Relation of HPBA and the ambiguity hierarchy).

Starting from these observations, this work was motivated by the question whether the ambiguity restrictions, which were only implicit in HPBA and SPBA, can be used explicitly to get larger classes with good properties. In the following we will positively answer this question.

First, we observe that probabilistic automata can recognize regular languages even under severe ambiguity restrictions. Proof. As A is a (w.l.o.g. complete) DBA, there exists exactly one run on each word and all transitions when seen as PBA must have probability 1. Clearly this unique natural 0/1 PBA obtained from A accepts the same language under both probable and almost-sure semantics and it is trivially unambiguous.

Limit-deterministic NBA (LDBA) are NBA which are deterministic in all non-rejecting SCCs. The natural mapping of LDBA into PBA [4, Lemma 4.2] already trivially yields countably ambiguous automata (because the deterministic part of the LDBA cannot contain an EDA F pattern, which implies uncountable ambiguity [16] ). The following result shows that already unambiguous PBA under positive semantics suffice for all regular languages. Theorem 1. Let L ⊆ Σ ω be a regular language. Then there exists an unambiguous PBA B such that L >0 (B) = L.

Proof (sketch). Let A = (Q, Σ, δ, q 0 , c) be a deterministic parity automaton accepting L, i.e., a finite automaton with priority function c : Q → {1, . . . , m} such that w ∈ L(A) iff the smallest priority assigned to a state on the unique run of A on w which is seen infinitely often is even.

We construct an unambiguous LDBA for L, which then easily yields a PBA >0 by assigning arbitrary probabilities ([4, Lemma 4.2]) without influencing the ambiguity. If the parity automaton A has m priorities, the LDBA B can be obtained by taking m+1 copies, where m of them are responsible for one priority each, and one is modified to guess which priority i on the input word is the most important one appearing infinitely often along the run of A, and correspondingly switch into the correct copy. This switching is done unambiguously for the first position after which no priority more important than i appears.

First we establish a result for flat PBA, i.e. PBA that have no EDA pattern. In automata without EDA pattern there are no states which are part of two different cycles labeled by the same finite word. Even though we defined flat PBA by using an ambiguity pattern, the set of flat PBA does not correspond to an ambiguity class, but it is useful for our purposes due to the following property: Lemma 1. If A is a flat PBA and w ∈ Σ ω , then the probability of a run of A on w to be limit-deterministic is 1.

Proof. Let Runs(A, w) denote the set of all runs of A on w and nldRuns(A, w) denote the subset containing all such runs that are not limit-deterministic. As A is flat, it has no EDA and thus also no EDA F pattern, hence A is at most countably ambiguous (by [16] ). Moreover, there are not only at most countably many accepting runs on any word, but also countably many rejecting runs (which can be seen by a simple generalization of [16, Lemma 4] ). But as all runs are disjoint events, each run ρ that uses infinitely many forks has probability 0, and the total number of runs is countable, we can see that

The following lemma characterizes acceptance of PBA under extremal semantics with restricted ambiguity and is crucial for the constructions in the following sections:

Lemma 2 (Characterizations for extremal semantics). Let A be a PBA.

1. If A is at most countably ambiguous, then w ∈ L >0 (A) ⇔ there exists an accepting run on w that is limit-deterministic. 2. If there are finitely many accepting runs of A on w, then w ∈ L =1 (A) ⇔ all runs on w are accepting and limit-deterministic. 3. If A is flat, then w ∈ L =1 (A) ⇔ there is no limit-deterministic rejecting run on w.

Proof. (1.) : For contradiction, assume that every accepting run on w goes through forks infinitely often. But then the probability of every individual accepting run on w is 0. Each run is a measurable event (it is a countable intersection of finite prefixes) and clearly disjoint from other runs, as two different runs must eventually differ after a finite prefix. But as the number of accepting runs is countable by assumption, by σ-additivity it follows that the probability of all accepting runs is also 0, contradicting the fact that w ∈ L >0 (A).

For the other direction, pick a limit-deterministic accepting run ρ of A on w and let uv = w and q ∈ Q such that the state of ρ after reading u is q and there are no forks visited on v. Clearly, the probability to be in q after u in a run of A is positive (because u is finite), and the probability that A continues like ρ from q on v is 1. Hence, the probability of ρ is positive.

(2.) : The (⇐) direction is obvious. We now proceed to show (⇒). Take some time t after which all accepting runs on w separated. Assume that some accepting run ρ is not limit-deterministic. But then ρ goes through infinitely many forks after t which with positive probability lead to a successor from which the probability to accept is 0, and the probability of following ρ is also 0. As the probability to follow ρ until time t is positive, but after that the probability to accept is 0, this implies that there is a positive probability that A rejects w. Therefore, all accepting runs on w must be limit-deterministic. Now assume that some run ρ on w is rejecting. Following this run until the time at which ρ is separated from all accepting runs has positive probability and all continuations must be also rejecting, so A must reject w.

(3.) : Clearly (⇒) holds, because a limit-deterministic rejecting run has positive probability, i.e., if such a run exists on w, then A cannot accept almost surely. For (⇐), observe that because A is flat, we know by Lemma 1 that with probability 1 runs are limit-deterministic. Hence, if there exists no limitdeterministic rejecting run on w (which would have positive probability), then with probability 1 runs are limit-deterministic and accepting.

Using these characterizations, we can provide simple constructions from probabilistic to classical automata.

Theorem 2. Let A be a PBA that is at most countably ambiguous.

Then L >0 (A) is a regular language.

Proof (sketch). An NBA construction taking two copies of the PBA, where in the first copy no state is accepting and the second copy has no forks, with the purpose of guessing a limit-deterministic accepting run.

is not regular, then it contains an EDA F pattern.

Theorem 3. Let A be a PBA that is at most exponentially ambiguous or flat. Then L =1 (A) is regular and recognizable by DBA.

Proof (sketch). Both cases (exp. ambiguous or flat) shown using a deterministic breakpoint construction resulting in a DBA. In one case it checks whether all runs are accepting, in the other it checks that there are no limit-deterministic rejecting runs.

is not regular, then A contains both an EDA and an IDA F pattern.

The corollaries above follow directly from the theorems and the syntactic characterization of ambiguity classes [16] . The following proposition states that these characterizations of regularity in terms of the ambiguity patterns are tight. 

where # x (w) denotes the number of occurrences of x ∈ Σ in w ∈ Σ ω . (b) A family of PBA P λ from [4] such that L >0 (P λ ) is not regular for any λ ∈ R. (c) A family of PWAP λ (closely related to [4, Fig. 6 ]) such that L =1 (P λ ) is not regular for any λ ∈ R.

There exist PBA... Fig. 3 ], depicted in Figure 2 (b), accepts non-regular languages under positive semantics and clearly contains an EDA F pattern, e.g. there are two different paths from p 0 to p 0 on the word aab.

(2.) The automata family depicted in Figure 2 (c) is a simple modification of the PBA family depicted in [4, Fig. 6 ] and recognizes the same non-regular languages under almost-sure semantics. It does not contain an EDA F pattern, because the accepting state is a sink, but it does contain an IDA F and an EDA pattern (both e.g. on aab), so it is countably ambiguous and not flat.

This completes our classification of regular subclasses of PBA under extremal semantics that are defined by ambiguity patterns, showing that going beyond the restricted classes presented above (by allowing more patterns) in general leads to a loss of regularity.

Notice that the presented constructions do not track exact probabilities, just whether transitions have a probability > 0 or = 1. This is a noteworthy observation, as in general, the probabilities do matter for PBA, as shown in [4, Thm 

In this section we consider PBA under threshold semantics and we will see that in this setting, we lose regularity much earlier than in the case of extremal semantics, but there is still the large and natural subclass of finitely ambiguous PBA that retains regularity. Before we can show this, we need to derive a suitable characterization of such languages.

We derive it from the following simple observation, which was also used more implicitly in the proof that Simple HPBA with threshold semantics are equivalent to DBA in [9] . 

Proof. Observe that given a finite set of real numbers R ⊂ [0, 1], the set R ≥λ := {r | r = i r i ≥ λ, r i ∈ R} must be finite, as in any sequence p 1 p 2 . . . of p i ∈ R, only at most m = log λ (max R) values can be < 1 and such that the product of the sequence remains ≥ λ. In our case, let R be the set of distinct probabilities assigned to edges (including the initial edges) in A. As every finite run prefix by definition has the probability given by the product of the edge probabilities, this implies the statement.

If there is just one accepting run (i.e., the automaton is unambiguous), one can easily construct a nondeterministic automaton that guesses an accepting run and tracks it along with its probability value, of which there are only finitely many above the threshold. In the case that there are multiple accepting runs, for acceptance only the sum of their probabilities matters. As individual runs can in principle have arbitrarily small probability values, it is not obvious that the same approach (tracking a set of runs) can work. Determining a suitable cut-off point is not as simple, because it is not apparent when a single run becomes so improbable that it does not matter among the others. However, we will now show that such a cut-off point must exist: Lemma 4. Let A be a PBA, λ ∈]0, 1] a threshold and k ∈ N. There exists ε k ∈ ]0, λ] such that for all sets

Proof. We prove this by induction on the number of runs k. For k = 1, i.e. a single run prefix, let V ≥λ be the finite (by Lemma 3) set of different probability values ≥ λ and let E be the set of distinct probabilities in the automaton A. Then clearly v max,<λ := max{a · b | a · b < λ, a ∈ V ≥λ , b ∈ E} is the largest probability value < λ that can correspond to a finite run prefix in A. Hence, we can just pick an ε 1 < λ − v max,<λ and immediately get that for any run prefix with probability v < λ, we have that v ≤ v max,<λ < λ − ε 1 . Now assume the statement holds for all sets with at most k run prefixes. Let R t be a set of k + 1 of different run prefixes of the same length such that Pr(R t ) < λ and let ε := ε k . Then we know that for every subset S of at most k runs of R t we have Pr(S) < λ − ε. Also, every single run prefix can by Lemma 3 have one of only finitely many probability values in V ≥ε that are ≥ ε and there exists a value v max,<ε denoting the largest possible probability value < ε that a single run prefix can have.

If there exists a run prefix ρ ∈ R t with probability value v < ε, then we know that Pr(R t ) = Pr(R t \ {ρ}) + v < (λ − ε) + v max,<ε < λ. If every run in R t has a probability value ≥ ε, then every run prefix in R t has as probability one of the values in V ≥ε . Consider all sums of k values from V ≥ε , which are finitely many, and pick the largest sum s which is < λ. Choose ε k+1 such that ε k+1 < min(ε − v max,<ε , λ − s) to account for both cases.

From this we can derive the following characterization of languages accepted by finitely ambiguous PBA under threshold semantics:

Lemma 5. Let A be a k-ambiguous PBA and λ ∈]0, 1] a threshold. There exists an ε ∈ ]0, λ] such that for all w ∈ Σ ω : w ∈ L >λ (A) iff there exists a set R of limit-deterministic accepting runs of A on w with Pr(R) > λ, Pr(S) ≤ λ for all S ⊂ R and at most one run ρ ∈ R with Pr(ρ) < ε.

Proof. Clearly (⇐) holds, as then w is accepted with probability ≥ Pr(R) > λ. We now show (⇒). In a finitely ambiguous PBA there are only finitely many different accepting runs on each word. Furthermore, as after finite time all accepting runs have separated and each accepting run that visits forks infinitely often has probability 0, accepting runs that visit forks infinitely often do not contribute positively to the acceptance probability and thus can be ignored. Hence, if w ∈ L >λ (A), there is a number of accepting runs that eventually all become deterministic and each such run has a positive probability, which must in total be > λ.

Let R be a set of different limit-deterministic accepting runs of A on w such that Pr(R) > λ and Pr(S) ≤ λ for all S ⊂ R. As there are only finitely many accepting runs, such a set R must exist. Furthermore, notice that each limitdeterministic run has a finite prefix which has the same probability as the whole run, so there exists a time t such that the probability of the set of all different prefixes of runs in R of length t is exactly Pr(R), so that Lemma 4 applies. Now pick an ε := ε k given by Lemma 4. We claim that at most one run ρ ∈ R can have a probability less than ε. If there is no such run in R, we are done. Otherwise let ρ be a run with Pr(ρ) =: p < ε and notice that by choice of R, we have that Pr(R \ {ρ}) =: s ≤ λ. It cannot be the case that s < λ, as then by Lemma 4 we have s < λ − ε, which implies that Pr(R) = s + p < λ, which is a contradiction. Hence, now assume that s = λ. But then, if there is any ρ = ρ ∈ R such that Pr(ρ ) =: p < ε, by the same argument we get the contradiction that s − p < λ − ε and hence s < λ. Therefore, no other run in R can have a probability < ε.

Theorem 4. L >λ (A) is regular for each k-ambiguous PBA A and λ ∈]0, 1[. Proof (sketch). We use the characterization of Lemma 5 to construct a generalized Büchi automaton accepting L >λ (A). Intuitively, the new automaton just guesses at most k different runs of A and verifies that the guessed runs are limitdeterministic and accepting. The automaton additionally tracks the probability of the runs over time, to determine whether the individual runs and their sum have enough "weight". The automaton rejects when the total probability of the guessed runs is ≤ λ, one of the runs goes into the rejecting sink q rej or a run does not see accepting states infinitely often.

By Lemma 5 we only need to consider sets of runs with at most one run that has a probability < ε, where ε := ε k is given by Lemma 4. For this single run we also do not need to track the exact probability value, as its only purpose is to witness that the acceptance probability is strictly greater than λ, whereas all other runs must have one of the finitely many different probabilities which are ≥ ε and must sum to λ.

This generalizes the corresponding result for PFA [12, Theorem 3] . The proof in [12] uses similar concepts, though a rather different presentation. In the setting of infinite words we additionally have to deal with a single run that has arbitrarily low probability, and we have to ensure that this probability remains positive.

After seeing that finitely ambiguous PBA retain regularity, we show that this is the best we can do under threshold semantics: Corollary 3. There are polynomially ambiguous PBA A, that is, with an IDA pattern and no EDA, IDA F patterns, such that L >λ (A) is not regular even for rational thresholds λ ∈]0, 1[.

Follows from the fact that the PWA A from Figure 2(a) , which recognizes a non-regular language (and is used to show Proposition 6), has just an IDA pattern in the underlying NBA, but no EDA or IDA F patterns.

This completes our characterization of languages which are recognized by PBA that are restricted by forbidden ambiguity patterns, so that we can state our main result of this section (see Figure 1 for a visualization):

Theorem 5. The following results hold about PBA with restricted ambiguity:

Proof. The statements follow from the following inclusion chains: 

In this section, we state some upper and lower bounds on the complexity for deciding emptiness and universality for PBA with restricted ambiguity, derived from the characterizations and constructions presented above. Theorem 6. 1. the emptiness problem for ℵ 0 -PBA >0 is in NL 2. the universality problem for ℵ 0 -PBA >0 is in PSPACE 3. the universality problem for at most exp. ambiguous or flat PBA =1 is in NL Proof. (1. + 2.) : By Theorem 2 the languages of ℵ 0 -PBA >0 are regular. The construction of an NBA just uses two copies of the given PBA. For emptiness, it thus suffices to guess an accepted ultimately periodic word and verify that it is accepted by the NBA, which can be done in NL. Since universality for NBA in in PSPACE [21] , we also obtain (2.).

(3.): If the automaton is at most exponentially ambiguous, there are only finitely many accepting runs on each word and as we know by Lemma 2 that w ∈ L =1 (A) iff all runs are accepting, it suffices to guess a rejecting run in A , which implies that the ultimately periodic word w labelling that run can not be in L =1 (A). If the automaton is flat, then we know that for each rejected word there must exist a limit-deterministic rejecting run in the underlying NBA, which we also can guess.

Type regular? Emptiness Observe that ℵ 0 -PBA >0 subsume HPBA >0 and the union of flat PBA =1 and exp. ambiguous PBA =1 subsumes HPBA =1 , while preserving the same complexity of the emptiness and universality problems. A summary of the main results from Theorem 5 and Theorem 6 is presented in Table 1 .

We conclude with an observation relevant to the question about feasibility of PBA with restricted ambiguity for the purpose of application in e.g. modelchecking or synthesis.

Proposition 5 (Relationship to classical formalisms).

-There is a doubly-exponential lower bound for translation from LTL formula to countably ambiguous PBA with positive semantics. -There is an exponential lower bound for conversion from NBA to countably ambiguous PBA with positive semantics.

Proof. It is known [20, Theorem 2] that there is a doubly-exponential lower bound from LTL to LDBA. It is also known that LTL to NBA has an exponential lower bound (e.g. [5, Theorem 5 .42]), which implies an exponential lower bound from NBA to LDBA. By Theorem 2 there is a polynomial transformation from countably ambiguous PBA with positive semantics into LDBA, which together with the aforementioned bounds implies the claimed lower bounds.

In this section we investigate the class of probabilistic weak automata (PWA), establishing the relation between different classes defined by PWA as shown in Figure 3 (see also the description of our contribution in the introduction).

As a first remark, notice that PWA can be "complemented" by inverting accepting and rejecting states and switching between dual semantics, e.g., for a PWA A we have L >0 (A) = L =1 (A), where A is just A with inverted accepting state set F = Q \ F .

Since the overarching theme of this paper is trying to find regular subclasses of PBA, we will next establish the following result, showing that there is no hope to find a complete syntactical characterization of regularity in PBA: Theorem 7. The regularity of PWA (and therefore of PBA) under positive, almost-sure and threshold semantics is an undecidable problem.

Proof (sketch). Since L >λ (PWA) ⊇ L >0 (PWA) (see Theorem 10), L >0 (PWA) = L =1 (PWA), and the class of regular ω-languages is closed under complement, it suffices to show the statement for PWA =1 . We do this by reduction from the value 1 problem for PFA, which is the question whether for each ε > 0 there exists a word accepted by the PFA with probability > 1−ε. This problem is known to be undecidable [13] . We consider a slightly modified version of the problem by assuming that no word is accepted with probability 1 by the given PFA. The problem remains undecidable under this assumption, because one can check if a PFA accepts a finite word with probability 1 by a simple subset construction.

Given some PFA A, we construct a PWA =1 B by taking a copy of A and extending it with a new symbol # such that from accepting states of A the automaton is "restarted" on #, while from non-accepting states # leads into a L >λ (PBA) new part which ensures that infinitely many # are seen and contains the only accepting state of B. We show that L =1 (B) = (Σ * #) ω \ R, where R = ∅ if A does not have value 1, and R is non-empty but does not contain an ultimately periodic word, otherwise. This implies that L =1 (B) is regular iff A does not have value 1.

We will now show that PWA with almost-sure semantics are as expressive as PBA, and with positive semantics as expressive as PCA. Proof (sketch). It suffices to show the first statement. The second then follows by duality, i.e., we can interpret a PBA =1 A recognizing L as a PCA >0 recognizing L and just apply the construction to get a PWA >0 B for L, such that B (with inverted accepting and rejecting states) is a PWA =1 for L. In the first statement the ⊆ inclusion is trivial, hence we only need to show that L >0 (PCA) ⊆ L >0 (PWA).

We construct a PWA >0 consisting of two copies of the original PCA >0 , a guess copy and a verify copy. In the first copy, the automaton can guess that no final states will be visited anymore and switch to the verify copy, which is accepting, but where all transitions into final states are redirected to a rejecting sink.

Next, we show that languages that can be accepted by both, a PWA with almost-sure semantics, and by a PWA with positive semantics, are regular and can be accepted by a DWA. For the proof, we rely on a characterization of DWA languages in terms of the Myhill-Nerode equivalence relation from [22] . So we first define this equivalence, and show that languages defined by PBA with positive semantics have only finitely many equivalence classes. Then we come back to the result for PWA.

For L ⊆ Σ ω , define the Myhill-Nerode equivalence relation ∼ L ⊆ Σ * × Σ * by u ∼ L v iff uw ∈ L ⇔ vw ∈ L for all w ∈ Σ ω . Then the following holds:

Lemma 6 (Finitely many Myhill-Nerode classes). Languages in L >0 (PBA) have finitely many Myhill-Nerode equivalence classes.

Proof. Let A = (Q, Σ, δ, µ 0 , F ) be some PBA >0 and u ∈ Σ * some word and let µ u := δ * (µ 0 , u) be the probability distribution on states of A after reading u. Pick any w ∈ Σ ω and notice that uw ∈ L = L >0 (A) iff there exists some state q such that µ u (q) > 0 and the probability to accept w from q is also > 0, as the product of two positive numbers clearly still is positive. But then, for any two u, v ∈ Σ * we have that whenever µ u (q) > 0 ⇔ µ v (q) > 0 for all q, then we have uw ∈ L ⇔ vw ∈ L for all w ∈ Σ ω by the reasoning above, as the exact value does not matter for acceptance, and therefore u ∼ L v. But as there are only at most 2 |Q| different possibilities how values in a distribution µ over Q are either equal to or greater than 0, this is an upper bound on the number of different equivalence classes. So let L be a language from L >0 (PWA) ∩ L =1 (PWA). We want to show that L can be accepted by a DWA. We use the following characterization of DWA languages [22, Theorem 21] : The DWA languages are precisely the languages with finitely many Myhill-Nerode classes in the class G δ ∩ F σ in the Borel hierarchy. The classes G δ and F σ of the Borel hierarchy are often also referred to as Π 2 and Σ 2 . We do not introduce the details of this hierarchy here, but rather refer the reader not familiar with these concepts to [22] and [8] .

We already know that L has finitely many Myhill-Nerode classes by Lemma 6 (as PWA are special cases of PBA). It remains to show that L is in the class G δ ∩ F σ . It is known that PBA with almost-sure semantics define languages in G δ [8, Lemma 3.2] . Hence L is in G δ . Since L is accepted by a PWA with positive semantics, the complement of L is accepted by a PWA with almostsure semantics (as noted at the beginning of this section). We obtain that the complement of L is also in G δ again by [8, Lemma 3.2] . This means that L is in F σ , which by definition consists of the complements of languages from G δ .

Concluding this section, we show a result about weak automata with threshold semantics, which (not surprisingly) turn out to be even more expressive. A careful analysis of the PWA A in Fig. 2(a) shows the following result: Proposition 6. For all thresholds λ ∈]0, 1[ there exists a PWA A such that L >λ (A) is not regular and not P BA >0 recognizable.

Putting things together, we can say the following about threshold PWA, establishing the relation of L >λ (PWA) to the other classes in Figure 3 :

Theorem 10 (Expressive power of threshold PWA). Proof. (1.) L >0 (PWA) ⊆ L >0 (PBA) by definition and L >0 (PWA) ⊆ L >λ (PWA), as any PWA >0 can be modified to a PWA >λ recognizing the same language by just adding an additional accepting sink and modifying the initial distribution, just as described in [4, Lemma 4.16] for general PBA.

(2.) By Proposition 6, there are languages recognized by PWA >λ that cannot be recognized with PBA >0 . To show that there are languages accepted by PBA >0 that cannot be accepted by PWA >λ we can give a topological characterization of languages accepted by PWA by a simple adaptation of [8, Lemma 3.2] and combine it with other results shown in [8] to show that there are PBA >0 that accept languages that cannot be accepted by PWA >λ .

(3.) The first inclusion was discussed in (1.), the strictness follows from Proposition 6 and the fact that L >0 (PWA) = L =1 (PBA) ⊂ BCl(L =1 (PBA)) = L >0 (PBA), where the first equality is Theorem 8 and the second is shown in [8] . The second inclusion of the statement follows from (2.) and the fact from [4] that L >0 (PBA) ⊂ L >λ (PBA).

For the dual class L ≥λ (PWA) one can show symmetric results that correspond to statements (1.) and (2.) above, for statement (3.) however there is no proof yet for the strictness of the inclusions (especially the second one), whereas the statement L =1 (PWA) ⊆ L ≥λ (PWA) ⊆ L ≥λ (PBA) is obvious. We leave this issue as an open question. Another interesting question is whether > λ is equivalent to < λ (or dually for ≥ / ≤).

By using notions from ambiguity in classical Büchi automata, we were able to extend the set of easily (syntactically) checkable PBA which are regular under some or all of the usual semantics. As a consequence, ambiguity appears to be an even more interesting notion in the probabilistic setting, as here it in fact has consequences for the expressive power of automata, whereas in the classical setting there is no such effect. Our results also indicate that to get non-regularity, one requires the use of certain structural patterns which at least imply the existence of the ambiguity patterns that we used. It is an open question whether it is possible to identify more fine-grained syntactic characterizations, patterns or easily checkable properties which are just over-approximated by the ambiguity patterns and are required for non-regularity.

On decision problems for probabilistic büchi automata

Probabilistic automata over infinite words: Expressiveness, efficiency, and decidability

Recognizing omega-regular languages with probabilistic automata

Probabilistic ω-automata

Principles of model checking

An effective decision procedure for linear arithmetic over the integers and reals

On a decision method in restricted second order arithmetic

Power of randomization in automata on infinite strings

Probabilistic Büchi automata with non-extremal acceptance thresholds

Emptiness under isolation and the value problem for hierarchical probabilistic automata

Decidable and expressive classes of probabilistic automata

Probabilistic automata of bounded ambiguity

Probabilistic automata on finite words: Decidable and undecidable problems

Decision problems for ω-automata

On flatness for 2-dimensional vector addition systems with states

On finitely ambiguous Büchi automata

Alternating automata and logics over infinite words

Probabilistic automata

Complementation of finitely ambiguous Büchi automata

Limit-deterministic Büchi automata for linear temporal logic

The complementation problem for Büchi automata with applications to temporal logic (extended abstract)

Finite-state ω-languages

Automata on infinite objects

Languages, automata, and logic

On the degree of ambiguity of finite automata

), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use