Generalizations of Runs and Patterns Distributions for Sequences of Binary Trials
Spiros D. Dafnis, Frosso S. Makri, Markos V. Koutras
Methodol Comput Appl Probab, 2020-07-27. DOI: 10.1007/s11009-020-09810-0

In the present paper we study the distributions of families of patterns which generalize the runs and patterns distributions extensively examined in the literature during the last decades. In our analysis we assume that the sequence of outcomes under investigation consists of independent, but not necessarily identically distributed, trials. An illustration is also provided of how our new results could be exploited to enrich a new system, still under research, related to patients' weaning from mechanical ventilation.

$E_1$: a string of outcomes containing $k$ $S$s in which each pair of successive $S$s may be interrupted by at most $r$ consecutive $F$s (the typical element of $E_1$ is of the form $SF^{l_1}SF^{l_2}S \cdots SF^{l_{k-1}}S$, where $F^{l_i}$ denotes a run of $l_i$ consecutive $F$s and $0 \leq l_i \leq r$ for all $i = 1, 2, \ldots, k-1$).

In several applications the study of the complement of $E_1$ is of significant interest. Introducing the following compound pattern $E_2$:

$E_2$: a string of outcomes containing $k$ $S$s in which at least one pair of successive $S$s is interrupted by at least $r$ consecutive $F$s (the typical element of $E_2$ is of the form $SF^{l_1}SF^{l_2}S \cdots SF^{l_{k-1}}S$, where $l_i \in \mathbb{N}$ and $l_i \geq r$ for at least one $i$, $i = 1, 2, \ldots, k-1$),

it is clear that $E_2$ is the complement of $E_1$, with $r$ replaced by $r-1$. Note that the cardinality of $E_1$ equals $|E_1| = (r+1)^{k-1}$, while $E_2$ is a compound pattern containing infinitely many simple patterns.

The patterns contained in $E_1$ and $E_2$ contain exactly $k$ successes. In order to allow for more than $k$ successes in the patterns under enumeration, we shall also consider the next two families, generated by adding a number of $S$s at the end of the typical elements of $E_1$ and $E_2$. More specifically, the typical elements of $E_3$ and $E_4$ are of the form $SF^{l_1}SF^{l_2}S \cdots SF^{l_{k-1}}S S^{l}$, $l \geq 0$, with $0 \leq l_i \leq r$ for all $i$ in the case of $E_3$, and with $l_i \geq r$ for at least one $i$, $i = 1, 2, \ldots, k-1$, in the case of $E_4$. A phrasal description of $E_3$ and $E_4$ follows:

$E_3$: a string of outcomes containing at least $k$ $S$s in which each pair of successive $S$s in the first $k$ $S$s may be interrupted by at most $r$ consecutive $F$s,
$E_4$: a string of outcomes containing at least $k$ $S$s in which at least one pair of successive $S$s in the first $k$ $S$s is interrupted by at least $r$ consecutive $F$s.

Each of the compound patterns $E_3$ and $E_4$ contains infinitely many simple patterns. To make the definitions of $E_3$ and $E_4$ clearer, we should note that, in both cases, when enumerating a simple pattern the occurrence of an $F$ is permitted only among the first $k$ $S$s and not among the $S$s which possibly follow. A new pattern may start counting from scratch only after the appearance of an $F$.

Let us denote by $N^{(i)}_{n,k,r}$ the number of non-overlapping occurrences of patterns included in $E_i$ ($i = 1, 2, 3, 4$) in the sequence of outcomes $Z_1, Z_2, \ldots, Z_n$ ($n \geq k$). As already mentioned, the compound pattern $E_1$ was first introduced by Dafnis et al. (2019), who studied the distribution of $N^{(1)}_{n,k,r}$ in the case of independent trials, motivated by a reliability evaluation problem in a consecutive-type system. Special cases of the compound patterns $E_1$, $E_2$ have also been considered in Dafnis et al. (2012), where the distributions of $N^{(i)}_{n,2,r}$ ($i = 1, 2$) were obtained when $Z_1, Z_2, \ldots$ is a sequence of independent and identically distributed (i.i.d.) trials; for related results see, among others, Makri and Psillakis (2012, 2017).
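Before developing the exact distribution theory, it is instructive to see the counting rules in action. The following Python sketch gives brute-force, non-overlapping counters for $E_1$- and $E_2$-type patterns; it reflects our reading of the definitions above (counting restarts after each completed pattern and, for $E_1$, after a failure run longer than $r$) and is offered only as a cross-check on small examples, not as the construction used in the paper.

    # Brute-force, non-overlapping counters for E1- and E2-type patterns.
    # z is a 0/1 sequence, k the number of S's, r the gap parameter.
    # These scanners mirror the phrasal definitions above; they are a
    # sanity-check sketch, not the paper's Markov chain construction.

    def count_E1(z, k, r):
        count, j, gap = 0, 0, 0          # j = S's collected so far
        for zt in z:
            if zt == 1:
                j, gap = j + 1, 0
                if j == k:               # k S's, every gap was <= r
                    count, j = count + 1, 0
            else:
                gap += 1
                if gap > r:              # gap too long: candidate dies
                    j = 0
        return count

    def count_E2(z, k, r):
        count, j, gap, qualified = 0, 0, 0, False
        for zt in z:
            if zt == 1:
                j, gap = j + 1, 0
                if j == k:
                    if qualified:        # some gap of >= r F's has occurred
                        count, j, qualified = count + 1, 0, False
                    else:
                        j = k - 1        # keep last k-1 S's as a new prefix
            else:
                gap += 1
                if j > 0 and gap == r:
                    qualified = True
        return count

    # Example of Section 3 (k = 3, r = 2): S F S S S S F S F F F S S F F S S S S
    z = [1,0,1,1,1,1,0,1,0,0,0,1,1,0,0,1,1,1,1]
    print(count_E1(z, 3, 2), count_E2(z, 3, 2))   # expected: 4 and 2

Analogous scanners for $E_3$ and $E_4$ would, in addition, absorb the trailing $S$s and close an occurrence at the first subsequent $F$.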
It is noteworthy that, when $r = 0$, some of the aforementioned random variables reduce to well-known success run enumerating random variables. Philippou and Makri (1986), Hirano (1986) and Goldstein (1990) have carried out extensive studies of the random variables $N^{(1)}_{n,k,0}$ and $N^{(3)}_{n,k,0}$, which enumerate the number of occurrences of non-overlapping success runs of length $k$ and the number of success runs of length at least $k$, respectively. If the underlying sequence $Z_1, Z_2, \ldots$ of binary outcomes comes from Bernoulli trials, the resulting distributions are called binomial distributions of order $k$. A convenient reference for the theory and applications of run-related distributions is Balakrishnan and Koutras (2002).

A very popular method for deriving the distribution of success run occurrences is to establish an appropriate Markov chain and express the probability mass function (pmf) of the variable of interest through the transition probability matrix of the chain. The definitive reference for this approach is the monograph of Fu and Lou (2003). For $k = 2$, some of the random variables introduced in the present article reduce to random variables enumerating patterns. Dafnis et al. (2012) studied the distributions of $N^{(i)}_{n,2,r}$ ($i = 1, 2$) employing a Markov chain imbedding technique. These distributions were also studied earlier by Sen and Goyal (2004), for $r \geq 1$, with the aid of combinatorial methods. During the last decades, many publications have dealt with success run distributions and their generalizations, including scan and pattern distributions; see, for example, Antzoulakos et al. (2003), Bersimis et al. (2014), Chatziconstantinidis et al. (2000), Demir and Eryilmaz (2010), Eryilmaz (2005), Fu and Koutras (1994b), Inoue and Aki (2003), Makri and Philippou (2005), Makri et al. (2007), Makri and Psillakis (2011, 2015), Mytalas and Zazanis (2013), Yalcin and Eryilmaz (2014).

In the present paper we employ the Markov chain imbedding technique to study the distributions of $N^{(i)}_{n,k,r}$ ($i = 2, 3, 4$) in the case of independent, but not necessarily identically distributed, trials. An illustration is also provided of how the developed results can be adapted to Markov dependent trials. Our results could be used to establish flexible decision criteria for monitoring a patient's mechanical support, thereby enriching the tools that can be employed in biomedical engineering for this purpose (see Buliev et al. (2006), Dermitzakis et al. (2008), Dojat et al. (2000), Esteban et al. (1994)).

Throughout the paper we denote by $[x]$ the greatest integer which is less than or equal to $x$, and by $\delta_{i,j}$ the Kronecker delta, i.e. $\delta_{i,j} = 1$ if $i = j$ and $\delta_{i,j} = 0$ otherwise.

Let $Z_1, Z_2, \ldots$ be a sequence of binary outcomes taking on the values 1 (success, $S$) or 0 (failure, $F$), and denote by $E$ any pattern (simple or compound) with a prespecified composition. The number of appearances of $E$ in the sequence $Z_1, Z_2, \ldots, Z_n$ ($n$ a fixed integer) will be denoted by $X_n$. In a series of papers (Antzoulakos et al. (2003), Dafnis et al. (2012), Fu and Koutras (1994a), Koutras (2003), Koutras and Alexandrou (1995)) a Markov chain imbedding method was developed for the study of the exact distribution of enumerating random variables defined on sequences of binary (or multistate) trials. Since we are going to make use of this approach in the forthcoming sections, we shall first review its key features, introducing at the same time all necessary terminology.
We first introduce the notion of a Markov chain imbeddable variable of binomial type. Let $X_n$ ($n$ a non-negative integer) be a non-negative finite integer-valued random variable and let $\ell_n = \sup\{x : \Pr(X_n = x) > 0\}$ be its upper end point.

Definition 1 The random variable $X_n$ will be called a Markov chain imbeddable variable of binomial type (MVB) if (a) there exists a Markov chain $\{Y_t, t \geq 0\}$ defined on a discrete state space $\Omega$ which can be partitioned as $\Omega = \bigcup_{x} C_x$, $C_x = \{c_{x,0}, c_{x,1}, \ldots, c_{x,s-1}\}$, (b) $\Pr(Y_t \in C_y \mid Y_{t-1} \in C_x) = 0$ for all $y \neq x, x+1$, and (c) $\Pr(X_n = x) = \Pr(Y_n \in C_x)$, $x = 0, 1, \ldots, \ell_n$.

It follows from condition (b) of Definition 1 that the only feasible transitions for $\{Y_t, t \geq 0\}$ are the ones carried out within the same substate set $C_x$ or from substate set $C_x$ to $C_{x+1}$. Those transitions give birth to the next two $s \times s$ transition probability matrices
$A_t(x) = \big(\Pr(Y_t = c_{x,j} \mid Y_{t-1} = c_{x,i})\big)_{s \times s}$, $B_t(x) = \big(\Pr(Y_t = c_{x+1,j} \mid Y_{t-1} = c_{x,i})\big)_{s \times s}$.
We denote by $f_t(x) = (\Pr(Y_t = c_{x,0}), \ldots, \Pr(Y_t = c_{x,s-1}))$ the probability vectors of the chain and by $\pi_x = f_0(x)$, $0 \leq x \leq \ell_n$, the initial probabilities of the Markov chain. In most of the applications where MVBs have been exploited, the convention $\Pr(X_0 = 0) = 1$ can be used; an immediate consequence of it is that $\pi_0 \mathbf{1}' = 1$ and $\pi_x \mathbf{1}' = 0$ for $1 \leq x \leq \ell_n$ ($\mathbf{1} = (1, 1, \ldots, 1)$ is the row vector of $\mathbb{R}^s$ with all its entries being 1). The following lemma (see Koutras and Alexandrou (1995)) provides recursive relations for the probability vectors $f_t(x)$.

Lemma 2.1 For an MVB $X_n$, the sequence $f_t(x)$, $t = 1, 2, \ldots, n$, satisfies the recurrence
$f_t(x) = f_{t-1}(x) A_t(x) + f_{t-1}(x-1) B_t(x-1)$, $0 \leq x \leq \ell_n$,
with initial conditions $f_0(x) = \pi_x$, $0 \leq x \leq \ell_n$.

Manifestly, the pmf $f_n(x)$ of $X_n$ can be expressed as
$f_n(x) = f_n(x) \mathbf{1}'$, $x = 0, 1, \ldots, \ell_n$. (2.1)

Next, let $\varphi_n(z)$ and $\Phi(z, w)$ denote the single and double generating functions
$\varphi_n(z) = \sum_{x=0}^{\ell_n} f_n(x) z^x$, $\Phi(z, w) = \sum_{n=0}^{\infty} \varphi_n(z) w^n$.
In the case of a homogeneous MVB (i.e. when $A_t(x) = A$ and $B_t(x) = B$ for all $x$ and $t$), we have the next theorem (see Koutras and Alexandrou (1995)).

Theorem 2.1 If $A_t(x) = A$ and $B_t(x) = B$ for all $t \geq 1$ and $x \geq 0$, then the double generating function of $X_n$ can be expressed as
$\Phi(z, w) = \pi_0 [I - w(A + zB)]^{-1} \mathbf{1}'$.

In this section we shall deduce the exact distributions of $N^{(i)}_{n,k,r}$, $i = 2, 3, 4$. Before we proceed to the development of our results, we shall elucidate the structure of the enumerating random variables by a simple example. Consider $k = 3$ and $r = 2$. Then, denoting by $F^l$ a run of $l$ consecutive $F$s,
$E_1 = \{SSS, SFSS, SFSFS, SFSFFS, SFFSS, SFFSFS, SFFSFFS, SSFS, SSFFS\}$,
while $E_2$ can be represented as a union of disjoint pattern sets as follows:
$E_2 = \{SF^{l_1}SF^{l_2}S : l_1 \geq 2, l_2 \geq 0\} \cup \{SF^{l_1}SF^{l_2}S : 0 \leq l_1 \leq 1, l_2 \geq 2\}$.
It should be mentioned that the aforementioned sets contain all the patterns generated for the indicated ranges of $l_1, l_2$. Similarly, the elements of $E_3$ and $E_4$ can be represented as
$E_3 = \{SF^{l_1}SF^{l_2}SS^{l} : 0 \leq l_1, l_2 \leq 2,\ l \geq 0\}$, $E_4 = \{SF^{l_1}SF^{l_2}SS^{l} : l_1 \geq 2 \text{ or } l_2 \geq 2,\ l \geq 0\}$.
The remark stated above for $E_2$ applies to the decomposition of $E_4$ as well, i.e. its disjoint components contain all the patterns produced by varying $l, l_1, l_2$ over the indicated ranges.

Should one be interested in the random variables $N^{(i)}_{19,3,2}$, $i = 1, 2, 3, 4$, in the following sequence of $n = 19$ binary outcomes: $SFSSSSFSFFFSSFFSSSS$, it can be readily checked that $N^{(1)}_{19,3,2} = 4$, $N^{(2)}_{19,3,2} = 2$, $N^{(3)}_{19,3,2} = 2$, $N^{(4)}_{19,3,2} = 1$.

As already stated, Dafnis et al. (2019) studied the distribution of $N^{(1)}_{n,k,r}$ to obtain the reliability of a consecutive-type system. The following theorem and corollary originate from that work and provide expressions for the pmf of $N^{(1)}_{n,k,r}$ for $r \geq 1$ and $r = 0$, respectively.

Theorem 3.1 The pmf of $N^{(1)}_{n,k,r}$ ($r \geq 1$) is given by (2.1), where $f_t(x)$ are probability vectors satisfying the recursive relations of Lemma 2.1, $\ell_n = [n/k]$, $s = (k-1)r + k$, $A_t$ is an $s \times s$ matrix which has all its entries 0 except for certain entries, equal to $p_t$ or $q_t$, whose explicit positions are given in Dafnis et al. (2019), and $B_t$ is an $s \times s$ matrix with all its elements vanishing except for the first column, which is given by $\mathbf{1}' - A_t \mathbf{1}'$.
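Lemma 2.1 together with (2.1) translates directly into an iterative algorithm. The sketch below (Python with numpy) evaluates the pmf of a homogeneous MVB from user-supplied $\pi_0$, $A$, $B$; the routine name and interface are our own. As a usage example, it builds the matrices of Corollary 1 below for the binomial distribution of order $k$.

    import numpy as np

    def mvb_pmf(pi0, A, B, n, ell):
        # pmf of a homogeneous MVB via Lemma 2.1 and (2.1):
        #   f_t(x) = f_{t-1}(x) A + f_{t-1}(x-1) B,  f_n(x) = f_n(x) 1'.
        # pi0: initial row vector (Pr(X_0 = 0) = 1 assumed); ell: upper end point.
        s = len(pi0)
        f = [np.array(pi0, dtype=float)] + [np.zeros(s) for _ in range(ell)]
        for _ in range(n):
            # sweep x downwards so that f[x-1] still holds the previous step
            for x in range(ell, -1, -1):
                f[x] = f[x] @ A + (f[x - 1] @ B if x > 0 else 0.0)
        return np.array([fx.sum() for fx in f])

    # Usage: N^(1)_{n,k,0}, the binomial distribution of order k
    # (non-zero entries as in Corollary 1 below; i.i.d. trials).
    k, p, n = 3, 0.5, 10
    q = 1 - p
    A = np.zeros((k, k)); B = np.zeros((k, k))
    A[:, 0] = q                     # a failure resets the current success run
    for i in range(k - 1):
        A[i, i + 1] = p             # a success extends the run
    B[k - 1, 0] = p                 # the k-th success completes a pattern
    print(mvb_pmf([1.0] + [0.0] * (k - 1), A, B, n, n // k))  # masses sum to 1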
Corollary 1 The pmf of $N^{(1)}_{n,k,0}$ is given by (2.1), where $\ell_n = [n/k]$, $s = k$, $A_t$ is a $k \times k$ matrix which has all its entries 0 except for the entries $(i, 1)$, $i = 1, 2, \ldots, k$, which are all equal to $q_t$, and $(i, i+1)$, $i = 1, 2, \ldots, k-1$, which are all equal to $p_t$, and $B_t$ is a $k \times k$ matrix that has all its entries 0 except for the entry $(k, 1)$, which equals $p_t$.

As already mentioned in the introduction, for $r = 0$ the distribution of the random variable $N^{(1)}_{n,k,0}$ reduces to the much studied binomial distribution of order $k$ (see e.g. Fu and Koutras (1994b), Philippou and Makri (1986), or the monograph by Balakrishnan and Koutras (2002)). In addition, the special case $k = 2$, i.e. the random variable $N^{(1)}_{n,2,r}$, has been recently studied by Dafnis et al. (2012) (see also Sen and Goyal (2004)).

In order to give a simple example in the case of identically distributed trials, one may calculate $P(N^{(1)}_{5,3,2} = x)$, $x = 0, 1$, via the probability vectors of Lemma 2.1, in exactly the same manner as in the worked example of Section 3.2 below. An alternative method for the calculation of the pmf $f_n(x)$ of $N^{(1)}_{n,3,2}$ is provided by deriving recurrence relations from its probability generating function (pgf), which can be easily obtained by Theorem 2.1. Applying Theorem 3.1 for the special case $k = 3$, $r = 2$, we can obtain the entries of the matrices $A$, $B$, and a straightforward application of Theorem 2.1 yields an explicit expression (3.1) for the double generating function of $N^{(1)}_{n,3,2}$. The latter leads to the following expression for the pgf of the random variable $N^{(1)}_{n,3,2}$.

Corollary 2 The pgf $\varphi_n(z)$ of $N^{(1)}_{n,3,2}$ satisfies the recursive scheme
$\varphi_n(z) = q\varphi_{n-1}(z) + p^3 z \varphi_{n-3}(z) + pq\Big(pq\big(q^2(q + pz)\varphi_{n-7}(z) + q(q + 2pz)\varphi_{n-6}(z) + (q + 3pz)\varphi_{n-5}(z)\big) + (q^2 + 2p^2 z)\varphi_{n-4}(z)\Big)$, $n \geq 7$,
with the appropriate initial conditions for $n = 0, 1, \ldots, 6$.

The expression described in Corollary 2 can be exploited for establishing very efficient computational schemes for the calculation of the pmf $f_n(x) = P(N^{(1)}_{n,3,2} = x)$ and the moments $\mu_{n,m} = E[(N^{(1)}_{n,3,2})^m]$, $m = 1, 2, \ldots$, of $N^{(1)}_{n,3,2}$. These schemes are provided in the following two corollaries.

Corollary 3 The pmf $f_n(x)$ of $N^{(1)}_{n,3,2}$ satisfies the recursive scheme resulting from Corollary 2 upon extraction of coefficients, with initial conditions $f_n(x) = \delta_{x,0}$ for $n = 0, 1, 2$.

Proof Follows immediately from Corollary 2 by replacing $\varphi_n(z) = \sum_{x=0}^{\infty} f_n(x) z^x$ and picking up the coefficients of $z^x$ on both sides of the power series expansions that result.

Needless to say, applying Corollary 2 for $n = 5$, we can easily deduce the expressions for $P(N^{(1)}_{5,3,2} = x)$, $x = 0, 1$, mentioned earlier in connection with the probability vector approach.

Corollary 4 The moments $\mu_{n,m}$, $m \geq 1$, of the random variable $N^{(1)}_{n,3,2}$ satisfy the recursive scheme resulting from Corollary 2, with initial conditions $\mu_{n,m} = 0$ for $n < 3$ and $m \geq 1$.

Proof The moment generating function $M(z)$ of $N^{(1)}_{n,3,2}$ can be expressed as $M(z) = \varphi_n(e^z)$. Therefore, replacing $z$ by $e^z$ in the recursive scheme provided by Corollary 2, we may easily derive a recursive scheme satisfied by $M(z)$. Recalling next that $\mu_{n,m}$ is the $m$-th derivative of $M(z)$ at $z = 0$, the result follows.

The next theorem is analogous to Theorem 3.1 and provides the structure of the matrices $A_t(x)$, $B_t(x)$ that permit the evaluation of the pmf of the random variable $N^{(2)}_{n,k,r}$ through the Markov chain imbedding method.

Theorem 3.2 The pmf of $N^{(2)}_{n,k,r}$ ($r \geq 1$) is given by (2.1), where $f_t(x)$ are probability vectors satisfying the recursive relations of Lemma 2.1, $\ell_n = [n/(k+r)]$, $s = (k-1)r + k$, $A_t$ is an $s \times s$ matrix which has all its entries 0 except for certain entries, equal to $p_t$ or $q_t$, determined by the transitions of the chain constructed in the proof, and $B_t$ is an $s \times s$ matrix with all its elements vanishing except for the first column, which is given by $\mathbf{1}' - A_t \mathbf{1}'$.

Proof According to Lemma 2.1, it suffices to prove that $N^{(2)}_{n,k,r}$ is an MVB. We set $\ell_n = [n/(k+r)]$ and introduce the state space $\Omega = \bigcup_{x=0}^{\ell_n} C_x$, where the $C_x$, $x = 0, 1, \ldots, \ell_n$, are disjoint subspaces with $|C_x| = (k-1)r + k$ elements, labelled as
$C_x = \{c_{x,0}, c_{x,1}, \ldots, c_{x,k-1}, c_{x,1^{(1)}}, \ldots, c_{x,1^{(r-1)}}, \ldots, c_{x,(k-1)^{(1)}}, \ldots, c_{x,(k-1)^{(r-1)}}, c_{x,L_1}, \ldots, c_{x,L_{k-1}}\}$.
We introduce next a Markov chain $\{Y_t, t \geq 0\}$ on $\Omega$ as follows: $Y_t = (x, i)$, if at the first $t$ outcomes $Z_1, Z_2, \ldots, Z_t$,
$x$ occurrences of patterns contained in $E_2$ have been observed, and
(a) $i = 0$, if (1) $x = 0$ and $\prod_{j=1}^{t}(1 - Z_j) = 1$ (no $S$s have appeared till the $t$-th outcome), or (2) $x \geq 1$, $Z_t = 0$ and no $S$s have appeared since the $k$-th $S$ of the $x$-th occurrence of a simple pattern contained in $E_2$, or (3) $x \geq 1$ and the $t$-th outcome is the $k$-th $S$ ($Z_t = 1$) of a simple pattern contained in $E_2$ (i.e. a simple pattern is completed at the $t$-th trial);
(b) $i = j$, $j = 1, 2, \ldots, k-1$, if the $t$-th outcome is the last $S$ ($Z_t = 1$) of a string of $j$ $S$s not interrupted by at least $r$ consecutive $F$s between any two successive $S$s;
(c) $i = j^{(m)}$, $j = 1, 2, \ldots, k-1$, $m = 1, 2, \ldots, r-1$, if the $t$-th outcome is the $m$-th consecutive $F$ following a string of $j$ $S$s not interrupted by at least $r$ consecutive $F$s between any two successive $S$s;
(d) $i = L_j$, $j = 1, 2, \ldots, k-1$, if a new simple pattern in $E_2$ will be completed at the next $k - j$ $S$s (whenever they occur).
Under this set-up, the random variable $N^{(2)}_{n,k,r}$ becomes an MVB with initial probability vector $\pi_0 = (1, 0, 0, \ldots, 0)_{1 \times ((k-1)r+k)}$, and matrices $A_t$ and $B_t$ having the entries described in the theorem. The result follows by Lemma 2.1 and Definition 1.

The principles of Theorem 3.2 can be better illustrated by an example. Let us consider the special case $k = 3$, $r = 2$, for which $s = (k-1)r + k = 7$ and the matrices $A_t$, $B_t$ of the Markov chain reduce to $7 \times 7$ matrices. In the special case of identically distributed trials ($p_t = p$ and $q_t = q$ for all $t$) we have $A_t = A$, $B_t = B$ for $t = 1, 2, \ldots$, and the calculation of $P(N^{(2)}_{5,3,2} = x)$, $x = 0, 1$, may be easily accomplished by a direct application of Lemma 2.1 as follows:
$f_0(0) = (1, 0, 0, 0, 0, 0, 0)$,
$f_1(0) = f_0(0) A = (q, p, 0, 0, 0, 0, 0)$,
$f_2(0) = f_1(0) A = (q^2, pq, p^2, pq, 0, 0, 0)$,
$f_3(0) = f_2(0) A = (q^3, pq^2, p^3 + 2p^2 q, pq^2, p^2 q, pq^2, 0)$,
$f_4(0) = f_3(0) A = (q^4, pq^3, p^4 + 3p^3 q + 2p^2 q^2, pq^3, p^3 q + 2p^2 q^2, 2pq^3, 2p^2 q^2)$,
$f_5(0) = f_4(0) A = (q^5, pq^4, p^5 + 4p^4 q + 4p^3 q^2 + 2p^2 q^3, pq^4, p^4 q + 3p^3 q^2 + 2p^2 q^3, 3pq^4, p^3 q^2 + 6p^2 q^3)$,
$f_1(1) = f_0(1) A + f_0(0) B = (0, 0, 0, 0, 0, 0, 0)$,
$f_2(1) = f_1(1) A + f_1(0) B = (0, 0, 0, 0, 0, 0, 0)$,
$f_3(1) = f_2(1) A + f_2(0) B = (0, 0, 0, 0, 0, 0, 0)$,
$f_4(1) = f_3(1) A + f_3(0) B = (0, 0, 0, 0, 0, 0, 0)$,
$f_5(1) = f_4(1) A + f_4(0) B = (2p^3 q^2, 0, 0, 0, 0, 0, 0)$.
Thus, $P(N^{(2)}_{5,3,2} = 0) = f_5(0) \mathbf{1}' = p^5 + 5p^4 q + 8p^3 q^2 + 10p^2 q^3 + 5pq^4 + q^5$ and $P(N^{(2)}_{5,3,2} = 1) = f_5(1) \mathbf{1}' = 2p^3 q^2$.

It is worth mentioning that, in the special case $k = 2$, the random variable $N^{(2)}_{n,2,r-2}$ reduces to the variable $N_{n,r}$ studied by Dafnis et al. (2012); see also Sen and Goyal (2004).

In this subsection we turn our attention to the random variable $N^{(3)}_{n,k,r}$. Theorem 3.3 provides the form of the matrices $A_t$, $B_t$ for accomplishing its Markov chain imbedding, while Corollary 5 deals with the special case $r = 0$.

Theorem 3.3 The pmf of $N^{(3)}_{n,k,r}$ ($r \geq 1$) is given by (2.1), where $f_t(x)$ are probability vectors satisfying the recursive relations of Lemma 2.1, $\ell_n = [(n+1)/(k+1)]$, $s = (k-1)r + k + 1$, $A_t$ is an $s \times s$ matrix which has all its entries 0 except for the entries:
• $(1, 1)$, which equals $q_t$,
• $(i, 1)$, $i = k + jr + 1$, $j = 0, 1, \ldots, k-1$, which are all equal to $q_t$,
• the entries corresponding to the occurrence of a success, which are all equal to $p_t$,
• $(k + i, k + i + 1)$ for $r \geq 2$, $(j-2)r + 2 \leq i \leq (j-1)r$, $j = 2, 3, \ldots, k$, which are all equal to $q_t$,
and $B_t$ is an $s \times s$ matrix with all its elements vanishing except for the $(k+1)$-th column, which is given by $\mathbf{1}' - A_t \mathbf{1}'$.

Proof We shall prove that the random variable $N^{(3)}_{n,k,r}$ is an MVB with $\ell_n = [(n+1)/(k+1)]$. Introducing the sets of states $C_x$, $x = 0, 1, \ldots, \ell_n$, each of cardinality $|C_x| = (k-1)r + k + 1$, we may establish a Markov chain $\{Y_t, t \geq 0\}$ on $\Omega = \bigcup_{x=0}^{\ell_n} C_x$ as follows: $Y_t \in c_{x,i} = \{(x, i)\}$, or equivalently $Y_t = (x, i)$, if at the first $t$ outcomes $Z_1, Z_2, \ldots, Z_t$, $x$ occurrences of patterns contained in $E_3$ have been observed, and
(a) $i = 0$, if (1) $x = 0$ and $\prod_{j=1}^{t}(1 - Z_j) = 1$ (no $S$s have appeared till the $t$-th outcome), or (2) $x \geq 1$, $Z_t = 0$ and no $S$s have appeared since the $l$-th ($l \geq k$) $S$ that completed the $x$-th simple pattern contained in $E_3$, or (3) for $t \geq r + 1$, $\prod_{i=0}^{r}(1 - Z_{t-i}) = 1$ (the length of the current failure run is greater than $r$);
(b) $i = j$, $j = 1, 2, \ldots, k-1$, if the $t$-th outcome is the last success ($Z_t = 1$) of a string of $j$ $S$s not interrupted by more than $r$ consecutive $F$s between any two successive $S$s;
(c) $i = k$, if the last outcome is the last success of a string of at least $k$ $S$s not interrupted by more than $r$ consecutive $F$s between any two of the first $k$ successive $S$s;
(d) $i = j^{(m)}$, $j = 1, 2, \ldots, k-1$, $m = 1, 2, \ldots, r$, if the $t$-th outcome is the $m$-th consecutive $F$ following a string of $j$ $S$s not interrupted by more than $r$ consecutive $F$s between any two successive $S$s.
Under this set-up, the random variable $N^{(3)}_{n,k,r}$ becomes an MVB with initial probability vector $\pi_0 = (1, 0, 0, \ldots, 0)_{1 \times ((k-1)r+k+1)}$, and matrices $A_t$ and $B_t$ having the entries described in the theorem. The result follows by Lemma 2.1 and Definition 1.

Corollary 5 The pmf of $N^{(3)}_{n,k,0}$ may be obtained through (2.1), where $\ell_n = [(n+1)/(k+1)]$, $s = k + 1$, $A_t$ is an $s \times s$ matrix that has all its entries 0 except for the entries $(i, 1)$, $i = 1, 2, \ldots, k+1$, which are all equal to $q_t$, and $(i, i+1)$, $i = 1, 2, \ldots, k-1$, together with $(k+1, k+1)$, which are all equal to $p_t$, and $B_t$ is an $s \times s$ matrix that has all its entries 0 except for the entry $(k, k+1)$, which equals $p_t$.

Proof Making use of the structure described in Theorem 3.3 we may easily observe that for $r = 0$ all the states $i^{(j)}$, $i = 1, 2, \ldots, k-1$, $j = 1, 2, \ldots, r$, of the Markov chain can be accumulated at state 0. Thus, $N^{(3)}_{n,k,0}$ becomes an MVB with $\pi_0 = (1, 0, 0, \ldots, 0)_{1 \times (k+1)}$, and the matrices $A_t$, $B_t$ reduce to $(k+1) \times (k+1)$ matrices with their non-vanishing entries being the ones provided in the statement of Corollary 5. Since we have proved that $N^{(3)}_{n,k,0}$ is an MVB, our result follows directly by applying Lemma 2.1.

For illustration purposes we consider $k = 3$, $r = 2$ and the following sequence of $n = 18$ binary outcomes: $Z_1 = 0$, $Z_2 = Z_3 = Z_4 = Z_5 = 1$, $Z_6 = 0$, $Z_7 = 1$, $Z_8 = 0$, $Z_9 = Z_{10} = Z_{11} = 1$, $Z_{12} = 0$, $Z_{13} = Z_{14} = 1$, $Z_{15} = Z_{16} = Z_{17} = 0$, $Z_{18} = 1$. Then the imbedded Markov chain $Y_t$, $t = 1, 2, \ldots, 18$, resulting from Theorem 3.3, takes on the values $Y_1 = (0, 0)$, $Y_2 = (0, 1)$, $Y_3 = (0, 2)$, $Y_4 = (1, 3)$, $Y_5 = (1, 3)$, $Y_6 = (1, 0)$, $Y_7 = (1, 1)$, $Y_8 = (1, 1^{(1)})$, $Y_9 = (1, 2)$, $Y_{10} = (2, 3)$, $Y_{11} = (2, 3)$, $Y_{12} = (2, 0)$, $Y_{13} = (2, 1)$, $Y_{14} = (2, 2)$, $Y_{15} = (2, 2^{(1)})$, $Y_{16} = (2, 2^{(2)})$, $Y_{17} = (2, 0)$, $Y_{18} = (2, 1)$.

Should one wish to calculate the pmf $P(N^{(3)}_{5,3,2} = x)$, $x = 0, 1$, the matrices $A_t$, $B_t$ take the corresponding $8 \times 8$ form ($s = (k-1)r + k + 1 = 8$), and the calculation procedure is similar to the one presented in detail in the examples of Section 3.2. It is of interest to note that for $r = 0$ the random variable $N^{(3)}_{n,k,0}$ enumerates the number of success runs of length at least $k$; this variable is traditionally denoted by $G_{n,k}$ and has been extensively studied (see Balakrishnan and Koutras (2002), Makri et al. (2007), Makri and Psillakis (2011)).
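As a quick numerical illustration of Corollary 5, the following self-contained sketch builds the $(k+1) \times (k+1)$ matrices with the non-zero entries stated there (i.i.d. trials) and runs the recursion of Lemma 2.1 to obtain the pmf of $N^{(3)}_{n,k,0} = G_{n,k}$; the function name is ours, and the code is a sketch under the stated reading of the corollary, not the authors' implementation.

    import numpy as np

    def g_nk_pmf(n, k, p):
        # pmf of N^(3)_{n,k,0} = G_{n,k} via Corollary 5, i.i.d. trials.
        # States 0, 1, ..., k; state k means a run of >= k S's is in progress.
        q, s = 1.0 - p, k + 1
        A = np.zeros((s, s)); B = np.zeros((s, s))
        A[:, 0] = q                   # any F sends the chain back to state 0
        for i in range(k - 1):
            A[i, i + 1] = p           # an S extends the current run
        A[k, k] = p                   # further S's keep the completed run alive
        B[k - 1, k] = p               # entry (k, k+1): the k-th S starts a new count
        ell = (n + 1) // (k + 1)      # ell_n = [(n+1)/(k+1)]
        f = [np.zeros(s) for _ in range(ell + 1)]; f[0][0] = 1.0
        for _ in range(n):            # Lemma 2.1 recursion
            for x in range(ell, -1, -1):
                f[x] = f[x] @ A + (f[x - 1] @ B if x > 0 else 0.0)
        return np.array([fx.sum() for fx in f])

    # For n = 5, k = 3 the upper end point is ell_5 = 1: two runs of length
    # at least 3 cannot fit in five trials, so the vector has two masses.
    print(g_nk_pmf(5, 3, 0.5))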
For the form of the matrices $A_t$ and $B_t$ in this special case, the interested reader may refer to Koutras and Alexandrou (1995).

We finally focus on the random variable $N^{(4)}_{n,k,r}$. The approach depends on essentially the same considerations as were used in obtaining the distributions of $N^{(i)}_{n,k,r}$, $i = 1, 2, 3$, with some minor modifications in the definition of the Markov chain states.

Theorem 3.4 The pmf of $N^{(4)}_{n,k,r}$ ($r \geq 1$) is given by (2.1), where $f_t(x)$ are probability vectors satisfying the recursive relations of Lemma 2.1, $\ell_n = [(n+1)/(k+r+1)]$, $s = (k-1)r + k + 1$, $A_t$ is an $s \times s$ matrix which has all its entries 0 except for the entries:
• $(1, 1)$ and $(k + 1, 1)$, which are both equal to $q_t$,
• $(2 + i, k + 2 + \max\{1, r-1\} i)$, $i = 0, 1, \ldots, k-2$, which are all equal to $q_t$,
• $(k + 1 + (r-1)j + 1 + i, k + 3 + (r-1)j + i)$, for $r \geq 3$, $i = 0, \ldots, \max\{0, r-3\}$, $j = 0, \ldots, k-2$, which are all equal to $q_t$,
• $(k + r + (r-1)i, (k-1)r + 3 + i)$, for $r \geq 2$, $i = 0, \ldots, k-2$, which are all equal to $q_t$,
• $((k-1)r + 2 + i, (k-1)r + 2 + i)$, $i = 1, \ldots, k-1$, which are all equal to $q_t$,
• $(i, i + 1)$, $i = 1, \ldots, k-1$, which are all equal to $p_t$,
• $(i, i)$, $i = k, k+1$, which are both equal to $p_t$,
• $(k + 1 + (r-1)j + i, 3 + j)$, for $r \geq 2$, $j = 0, \ldots, \max\{0, k-3\}$, $i = 1, \ldots, r-1$, which are all equal to $p_t$,
• $(k + 1 + (k-2)(r-1) + i, k)$, for $r \geq 2$ and $k \geq 3$, $i = 1, \ldots, r-1$, which are all equal to $p_t$,
• $((k-1)r + 2 + i, (k-1)r + 3 + i)$, for $k \geq 3$, $i = 1, \ldots, k-2$, which are all equal to $p_t$,
and $B_t$ is an $s \times s$ matrix with all its elements vanishing except for the $(k+1)$-th column, which is given by $\mathbf{1}' - A_t \mathbf{1}'$.

Proof We introduce a Markov chain $\{Y_t, t \geq 0\}$ on $\Omega = \bigcup_{x=0}^{\ell_n} C_x$, where $\ell_n = [(n+1)/(k+r+1)]$ and
$C_x = \{c_{x,0}, c_{x,1}, \ldots, c_{x,k}, c_{x,1^{(1)}}, c_{x,1^{(2)}}, \ldots, c_{x,1^{(r-1)}}, c_{x,2^{(1)}}, c_{x,2^{(2)}}, \ldots, c_{x,2^{(r-1)}}, \ldots, c_{x,(k-1)^{(1)}}, \ldots, c_{x,(k-1)^{(r-1)}}, c_{x,L_1}, c_{x,L_2}, \ldots, c_{x,L_{k-1}}\}$,
and set $Y_t = (x, i)$, if at the first $t$ outcomes $Z_1, Z_2, \ldots, Z_t$, $x$ occurrences of patterns contained in $E_4$ have been observed, and
(a) $i = 0$, if (1) $x = 0$ and $\prod_{j=1}^{t}(1 - Z_j) = 1$ (no $S$s have appeared till the $t$-th outcome), or (2) $x \geq 1$, $Z_t = 0$ and no $S$s have appeared since the last $S$ of the $x$-th occurrence of a simple pattern contained in $E_4$;
(b) $i = j$, $j = 1, 2, \ldots, k-1$, if the $t$-th outcome is the last $S$ ($Z_t = 1$) of a string of $j$ $S$s not interrupted by at least $r$ consecutive $F$s between any two successive $S$s;
(c) $i = k$, if the last outcome is the last $S$ of a string of at least $k$ $S$s interrupted by at least $r$ consecutive $F$s between any two of the first $k$ successive $S$s;
(d) $i = j^{(m)}$, $j = 1, 2, \ldots, k-1$, $m = 1, 2, \ldots, r-1$, if the $t$-th outcome is the $m$-th consecutive $F$ following a string of $j$ $S$s not interrupted by at least $r$ consecutive $F$s between any two successive $S$s;
(e) $i = L_j$, $j = 1, 2, \ldots, k-1$, if a new simple pattern contained in $E_4$ will be completed at the next $k - j$ $S$s (whenever they occur).
Under this set-up, the random variable $N^{(4)}_{n,k,r}$ becomes an MVB with initial probability vector $\pi_0 = (1, 0, 0, \ldots, 0)_{1 \times ((k-1)r+k+1)}$, and matrices $A_t$ and $B_t$ having the entries described in the theorem. The result follows by Lemma 2.1 and Definition 1.
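Before turning to a numerical illustration of Theorem 3.4, we note that the computations of Section 3.2 are easy to reproduce mechanically. The sketch below hard-codes $A$, $B$ for $N^{(2)}_{n,3,2}$ ($k = 3$, $r = 2$, i.i.d. trials), with the state ordering $0, 1, 2, 1^{(1)}, 2^{(1)}, L_1, L_2$ inferred by us from the probability vectors displayed in Section 3.2 (so the ordering and the function name are assumptions of ours), and reproduces $P(N^{(2)}_{5,3,2} = 1) = 2p^3q^2$.

    import numpy as np

    # A, B for N^(2)_{n,3,2} (k = 3, r = 2), states ordered as
    # [0, 1, 2, 1^(1), 2^(1), L_1, L_2]; ordering inferred from the
    # f_t(0) vectors of Section 3.2, so it is an assumption of ours.
    def n2_532_pmf(n, p):
        q = 1.0 - p
        A = np.zeros((7, 7)); B = np.zeros((7, 7))
        A[0, 0], A[0, 1] = q, p           # 0: F stays, S starts a string
        A[1, 2], A[1, 3] = p, q           # 1: S -> 2, F -> 1^(1)
        A[2, 2], A[2, 4] = p, q           # 2: S keeps last two S's, F -> 2^(1)
        A[3, 2], A[3, 5] = p, q           # 1^(1): S -> 2, second F -> L_1
        A[4, 2], A[4, 6] = p, q           # 2^(1): S -> 2, second F -> L_2
        A[5, 6], A[5, 5] = p, q           # L_1: S -> L_2, F stays
        A[6, 6] = q                       # L_2: F stays ...
        B[6, 0] = p                       # ... and S completes a pattern
        ell = n // (3 + 2)                # ell_n = [n/(k+r)]
        f = [np.zeros(7) for _ in range(ell + 1)]; f[0][0] = 1.0
        for _ in range(n):                # Lemma 2.1 recursion
            for x in range(ell, -1, -1):
                f[x] = f[x] @ A + (f[x - 1] @ B if x > 0 else 0.0)
        return np.array([fx.sum() for fx in f])

    p = 0.4; q = 1 - p
    print(n2_532_pmf(5, p))               # [P(N = 0), P(N = 1)]
    print(2 * p**3 * q**2)                # matches P(N^(2)_{5,3,2} = 1)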
Considering again, for illustration purposes, the special case $k = 3$, $r = 2$, one may easily verify that, if the following sequence of $n = 18$ binary outcomes is observed, $Z_1 = 0$, $Z_2 = Z_3 = Z_4 = 1$, $Z_5 = 0$, $Z_6 = 1$, $Z_7 = Z_8 = 0$, $Z_9 = Z_{10} = 1$, $Z_{11} = 0$, $Z_{12} = 1$, $Z_{13} = Z_{14} = Z_{15} = 0$, $Z_{16} = Z_{17} = 1$, $Z_{18} = 0$, the states attained by the imbedded Markov chain are $Y_1 = (0, 0)$, $Y_2 = (0, 1)$, $Y_3 = (0, 2)$, $Y_4 = (0, 2)$, $Y_5 = (0, 2^{(1)})$, $Y_6 = (0, 2)$, $Y_7 = (0, 2^{(1)})$, $Y_8 = (0, L_2)$, $Y_9 = (1, 3)$, $Y_{10} = (1, 3)$, $Y_{11} = (1, 0)$, $Y_{12} = (1, 1)$, $Y_{13} = (1, 1^{(1)})$, $Y_{14} = (1, L_1)$, $Y_{15} = (1, L_1)$, $Y_{16} = (1, L_2)$, $Y_{17} = (2, 3)$, $Y_{18} = (2, 0)$. The corresponding matrices take the form described in Theorem 3.4 and, deploying a calculation procedure similar to the one presented earlier in the examples of Section 3.2, we can easily deduce exact formulas for $P(N^{(4)}_{5,3,2} = x)$, $x = 0, 1$.

In this section we are going to illustrate how our theoretical results can be exploited in the enhancement of a system for patients' mechanical ventilation. Although the majority of patients receiving mechanical ventilation can be safely disconnected from the mechanical support after a successful trial of spontaneous breathing, approximately 20% of ventilated patients need a gradual reduction of mechanical support while resuming spontaneous breathing (see Esteban et al. (1994)). The process of slowly decreasing the amount of ventilator support of a patient, allowing a gradually increasing overall ventilation, is called weaning from mechanical ventilation. The term weaning is often used to describe any method of discontinuing mechanical ventilation. In any case, a very significant fraction of a patient's time in the intensive care unit (ICU) is typically taken up with weaning. For the majority of mechanically ventilated patients this process can be accomplished quickly and easily. There is, however, a significant percentage of patients in whom weaning fails. Part of the problem probably results from the fact that, even for experienced physicians, it is difficult to estimate when a patient is ready to wean; it is also true that the clinical approach to weaning, if poorly organized, adds additional time to the duration of mechanical ventilation. Currently, weaning tends to be dictated by the experience and intuition of the attending physician, who tries to maintain the patient in a state of "comfort". Nevertheless, most health experts advocate that the weaning procedure would become much more efficient if directed according to some specific protocol. Motivated by the aforementioned concerns, attempts have been made to formulate the weaning process as an algorithm which could be implemented on a computer system (see, for example, Dojat et al. (2000)). In a new system, proposed by Buliev et al. (2002) and subsequently improved by Dermitzakis et al. (2008), a large number of respiratory and cardio-circulatory parameters are taken into account and monitored during the weaning process; those include, among others, respiratory rate (RR), tidal volume (VT), the ratio RR/VT, pulse oxygen saturation (SpO2), end-tidal CO2 partial pressure (PETCO2), heart rate (HR), systolic arterial blood pressure (BP SAP), mean arterial blood pressure (BP MAP), and the end-expiratory pressure (PEEP). For each of these parameters a comfort zone (CZ) is defined by the physicians, specifying the range where the monitored parameters should lie in order for the patient to be in a comfortable state. All the data are fed into a Fuzzy Logic Controller, which decides if the patient's support level should be decreased or increased.
The decision procedure is quite involved and takes into account, among other things, whether one or more of the monitored parameters are in or out of the CZ. It is still an open issue what the decision criteria should be. In Dafnis and Philippou (2011) it was suggested that the waiting time for the occurrence of the event $E_1$, in the special case $k = 2$, is a reasonable criterion. The authors focused on one of the parameters and assumed that values of the respiratory or cardio-circulatory parameter were collected every 20 sec. They labelled by 0 and 1 the occurrence of a value in and out of the CZ, respectively, and suggested interpreting the occurrence of two consecutive 1's as a sign of a stabilized bad condition; the observation of that event could argue for an increase of the mechanical support. According to the approach taken by Dafnis and Philippou (2011), the bad condition alert should also be launched if the two consecutive 1's are separated by at most $r$ 0's (the rationale behind this being that the number of values observed inside the CZ is not enough to compensate for the out-of-range values).

The random variables $N^{(i)}_{n,k,r}$, $i = 2, 3, 4$, studied in the present paper, combined with $N^{(1)}_{n,k,r}$, offer alternative flexible alarm criteria which could be used separately or jointly for setting up an efficient monitoring scheme for the weaning process. Manifestly, the occurrence of a string of $k$ consecutive 1's provides strong evidence of a stabilized bad condition and could argue for a need for increased mechanical support. On the other hand, the occurrence of $k$ 1's interrupted by at most $r$ consecutive 0's should possibly lead to the same decision, in the sense that the number of values observed in the CZ is not enough to compensate for the ones that lie outside it. Apparently, this is not the case when the $k$ 1's are interrupted by at least $r$ consecutive 0's. Thus, a combination of the random variables $N^{(i)}_{n,k,r}$, $i = 1, 2, 3, 4$, may provide more rational decision criteria in favor of, or against, an increase in the level of the patient's mechanical support. What $k$ and $r$ should be depends on the clinical evidence and the level of false alarms one is willing to accept.

For illustration purposes, we provide some numerical results for the distributions of all four variables $N^{(i)}_{n,k,r}$, $i = 1, 2, 3, 4$. Suppose that one is collecting values of the respiratory or cardio-circulatory parameter every 20 sec for 30 minutes ($n = 90$), and consider $k = 7$ and $r = 4$. Using Theorems 3.1, 3.2, 3.3 and 3.4, we calculated the entries of Tables 1, 2, 3 and 4, respectively, which provide the whole distribution, the mean and the variance of $N^{(i)}_{90,7,4}$, $i = 1, 2, 3, 4$.

In the present paper we introduced three new discrete distributions associated with enumerating random variables which generalize the binomial distributions of runs and patterns. The new distributions can potentially be used for assessing the effectiveness of improved decision criteria incorporated in a new system, still under research, for the automated monitoring of the weaning process. The main advantage of this approach is that the system can be easily described in an algorithmic form, a fact that permits its easy implementation on a computer system. Illustrative numerical results are also presented. In our short-term research activity, the properties of the new distributions will be explored; for example, numerical results speak in favour of unimodality, so it is a nice challenge to establish it theoretically or to look for conditions under which unimodality holds true.
Finally, the new decision criteria have to be tested and compared to the previous ones, using real clinical data.

In closing, it is worth noting that an extension of the new results to the case of first order Markov dependent trials is straightforward. Let us consider a two-state time-homogeneous Markov chain $\{Z_t, t \geq 0\}$ with state space $\{0, 1\}$, transition probabilities $p_{ij} = \Pr(Z_t = j \mid Z_{t-1} = i)$, $t \geq 1$, $(i, j) \in \{0, 1\}$, and initial probabilities $p_j = P(Z_0 = j)$, $j = 0, 1$. Then, for example, the number $N^{(2)}_{n,3,2}$ of occurrences of the compound pattern $E_2$ in $Z_1, Z_2, \ldots, Z_n$ ($n \geq 1$) is an MVB with $\pi_0 = (p_0, p_1, 0, \ldots, 0)_{1 \times 9}$, a $9 \times 9$ matrix $A$ whose entries are determined by the transition probabilities $p_{ij}$, and a $9 \times 9$ matrix $B$ with all its elements vanishing, except for the element $(8, 4)$, which equals $p_{11}$, and $(9, 4)$, which equals $p_{01}$.

Acknowledgments The authors wish to thank the referees for their thorough reading and for useful comments and suggestions which helped to improve the article. Work funded by National Matching Funds 2016-2017 of the Greek Government, and more specifically by the General Secretariat for Research and Technology (GSRT), related to the EU project "ISMPH: Inference for a Semi-Markov Process" (GA No 329128).

References
On the distribution of the total number of run lengths
Balakrishnan N, Koutras MV (2002) Runs and Scans with Applications. Wiley, New York
Bersimis S, Koutras MV, Maravelakis PE (2014) A compound control chart for monitoring and controlling high quality processes
Computer-based system for automated weaning of conscious patients from assisted mechanical ventilation: preliminary tests
The reliability of a generalized consecutive system
Distributions of patterns with applications in engineering
Distributions of patterns of two successes separated by a string of k-2 failures
Run statistics in a sequence of arbitrarily dependent binary trials
Fuzzy logic controller for weaning of conscious patients from mechanical ventilation: a simulation study
Clinical evaluation of a computer controlled pressure support mode
On the distribution and expectation of success runs in nonhomogeneous Markov dependent trials
Modes of mechanical ventilation. A national survey of Spanish hospitals
Fu JC, Koutras MV (1994) Distribution theory of runs: a Markov chain approach
Poisson approximations for 2-dimensional patterns
Fu JC, Lou WYW (2003) Distribution Theory of Runs and Patterns and Its Applications: A Finite Markov Chain Imbedding Approach. World Scientific, Singapore
Hirano K (1986) Some properties of the distributions of order k. In: Fibonacci Numbers and Their Applications
Generalized binomial and negative binomial distributions of order k by the l-overlapping enumeration scheme
Koutras MV (2003) Applications of Markov chains to the distribution theory of runs and patterns. In: Handbook of Statistics
Koutras MV, Alexandrou VA (1995) Runs, scans and urn model distributions: a unified Markov chain approach. Ann Inst Stat Math
Runs on a circle
On binomial and circular binomial distributions of order k for l-overlapping success runs of length k
Success run statistics defined on an urn model
On success runs of length exceeded a threshold
Counting certain binary strings
Exact distributions of constrained (k, l) strings of failures between subsequent successes
On the expected number of limited length binary strings derived by certain urn models
On l-overlapping runs of ones of length k in sequences of independent binary random variables
On limited length binary strings with an application in statistical control
Mytalas GC, Zazanis MA (2013) Central limit theorem approximations for the number of runs in Markov-dependent binary sequences
Philippou AN, Makri FS (1986) Successes, runs and longest runs
Sen K, Goyal B (2004) Distributions of patterns of two failures separated by success runs of length k
Yalcin F, Eryilmaz S (2014) q-geometric and q-binomial distributions of order k