key: cord-0044824-mws8upoe
authors: Garmendia, Luis; Gómez, Daniel; Magdalena, Luis; Montero, Javier
title: Analyzing Non-deterministic Computable Aggregations
date: 2020-05-15
journal: Information Processing and Management of Uncertainty in Knowledge-Based Systems
DOI: 10.1007/978-3-030-50143-3_43
sha: dd2fc6b7322cbebaf7d9ec28ed7eb5cb470b2595
doc_id: 44824
cord_uid: mws8upoe

Traditionally, the term aggregation is associated with an aggregation function, implicitly assuming that any aggregation process can be represented by a function. However, the concept of computable aggregation considers that the core of the aggregation process is the program that enables it. This new concept of aggregation introduces the scenario where the aggregation can even be non-deterministic. In this work, this new class of aggregation is formally defined, and some desirable properties related to consistency, robustness and monotonicity are proposed.

Aggregation is a fundamental part of any decision, compression or summarization process of complex information [1-4]. For many years, aggregation has been one of the most relevant topics in soft computing, with multiple applications to decision making, artificial intelligence, data science and image processing, among many others. Aggregation processes have traditionally been associated in the literature with aggregation functions: an aggregation process was usually represented by means of a function, or a family of functions, so that the aggregation result associated with a vector of elements was obtained as the image of the vector under the function. However, this association between aggregation processes and functions was broken in [5] with the definition of computable aggregations. In that paper, the authors put the emphasis on the programs enabling the aggregation processes, which are not necessarily expressed in terms of functions. In that sense, the core of an aggregation process is the program allowing its computation, and not only the function that describes it. Given a function that describes an aggregation process, it can be implemented in many ways, and the way in which the process is carried out is relevant. This new idea of aggregation allows us to classify classical aggregation operators according to their algorithmic complexity [5], or to classify aggregation functions according to new ideas of recursivity [6, 7] in terms of the programs instead of the functions.

The rupture between functions and aggregation operators opens the domain of aggregation processes to a field not yet analyzed in this discipline: non-deterministic computable aggregations, i.e., aggregation processes where the same input can produce different outputs. This type of aggregation is very common in statistics, where, due to the volume of information to be processed, it is frequent to choose a representative sample on which the aggregation is performed. Obviously, replicating the process does not imply obtaining the same sample, and consequently the result can change. Clearly, these types of aggregation processes can never be modeled by functions, as a result of the very definition of a function. But leaving the well-known arena of functions, studying the desirable properties of this new concept of aggregation represents a significant challenge. The present paper focuses on the definition of properties related to robustness, stability, boundary conditions and monotonicity for non-deterministic computable aggregations.

Aggregation is a fundamental part of science.
The process of aggregating information is a key tool for most knowledge-based systems. In general, we can say that aggregation has the aim of merging different pieces of information in order to reach a conclusion or a decision. Several research communities rely on this kind of tool, such as the multi-criteria, sensor and decision fusion, decision making and data mining communities, among many others. Although this is not a necessary assumption, aggregation operators [4, 8-11] are associated with the use of membership functions, and this is the reason why they are usually defined as follows:

Definition 1 [12]. An aggregation operator is a mapping Ag : [0,1]^n → [0,1] that satisfies: (i) the boundary conditions Ag(0, ..., 0) = 0 and Ag(1, ..., 1) = 1; and (ii) monotonicity, i.e., Ag(x_1, ..., x_n) ≤ Ag(y_1, ..., y_n) whenever x_i ≤ y_i for every i.

Let us observe that this definition presents an aggregation operator as a function that is linked to the value of n: there is a different function for each n. Other authors (see [12] for example) present an aggregation operator as a function that admits any cardinality for the set of items to be aggregated, defining Ag as a function Ag : ∪_n [0,1]^n → [0,1]. It is also possible (see for example [13-15]) to define the concept of a family of aggregation functions as a set {Ag_n}, assuming some additional constraints on the relations between functions Ag_k, Ag_l of different cardinality.

The original definition of aggregation functions works on the unit interval. This classical definition has been extended to a more general class of situations by replacing the lattice [0,1] with a more general scenario T. Another extension of aggregation operators was obtained by relaxing the monotonicity or boundary conditions. One of these new classes of aggregation operators is that of pre-aggregation functions, for which the concept of directional monotonicity was introduced [16]. These pre-aggregations could also be extended by replacing the unit interval with a general T, as is done with classical aggregation functions. Some interesting studies along this line can be found in [17]. It is even possible, in some cases, to define aggregation processes that go beyond functions, by considering methods that do not match the concept of a function. To analyze this option, let us first recall the concept of computable aggregation, as well as that of a function.

In [5], the concept of computable aggregation was introduced, focusing on the idea that in aggregation processes we should pay attention to the program that makes the aggregation possible, rather than to a generic function that has not yet been implemented. Would it make sense to talk about an aggregation process whose underlying function could not be implemented? According to the current definition of aggregation operator, the answer is of course yes. Moreover, given the same aggregation function, implementing it in one way or another is not equivalent at all. Therefore, it makes sense to analyze the properties of the implementation (the program), even if from the functional point of view both implementations coincide. The main contribution in [5] was to separate the strong association that existed between "aggregation processes" and explicit functions.
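To illustrate the remark that one and the same aggregation function can be realized by different programs, the following minimal C++ sketch (an illustrative example written for this discussion, not taken from [5]) computes the arithmetic mean in two ways: by a single accumulation pass and by pairwise, divide-and-conquer summation. As functions both coincide; as programs they differ, for instance in recursion depth and numerical behavior.

    #include <iostream>
    #include <vector>

    // Program 1: arithmetic mean by a single left-to-right accumulation.
    double meanSequential(const std::vector<double>& l) {
        double sum = 0.0;
        for (double x : l) sum += x;
        return sum / l.size();
    }

    // Program 2: the same aggregation function, computed by pairwise summation
    // (assumes a non-empty list).
    double pairwiseSum(const std::vector<double>& l, std::size_t lo, std::size_t hi) {
        if (hi - lo == 1) return l[lo];
        std::size_t mid = lo + (hi - lo) / 2;
        return pairwiseSum(l, lo, mid) + pairwiseSum(l, mid, hi);
    }
    double meanPairwise(const std::vector<double>& l) {
        return pairwiseSum(l, 0, l.size()) / l.size();
    }

    int main() {
        std::vector<double> l{0.2, 0.9, 0.4, 0.7};
        // Both programs implement the same function and return the same value here.
        std::cout << meanSequential(l) << " " << meanPairwise(l) << "\n";  // 0.55 0.55
    }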
In order to formally introduce computable aggregations, it is first necessary to state what we understand by a list, an algorithm and a program. A list L is an abstract data type (ADT) that represents a sequence of values. A list can be defined by its behavior, and its implementation must provide at least the following operations: test whether a list is empty, add a value, remove a value, and compute the length of a list (its number of elements). The other definitions that we need here are those of algorithm and computer program; these concepts are necessary to formally define computable aggregations.

Definition 3 [18]. In mathematics and computer science, an algorithm is a self-contained step-by-step set of operations to be performed. Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, DRAKON charts, programming languages or control tables.

Definition 4 [19]. A computer program, or simply a program, is a sequence of instructions written to perform a specified task on a computer.

Now we are able to recall the concept of a computable aggregation as it was defined in [5]: a program that receives a list of elements to be aggregated and returns the aggregated value. It is important to emphasize that the computable aggregation paradigm broadens the set of possible aggregation methods, since it is not limited to those methods for which there is a function that explains the aggregation process. In [7], it was shown that any classical aggregation process (aggregation operators, pre-aggregations, fusion functions or extended fusion functions) can be implemented by an algorithm and by a program. However, the opposite is not always true. As was mentioned in [5], given an aggregation operator Ag in any of the previous settings, it is possible to build a program P such that, for any list L of elements of T, Ag_{|L|}(L) = P(L). But the opposite does not always hold: given a program P, there exist situations in which it is not possible to build a function associated with the program P.

Suppose that the objective of the aggregation process is to estimate the average value of a data set X = {x_1, ..., x_n}. If n is too large considering the time available for computing the aggregation, a reasonable solution is to estimate the average value through statistical sampling. In such a situation, k elements of the population are drawn at random in order to calculate the average, i.e., given the set {x_1, ..., x_n}, we choose {x_{i_1}, ..., x_{i_k}} at random and compute the arithmetic mean of these k elements. Obviously, this aggregation process cannot be defined by means of an explicit function, since the same input does not always produce the same output. This is a computable aggregation that cannot be modeled as an aggregation function, and it is important to integrate such a situation into aggregation processes.
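As a concrete illustration of this sampling-based average, the following C++ sketch (a minimal example written for exposition; the parameter k and the helper name sampledMean are illustrative, not from the paper) aggregates only a random sample of k elements, so two runs over the same list may return different values.

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <numeric>
    #include <random>
    #include <vector>

    // Estimate the average of the list by aggregating only k elements chosen
    // at random without replacement.
    double sampledMean(const std::vector<double>& list, std::size_t k, std::mt19937& gen) {
        std::vector<double> sample;
        std::sample(list.begin(), list.end(), std::back_inserter(sample),
                    std::min(k, list.size()), gen);
        return std::accumulate(sample.begin(), sample.end(), 0.0) / sample.size();
    }

    int main() {
        std::mt19937 gen(std::random_device{}());
        std::uniform_real_distribution<double> u(0.0, 1.0);
        std::vector<double> data(10000);
        for (double& x : data) x = u(gen);
        // The same input list can yield different outputs on different runs:
        std::cout << sampledMean(data, 100, gen) << " "
                  << sampledMean(data, 100, gen) << "\n";
    }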
From the previous example, we can distinguish those computable aggregations for which there exists an explicit function from those for which such an explicit function does not exist. Formally, we can partition the set of computable aggregation programs into two groups. In order to establish this partition, we will first introduce the concepts of deterministic algorithm and deterministic program. It is important to notice that, in this framework, the concept of determinism is approached in two different ways:
- A determinism that only considers input/output relations.
- A determinism that considers the whole process, including the internal states.
With the first approach, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output. The second approach additionally assumes that the underlying machine always passes through the same sequence of states.

From the point of view of computer programs as implementations of aggregation processes, our interest lies in input/output relations. Consequently, we will focus on the first conception when defining a deterministic program. It would be possible to add to this definition of deterministic program a requirement similar to the "same sequence of states" condition considered for deterministic algorithms. In fact, according to [20], in a deterministic program there is at most one instruction to be executed next, so that from a given initial state only one execution sequence is generated. This could be a good option if we were interested in the verification of a program, where the execution process is a key aspect. But it is clearly too restrictive for our purpose, since we are only interested in the result of the program. Consequently, we will adopt the input/output-based conception (Definition 6) as our notion of a deterministic program.

No matter which of the two possible definitions we consider, when applying a deterministic program P we can refer to the output produced for a particular input (a list L) as P(L): the same input L will always produce the same output P(L), generating a mapping. Obviously, a program or an algorithm is non-deterministic when it does not match the previous definitions. It is clear from the definitions that (when adopting the same approach to determinism) a non-deterministic algorithm will always produce a non-deterministic program. Consequently, when analyzing whether a computable aggregation is deterministic or non-deterministic, we should simply consider the corresponding program. From this definition, we will say that a computable aggregation P is deterministic when the program P is deterministic, which implies that the underlying algorithm is also deterministic. Let us denote by P_D the class of deterministic computable aggregations and by P_ND the class of non-deterministic computable aggregations. From now on, we will analyze the case in which T = [0,1].

It is obvious that in many cases the non-deterministic behavior of a program is due to inappropriate coding that generates an unexpected problem. This is usually the case when a deterministic algorithm, being wrongly programmed, produces a non-deterministic output. But this is not the kind of non-determinism we are interested in. Our interest lies in programs describing aggregation processes that are intrinsically non-deterministic, processes where, for example, random or probabilistic decisions are involved. In this subsection, we define some interesting cases of non-deterministic computable aggregations (NDCAs).

Note that the computable aggregation P_{Ag,p=1} coincides with the program associated with the family of aggregation operators {Ag_n, n ≥ 2} and is deterministic, since the sampled list l_p equals l when p = 1. Obviously, P_{Ag,p} as described in Definition 8 is not only a computable aggregation; it is in fact a generic approach that induces as many different computable aggregations as there are families of aggregation operators {Ag_n}. In particular, we will analyze in this paper three cases: the arithmetic mean, the maximum and the minimum aggregation operators. From now on, we will denote the corresponding computable aggregations by P_{M,p}, P_{Max,p} and P_{Min,p}, described as the sample mean (or average), the sample maximum and the sample minimum, respectively.
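The sketch below is a hedged illustration of one possible reading of P_{Ag,p}, assuming that each element of the list is retained independently with probability p before aggregating; instantiating the aggregator with the mean or the maximum gives P_{M,p} and P_{Max,p}. The fallback for an empty sub-list is an assumption of this sketch, since that corner case is not specified here.

    #include <iostream>
    #include <random>
    #include <vector>

    // P_{Ag,p}: keep each element of the list independently with probability p
    // (assumed reading) and aggregate the resulting sub-list with Ag.
    template <typename Ag>
    double aggregateWithP(const std::vector<double>& l, double p, Ag ag, std::mt19937& gen) {
        std::bernoulli_distribution keep(p);
        std::vector<double> lp;
        for (double x : l)
            if (keep(gen)) lp.push_back(x);
        if (lp.empty()) lp = l;  // assumption: fall back to l so the aggregation is always defined
        return ag(lp);
    }

    int main() {
        std::mt19937 gen(42);
        std::vector<double> l{0.2, 0.9, 0.4, 0.7, 0.1};
        auto mean = [](const std::vector<double>& v) {
            double s = 0.0;
            for (double x : v) s += x;
            return s / v.size();
        };
        auto maxAg = [](const std::vector<double>& v) {
            double m = v[0];
            for (double x : v) if (x > m) m = x;
            return m;
        };
        // P_{M,p} and P_{Max,p}: repeated calls may return different values.
        std::cout << aggregateWithP(l, 0.6, mean, gen) << " "
                  << aggregateWithP(l, 0.6, maxAg, gen) << "\n";
    }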
Another way to build a class of non-deterministic computable aggregations is to fix a value k and randomly select k elements from the list as those to be aggregated. Given a list of m elements of [0,1], l = (x_1, ..., x_m) ∈ <[0,1]>, let us denote by Sel_k a program that randomly chooses a sample (without replacement) of k elements of the list if m ≥ k, and keeps the list unchanged otherwise. Given a family of aggregation operators {Ag_n, n ≥ 2}, the computable aggregation P_{Ag,k} (with k ≤ n) is defined as the program that, for a given list l of m elements, first applies the procedure l' = Sel_k(l) and then computes the value Ag_{|l'|}(l'). It is very easy to see that the computable aggregation P_{Ag,k} is also non-deterministic (being deterministic when k ≥ m). The main difference with the previous computable aggregation is that here the dimension of the list to be aggregated is upper bounded by k; consequently, the family of aggregation operators is also bounded in dimension, no matter the dimension of the list to be aggregated.

Given a normal distribution N(μ, σ) and a family of aggregation operators {Ag_n : [0,1]^n → [0,1], n ≥ 2}, we define the noisy computable aggregation P_{N(μ,σ),Ag} as the program that, for any list l ∈ <[0,1]> with |l| ≤ n, returns Ag_{|l_N|}(l_N), where l_N is the list generated by replacing each element of l with the same element after adding noise drawn from the normal distribution (truncated to 0 or 1 whenever the resulting value falls outside the interval [0,1]).

Definition 11. The pure random computable aggregation P_{PR} is a program that, for any list l ∈ <[0,1]>, returns a random value in [0,1].

Proposition 1. The computable aggregations P_{M,p}, P_{Max,p}, P_{Min,p}, P_{Ag,k}, P_{N(μ,σ),Ag}, P_{PR} and P_{BR} (the bounded pure random computable aggregation) are non-deterministic computable aggregations whenever σ > 0 and p < 1.

An example of a C++ program P implementing P_{M,p} and P_{Max,p} is described in Fig. 1.

Given a computable aggregation P that aggregates a list l ∈ L, let us denote by P(l) the theoretical distribution over all possible realizations of the program P on the fixed list l. Obviously, if P ∈ P_D, then P(l) is a single value for any list. For non-deterministic programs, P(l) is a probability distribution for each fixed list l. In general, given a non-deterministic computable aggregation P, it is not possible to know the theoretical distribution P(l). Nevertheless, we can try to approximate it by performing many realizations of P(l). In the following definition we distinguish between the empirical and the theoretical distribution.

Definition 13. Given a computable aggregation P and a list l ∈ L, the distribution of the results obtained after n executions of the program P over the list l will be referred to as the empirical distribution of size n of the program P over the list l, denoted DP_{n,l}. For large n, DP_{n,l} is expected to approximate P(l); this idea is formalized below through the notion of robustness.
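To make Definition 13 concrete, the following C++ sketch (written for exposition; it is not the program of Fig. 2 referenced below) builds DP_{n,l} by simply executing a given, possibly non-deterministic, computable aggregation n times over the same list and collecting the results.

    #include <functional>
    #include <iostream>
    #include <random>
    #include <vector>

    // Collect the empirical distribution DP_{n,l}: run the (possibly
    // non-deterministic) computable aggregation P on the same list l, n times.
    std::vector<double> empiricalDistribution(
            const std::function<double(const std::vector<double>&)>& P,
            const std::vector<double>& l, int n) {
        std::vector<double> dp;
        dp.reserve(n);
        for (int i = 0; i < n; ++i) dp.push_back(P(l));
        return dp;
    }

    int main() {
        std::mt19937 gen(7);
        std::uniform_real_distribution<double> u(0.0, 1.0);
        std::vector<double> l(10000);
        for (double& x : l) x = u(gen);

        // A simple sampling-based mean (k = 100 elements drawn with replacement,
        // for brevity) used here as the NDCA P.
        auto P = [&gen](const std::vector<double>& list) {
            std::uniform_int_distribution<std::size_t> pick(0, list.size() - 1);
            const int k = 100;
            double s = 0.0;
            for (int i = 0; i < k; ++i) s += list[pick(gen)];
            return s / k;
        };

        std::vector<double> dp = empiricalDistribution(P, l, 1000);
        std::cout << "collected " << dp.size() << " realizations\n";
    }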
To analyze the aggregation process previously described and implemented by an NDCA, we have several components to consider. There is a list l ∈ L of elements to be aggregated. There is also a family of aggregation operators {Ag_n} underlying the non-deterministic aggregation process. Finally, two distributions describe the aggregation process: the theoretical distribution P(l) and the empirical distribution DP_{n,l}. It is obvious that the interactions and relations among these elements describe and characterize an NDCA. Some of the questions to be considered are obvious, such as the relations between the properties of both distributions (the theoretical and the empirical). But it could also be interesting to consider potential relations between some properties of the list l (analyzed as a distribution) and the corresponding properties of the theoretical/empirical distribution. Another important question is to compare the result produced by the underlying deterministic aggregation, that is Ag_{|l|}(l), with some properties (mean, median, etc.) of the distributions generated by the NDCA.

As said before, it is in general not possible to know the theoretical distribution P(l) generated by a non-deterministic computable aggregation P applied to a list l. We will replace it with an empirical distribution DP_{n,l}, and we need both to have similar properties. In that sense, we can define the concept of robustness of an NDCA as a measure of the similarity between both distributions (theoretical and empirical). A non-deterministic computable aggregation P is said to be robust if and only if, for any l ∈ <[0,1]>, any t ∈ [0,1] and any ε > 0, there exists n_0 such that the absolute difference between the empirical distribution function DP_{n,l}(t) = |{x ∈ DP_{n,l} : x ≤ t}| / n and the theoretical distribution function F_{P(l)}(t) is lower than ε for every n ≥ n_0.

Given a non-deterministic computable aggregation P (agg in the code) and a list l, Fig. 2 presents a C++ program generating the empirical distribution DP_{n,l} for a list l of length 1000. Figure 3 presents the empirical distribution DP_{n,l} for some of the previously defined NDCAs, with n = 1000, where l is a list of 10000 random values generated either from a uniform U(0,1) distribution or from a normal N(0.5, 0.1) distribution bounded in [0,1]. Note that the sample mean and the noisy mean NDCAs have a normal-shaped distribution, while the pure random and the bounded pure random NDCAs have a uniform distribution. Note also that the sample min and the sample max NDCAs show more dispersion for the list generated from the N(0.5, 0.1) population than for the list generated from the U(0,1) population.

Three properties were initially considered in aggregation operator theory: two boundary conditions plus the monotonicity property. Taking into account that for a non-deterministic computable aggregation the output for a fixed input does not necessarily have to be a single-point value, in this section we try to extend (or at least propose possible extensions of) these concepts to the context of non-deterministic computable aggregations. In the following we present the pure case, in which a computable aggregation satisfies the boundary conditions and also the idempotency property. First of all, let us denote by l_a a list with all elements equal to a, i.e., l_a = (a, ..., a); a computable aggregation is idempotent when any execution of it over l_a returns the value a. Let us note that if a computable aggregation is idempotent then it satisfies the two boundary conditions (the cases a = 0 and a = 1). In particular, when the underlying family of aggregation operators {Ag_n} is idempotent:
- The computable aggregation P_{Ag,p} is idempotent.
- The computable aggregation P_{Ag,k} is idempotent.
As a consequence of the previous proposition, these computable aggregations also satisfy both boundary conditions.
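A small empirical check of the idempotency property (a sketch written for exposition, again assuming the sampling-with-probability-p reading of P_{Ag,p}): every realization of a sampling-based mean over a constant list l_a returns a, which also covers the boundary cases a = 0 and a = 1.

    #include <cassert>
    #include <cmath>
    #include <iostream>
    #include <random>
    #include <vector>

    // Sampling-based mean P_{M,p}: keep each element with probability p and
    // average the kept ones (assumed reading of the definition).
    double samplingMean(const std::vector<double>& l, double p, std::mt19937& gen) {
        std::bernoulli_distribution keep(p);
        double sum = 0.0;
        int count = 0;
        for (double x : l)
            if (keep(gen)) { sum += x; ++count; }
        if (count == 0) return l.empty() ? 0.0 : l[0];  // corner case not specified in the paper
        return sum / count;
    }

    int main() {
        std::mt19937 gen(123);
        const double a = 0.37;
        std::vector<double> la(1000, a);  // the constant list l_a
        // Every realization over l_a equals a: idempotency, hence the boundary conditions.
        for (int i = 0; i < 100; ++i)
            assert(std::abs(samplingMean(la, 0.5, gen) - a) < 1e-12);
        std::cout << "all realizations over l_a returned " << a << "\n";
    }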
In the previous section we introduced robustness as a way to establish that the theoretical and the empirical distributions are similar. This idea allows us to work on the basis of DP_{n,l}, considering that in most cases the theoretical distribution is unknown. We now introduce the idea of consistency of an NDCA, as the property that considers how close the behavior of P is to that of Ag, the underlying aggregation process considered by the NDCA; this is formalized as robustness in a functional φ. A non-deterministic computable aggregation P is said to be robust in φ if and only if, for any l ∈ <[0,1]> and any ε > 0, there exists n_0 such that the absolute difference between φ(π(DP_{n,l})) and φ(l) is lower than ε for any n ≥ n_0, where π(DP_{n,l}) is the vector representation of the set DP_{n,l}.

If p ∈ (0,1], the following holds: P_{M,p} is robust in the arithmetic mean M, P_{Max,p} is robust in Max, and P_{Min,p} is robust in Min. Proof. From random sampling theory, it can be seen that M(π(DP_{n,l})) converges to M(l), Max(π(DP_{n,l})) converges to Max(l), and Min(π(DP_{n,l})) converges to Min(l).

It would also be important to consider how the dispersion of DP_{n,l} evolves with n. Ideally, we would like the dispersion to decrease as n increases. A non-deterministic computable aggregation P is said to concentrate when the dispersion of DP_{n,l} is non-increasing in n.

Let us now define what we understand by monotonicity for non-deterministic computable aggregations, since for deterministic ones the definition can be reproduced in the usual way. A definition of monotonicity for a function f : X → Y first requires the definition of an order in the spaces X and Y, in such a way that if we have two inputs l, l' ∈ X with l ≤_X l', then monotonicity implies that f(l) ≤_Y f(l'). Taking this into account, if we want to define some class of monotonicity for non-deterministic computable aggregations, we first have to define an order in the input space (the set of lists) and in the output space (the set of possible result distributions). Although this scheme is perfectly valid, it would be interesting to study in depth the different possibilities that produce different ideas of monotonicity; in particular, in this article we provide one possible order, but others could be defined. The problem of establishing an order on the set of lists does not seem very complex if we focus on the case in which the lists have equal size, since the order relation coincides with the natural componentwise order in [0,1]^k; so we will consider that natural order. The case of an order over finite sets (even of the same size) is much more complex. In this paper, we propose a possible notion of order that is related to a majority rule: given two finite sets S and S', we say that S ≤_T S' if and only if |{(x, y) ∈ S × S' / x ≤ y}| ≥ |{(x, y) ∈ S × S' / y ≤ x}|. As an example, the set S = {0.1, 0.3, 0.5} is lower than S' = {0.2, 0.6, 0.7}, since in 7 of the 9 pairwise comparisons the element of S' is the larger one. Taking this into account, the following monotonicity definition is given.

Definition 20 (Tournament monotonicity). A non-deterministic computable aggregation P over the domain T is tournament monotonic if and only if, for any pair of lists l, l' of T with the same cardinality such that l ≤ l', there exists n_0 such that DP_{n,l} ≤_T DP_{n,l'} for every n ≥ n_0.
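The tournament comparison is easy to compute; the following C++ sketch (an illustration written for exposition) counts, over all pairs in S × S', how many comparisons favor each set, reproducing the 7-out-of-9 example above.

    #include <iostream>
    #include <vector>

    // Tournament order: S <=_T S' iff the number of pairs (x, y) in S x S' with
    // x <= y is at least the number of pairs with y <= x.
    bool tournamentLeq(const std::vector<double>& S, const std::vector<double>& Sp) {
        int sPrimeWins = 0, sWins = 0;
        for (double x : S)
            for (double y : Sp) {
                if (x <= y) ++sPrimeWins;  // the element of S' is at least as large
                if (y <= x) ++sWins;       // the element of S is at least as large
            }
        return sPrimeWins >= sWins;
    }

    int main() {
        std::vector<double> S{0.1, 0.3, 0.5};
        std::vector<double> Sp{0.2, 0.6, 0.7};
        // The example from the text: S' wins 7 of the 9 pairwise comparisons, so S <=_T S'.
        std::cout << std::boolalpha << tournamentLeq(S, Sp) << "\n";  // prints true
    }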
The definition of computable aggregation implies a rupture between functions and aggregation processes, allowing the incorporation of a new class of aggregations: the non-deterministic computable aggregations. This new class of computable aggregations is characterized by aggregation processes whose result may change even if we fix the information to be aggregated. Obviously, no function can model this class of aggregation process, since by definition the output of a function is always the same for a fixed input value. It is important to emphasize that there are many real situations in which information is aggregated in a non-deterministic way. Any inference based on random sampling is a well-known example of this: if the population is huge, it is very frequent to obtain a sample and operate on it. But this is not the only case; any aggregation process in which randomness appears can be understood as a non-deterministic process. Many artificial intelligence or machine learning aggregation processes could also be classified as non-deterministic. In this paper, we have tried to define some desirable properties for this new class of aggregation processes, such as robustness, stability, boundary conditions and monotonicity. These questions open the door to further research.

References
1. Aggregation and Fusion of Imperfect Information
2. Fuzzy Sets and Their Extensions: Representation, Aggregation and Models. Studies in Fuzziness and Soft Computing
3. A discussion on aggregation operators
4. Consistency and stability in aggregation operators, an application to missing data problems
5. Computable aggregations
6. Complexity of increasing phi-recursive computable aggregations
7. 2019 IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2019
8. Aggregation operators: properties, classes and construction methods
9. Strictly stable families of aggregation operators
10. Approaches to learning strictly-stable weights for data missing values
11. A generalization of stability for families of aggregation operators
12. Aggregation Functions: A Guide for Practitioners
13. Recursive families of OWA operators
14. Hierarchical aggregation of OWA operators: basic measures and related computational problems
15. Recursive connective rules
16. Pre-aggregation functions: construction and an application
17. Generalized pre-aggregations
18. Computation: Finite and Infinite Machines
19. Algorithms + Data Structures = Programs
20. Verification of Sequential and Concurrent Programs. Graduate Texts in Computer Science