key: cord-0044833-9k0myxr1
authors: Shinde, Krushna; Feissel, Pierre; Destercke, Sébastien
title: Dealing with Inconsistent Measurements in Inverse Problems: An Approach Based on Sets and Intervals
date: 2020-05-16
journal: Information Processing and Management of Uncertainty in Knowledge-Based Systems
DOI: 10.1007/978-3-030-50153-2_34
sha: 900020f91fbf839ab7ffd1fc31cde45ef816de6c
doc_id: 44833
cord_uid: 9k0myxr1

We consider the (inverse) problem of recovering the parameter values of a physical model from a set of measurements. As the deterministic solution to this problem is sensitive to measurement error in the data, one way to address this issue is to take the uncertainties in the data into account. In this paper, we explore how interval-based approaches can be used to obtain a solution to the inverse problem, in particular when measurements are inconsistent with one another. We show on a set of experiments, in which we compare the set-based approach with the Bayesian one, that this is particularly interesting when some measurements can be suspected of being outliers.

Identifying the parameters of a physical model from a set of measurements is a common task in many fields such as image processing (tomographic reconstruction [1]), acoustics (source identification [2]), or mechanics (material properties identification [3]). Such a problem is known as an inverse problem and is the converse of the so-called forward problem. While the forward problem is usually well-posed, this is not the case for the inverse problem. Indeed, whenever there is noise in the measurements or error in the model, such a problem may well end up having no solution [4]. Common remedies proposed in the literature are least-squares minimization techniques [5] and Bayesian approaches [6] that model the noise in the measurements. Both approaches, however, can be quite sensitive to outliers, i.e., aberrant measurements [7, 8].
In addition, many researchers have argued that probabilistic methods such as Bayesian inverse methods are not well suited for representing and propagating uncertainty when information is missing or in case of partial ignorance [9, 10, 11]. In contrast, interval-valued methods [12, 13, 14, 15] make minimal assumptions about the nature of the associated uncertainties, as they only require defining the region in which the measurement should lie. In this paper, we propose an inverse strategy relying on interval analysis to deal with uncertain measurements and to detect inconsistent measurements (outliers). We apply the proposed strategy to the identification of material elastic parameters in the presence of possibly inconsistent measurements (here, full-field displacements [16]).

This paper is organized as follows. Section 2 describes the identification strategy based on a set-valued approach, the outlier detection method used to select a set of consistent measurements, and the numerical implementation of the identification strategy, including the discrete description of sets [11]. In Sect. 3, we present an application to static tensile tests of homogeneous plates, where material parameters are identified using the proposed outlier detection method, and we compare it to the Bayesian inference method on data containing outliers.

This section is composed of four parts. In Sect. 2.1, we introduce the inverse problem. The identification strategy with a set-valued approach based on intervals is described in Sect. 2.2. Section 2.3 introduces the outlier detection method used to select a subset of consistent measurements. Section 2.4 describes the numerical implementation of the identification strategy with the discrete description of sets.

We consider an inverse problem where we want to identify some parameters of a model y = f(θ) from N measurements made on the quantity y.
The model $f$ yields the relationship between the $M$ model parameters $\theta \in \mathbb{R}^M$ and the measured quantity. We denote by $\tilde{y} \in \mathbb{R}^N$ the measurements made on $y$. A typical example, introduced in Sect. 3, is where $\theta$ corresponds to the elastic Lamé parameters ($\lambda$ and $\mu$) and $y$ is full-field displacement data obtained after applying a given load to the material specimen. In this paper, we consider the case where the discrepancy between $f(\theta)$ and $\tilde{y}$ is mainly due to measurement errors, i.e., we leave the issue of model error to future investigations.

In this section, we propose a set-valued inverse problem strategy based on interval-valued measurements. An interval $[x] = [\underline{x}, \overline{x}]$ is the set of real numbers lying between $\underline{x}$ and $\overline{x}$, where $\underline{x}$ is the lower bound and $\overline{x}$ is the upper bound [17]. In our work, we choose to describe the uncertainty on the measurements in interval form, as such a description requires almost no assumption regarding the nature and source of the uncertainty [14]. To describe prior information about the parameters, we use a multidimensional extension of intervals, i.e., a hypercube or box of $\mathbb{R}^n$ defined as the Cartesian product of $n$ intervals. For example, in the case of two parameters $x_1$ and $x_2$, information on them is described by the set $X$ such that

$$X = [\underline{x}_1, \overline{x}_1] \times [\underline{x}_2, \overline{x}_2]$$

Boxes are the easiest way to describe multidimensional sets.

Identification Strategy. In the proposed approach, intervals describe the uncertainty on the measurements and a hypercube describes the prior information about the parameters. Hence, the solution of the inverse problem can be obtained through a set inversion process [17]. The uncertainty in the measurements is described through the set $S_y$. Each measurement is described by its lower bound $\underline{\tilde{y}}_k$ and its upper bound $\overline{\tilde{y}}_k$. Prior information about the parameters is described through $S_{0\theta} \subset \mathbb{R}^M$, i.e., with a box.
Given a set $S_y \subset \mathbb{R}^N$ describing the uncertainty on $\tilde{y}$ and a prior parameter set $S_{0\theta} \subset \mathbb{R}^M$, the set $S_\theta \subset \mathbb{R}^M$ describing the solution of the inverse problem is defined as follows:

$$S_\theta = \{\theta \in S_{0\theta} \mid f(\theta) \in S_y\}$$

In the current work, it is possible to obtain a solution set for each measurement as follows:

$$S_\theta^k = \{\theta \in S_{0\theta} \mid \underline{\tilde{y}}_k \le y_k(\theta) \le \overline{\tilde{y}}_k\}$$

where $y_k(\theta)$ represents the $k$-th response of the model $y = f(\theta)$. $S_\theta$ can then be obtained as the intersection of the $S_\theta^k$:

$$S_\theta = \bigcap_{k=1}^{N} S_\theta^k$$

In case of inconsistent measurements, the set-valued inverse method gives an empty solution set $S_\theta = \emptyset$, corresponding to $\bigcap_{k=1}^{N} S_\theta^k = \emptyset$. There may be several reasons for the inconsistency of the measurements with respect to the model, such as the presence of measurement outliers or model error.

We illustrate the set-valued inverse problem on a toy example. We consider the spring-mass system shown in Fig. 1, which can be described mathematically as

$$y = \frac{F}{p}$$

where $F$ represents the force applied on the spring in newtons (N), $p$ is the spring stiffness constant (N/m) and the parameter to estimate, and $y$ is the measured displacement of the spring in meters (m). We consider a case where a force $F = 100$ N is applied on the spring and a displacement $\tilde{y} = 0.01$ m is measured. Here, the inverse problem consists of determining the parameter $p$ from the measurement $\tilde{y}$. To do this, we describe the uncertainty on the prior knowledge about the parameter and on the measurement in interval form, through the sets $S_{0\theta}$ and $S_y$ respectively.

In case of inconsistency, a way to restore consistency is to remove incompatible measurements, i.e., possible outliers. To do this, our method relies on measures of consistency that we now introduce. For any two solution sets $S_\theta^k$ and $S_\theta^{k'}$ corresponding to the measurements $\tilde{y}_k$ and $\tilde{y}_{k'}$ respectively, with $(k, k') \in \{1, ..., N\}^2$, we define the degree of inclusion (DOI) of one solution set $S_\theta^k$ with respect to another $S_\theta^{k'}$ as

$$DOI_{kk'} = \frac{A(S_\theta^k \cap S_\theta^{k'})}{A(S_\theta^k)}$$

where $A(S_\theta^k)$ corresponds to the area of the set $S_\theta^k$. The DOI between two solution sets is non-symmetric, i.e., in general $DOI_{kk'} \ne DOI_{k'k}$.
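The toy example and the degree of inclusion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prior interval on $p$ and the three measurement intervals are hypothetical values chosen for the sketch (only $F = 100$ N and the nominal displacement of 0.01 m come from the text), and in one dimension the "area" of a set is taken to be the length of an interval.

```python
def invert_spring(force, y_lo, y_hi, p_lo, p_hi):
    """Stiffness interval consistent with y = force / p, or None if empty.

    For positive force, p and y, the condition y_lo <= force / p <= y_hi
    is equivalent to force / y_hi <= p <= force / y_lo.
    """
    lo = max(p_lo, force / y_hi)
    hi = min(p_hi, force / y_lo)
    return (lo, hi) if lo <= hi else None


def overlap(a, b):
    """Length of the intersection of two closed intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def doi(a, b):
    """Degree of inclusion of interval a in interval b (a has nonzero length)."""
    return overlap(a, b) / (a[1] - a[0])


F = 100.0                   # applied force in N (value from the text)
prior = (5000.0, 20000.0)   # hypothetical prior interval on p, in N/m

# Hypothetical interval measurements of the displacement, in m.
s1 = invert_spring(F, 0.008, 0.0125, *prior)  # wide interval around 0.01
s2 = invert_spring(F, 0.009, 0.011, *prior)   # narrower interval around 0.01
s3 = invert_spring(F, 0.05, 0.1, *prior)      # incompatible with the prior

print("S1 =", s1, "S2 =", s2, "S3 =", s3)
print("DOI(S1 in S2) =", doi(s1, s2))
print("DOI(S2 in S1) =", doi(s2, s1))
```

Because the second solution interval is nested inside the first, its degree of inclusion in the first is 1 while the converse is strictly smaller, which illustrates the non-symmetry of the DOI. The third measurement yields an empty solution set, the situation in which the consistency measures become useful.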
The DOI reaches its boundary values in the following situations, as illustrated in Fig. 2:

$$DOI_{kk'} = \begin{cases} 1 & \text{if } S_\theta^k \subseteq S_\theta^{k'} \\ 0 & \text{if } S_\theta^k \cap S_\theta^{k'} = \emptyset \end{cases}$$

Furthermore, the value of $DOI_{kk'}$ always lies between 0 and 1 when $A(S_\theta^k)$ is non-zero. The larger the value of the DOI between one solution set and another, the closer $S_\theta^k$ is to being included in $S_\theta^{k'}$.

We now introduce a measurement-wise consistency degree for a set of measurements. Using the pairwise degrees of inclusion (DOI) of the solution sets corresponding to the measurements, we define the global degree of consistency (GDOC) of the $k$-th measurement with respect to all other measurements. The value of $GDOC(k)$ always lies between 0 and 1. Note that the condition for $GDOC(k) = 1$ is very strong, as it requires all sets to be identical. If $GDOC(k) = 0$, then the $k$-th measurement is fully inconsistent with all other measurements. A high value of GDOC for the $k$-th measurement thus indicates a high consistency with most of the other measurements.

Finally, we define a global consistency measure, $GCONS(E)$, for a group $E$ of measurements. It has the following properties:

1. It is insensitive to permutation of the sets of measurements (commutativity).
2. The value of GCONS is monotonically decreasing with the size of the set $E$, in the sense that for any subsets of measurements $E$, $F$ with $E \subseteq F$, we have $GCONS(F) \le GCONS(E)$. It also means that the more measurements we have, the less consistent they are with one another.

A good principle for choosing a subset of consistent measurements would be to search for the biggest subset $E$ (the maximal number of measurements) that has a reasonable consistency, that is, for which $GCONS(E)$ is above some threshold. Yet the cost of such a search could be exponential in $N$, which can be quite large, and the search is therefore intractable. This is why we propose a greedy algorithm (Algorithm 1) that makes use of the GDOC measures to find a suitable subset $E$.
The idea is quite simple: we order the measurements according to their individual consistency (GDOC) and start from the most consistent one; we then iteratively add new measurements to $E$ unless they bring the global consistency GCONS under a pre-defined threshold, that is, unless they introduce too much inconsistency.

To solve the set-valued inverse problem, we need a discrete description of the sets. There are multiple ways to represent sets in a discrete way, such as using boxes (SIVIA algorithm [15]) or a grid of points. Here, we use the same description as in [11], that is, a grid of points $\theta_i$, $i \in \{1, ..., N_g\}$, as shown in Fig. 3a, where $N_g$ is the number of grid points. Such a description is convenient when comparing or intersecting sets, since the grid of points is the same for every set. Any set $S_\theta \subset S_{0\theta}$ is then characterized through its discrete characteristic function, defined at any point $\theta_i \in S_{0\theta}$ of the grid as shown in Eq. (11) and Fig. 3b:

$$\chi_{S_\theta}(\theta_i) = \begin{cases} 1 & \text{if } \theta_i \in S_\theta \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

In the current application, a uniform grid is chosen to describe the prior parameter set $S_{0\theta}$, but this is not mandatory. In our method, each $S_\theta^k$ is therefore described by its discrete characteristic function, defined at any point of the grid as

$$\chi_{S_\theta^k}(\theta_i) = \begin{cases} 1 & \text{if } \underline{\tilde{y}}_k \le y_k(\theta_i) \le \overline{\tilde{y}}_k \\ 0 & \text{otherwise} \end{cases}$$

These discrete characteristic functions can be collected as columns of boolean values in an $N_g \times N$ matrix $X$, as shown in Eq. (13):

$$X = \begin{bmatrix} \chi_{S_\theta^1}(\theta_1) & \cdots & \chi_{S_\theta^N}(\theta_1) \\ \vdots & \ddots & \vdots \\ \chi_{S_\theta^1}(\theta_{N_g}) & \cdots & \chi_{S_\theta^N}(\theta_{N_g}) \end{bmatrix} \tag{13}$$

where $\chi_{S_\theta^k}(\theta_i)$ is the element of column $k$ and row $i$. Using the matrix $X$, an $N \times N$ symmetric matrix $T = X^{\mathsf{T}} X$ can be obtained, whose components are directly proportional to the areas of the inverse sets and can therefore be used as estimates of these areas:

$$T_{kk} \propto A(S_\theta^k), \qquad T_{kk'} \propto A(S_\theta^k \cap S_\theta^{k'})$$

Indeed, the diagonal element $T_{kk}$ of $T$ represents the number of grid points with which the $k$-th measurement is consistent, and it is proportional to $A(S_\theta^k)$. The off-diagonal element $T_{kk'}$ of $T$ represents the number of grid points with which both the $k$-th and the $k'$-th measurements are consistent, and it is proportional to $A(S_\theta^k \cap S_\theta^{k'})$.
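The grid-based description and the greedy selection can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation. The construction of $X$ and $T = X^{\mathsf{T}} X$ follows the text, and the DOI is taken as the ratio of overlap counts to the count of the first set. The formulas used for GDOC (mean of the off-diagonal DOI values in both directions) and for GCONS (number of grid points common to all sets divided by the number of points in their union) are assumptions made for this sketch, chosen only because they satisfy the properties stated in the text. The one-parameter linear model, the grid and the data are hypothetical.

```python
import numpy as np


def characteristic_matrix(y_model, y_lo, y_hi):
    """Boolean (Ng x N) matrix: X[i, k] is True when the prediction at grid
    point i lies inside the interval of measurement k."""
    return (y_model >= y_lo) & (y_model <= y_hi)


def overlap_counts(X):
    """T = X^T X: T[k, k'] counts grid points consistent with both k and k'."""
    Xi = X.astype(int)
    return Xi.T @ Xi


def doi_matrix(T):
    """DOI[k, k'] = T[k, k'] / T[k, k]; rows of empty sets are left at 0."""
    diag = np.diag(T).astype(float)[:, None]
    out = np.zeros(T.shape, dtype=float)
    np.divide(T, diag, out=out, where=diag > 0)
    return out


def gdoc(T):
    """ASSUMED form: mean off-diagonal DOI, averaged over both directions."""
    D = doi_matrix(T)
    n = T.shape[0]
    off_diagonal = D.sum(axis=1) + D.sum(axis=0) - 2.0 * np.diag(D)
    return off_diagonal / (2.0 * (n - 1))


def gcons(X, subset):
    """ASSUMED form: points common to all sets / points in their union."""
    common = X[:, subset].all(axis=1).sum()
    union = X[:, subset].any(axis=1).sum()
    return common / union if union else 0.0


def greedy_select(X, threshold):
    """Add measurements by decreasing GDOC while GCONS stays >= threshold."""
    order = np.argsort(-gdoc(overlap_counts(X)), kind="stable")
    selected = [int(order[0])]
    for k in order[1:]:
        candidate = selected + [int(k)]
        if gcons(X, candidate) >= threshold:
            selected = candidate
    return selected


# Hypothetical 1D example: y_k = a_k * theta, true theta = 5, last value aberrant.
theta = np.linspace(0.0, 10.0, 101)
a = np.array([1.0, 2.0, 3.0, 1.0])
y_meas = np.array([5.0, 10.0, 15.0, 9.0])
X = characteristic_matrix(theta[:, None] * a, y_meas - 1.0, y_meas + 1.0)

selected = greedy_select(X, threshold=0.1)
solution = theta[X[:, selected].all(axis=1)]
print("kept measurements:", sorted(selected))
print("solution set spans:", solution.min(), "to", solution.max())
```

Because every set is described on the same grid, intersections reduce to boolean operations on the columns of $X$, and all pairwise overlaps are obtained with a single matrix product. In this toy case the intersection over all four measurements is empty, while the greedy pass keeps the three mutually consistent ones.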
Hence, GDOC can be computed from the matrix $T$ for any $k$-th measurement.

We have presented an identification strategy and an outlier detection method that make use of intervals to represent information about parameters and measurements. The next section is devoted to an application of this strategy to a mechanical problem, as well as to a comparison with the Bayesian inference method, exploring in particular their behaviour in the presence of outliers.

In this section, we apply the set-valued inverse method to identify the elastic properties (Lamé parameters $\lambda$ and $\mu$) of a homogeneous 2D plate under plane strain, as shown in Fig. 4a. The plate is clamped on the left side and loaded on the right side by a uniform traction $f = 1000$ N/m. To generate the displacement measurement data $\tilde{y}$ (386 measurements), exact displacement data $y_{Ref}$ is simulated with a Finite Element (FE) model (193 nodes, 336 elements), as shown in Fig. 4b, considering the reference values $\lambda_0 = 1.15 \cdot 10^5$ MPa and $\mu_0 = 7.69 \cdot 10^4$ MPa. We also consider a possible Gaussian noise with zero mean (no systematic bias) and standard deviation $\sigma$. In the current work, $\sigma$ was taken as 5% of the average of all the exact displacement values; in practical cases, it can be assumed that $\sigma$ can be deduced from the measurement technique. For the set-valued inverse method, the uncertainty on the measurements has to be given in interval form. Therefore, each measurement is modelled as $[\tilde{y}_k - 2\sigma, \tilde{y}_k + 2\sigma]$. The half-width of $2\sigma$ ensures that sufficiently many measurements will be consistent with one another. Prior information about the parameters ($S_{0\theta}$) is considered as a uniform 2D box $\lambda_p \times \mu_p$ with $\lambda_p$ = [0.72 · 10^5, 1.

We first apply the set-valued inverse method to identify the set of elastic parameters when there is no noise in the data. The measurement data was chosen such that $\tilde{y} = y_{Ref}$, and the information on the measurement $\tilde{y}$ was described in interval form: $[\tilde{y} - 2\sigma, \tilde{y} + 2\sigma]$.
Figure 5 shows the feasible set (yellow color) of the identified parameters that is consistent with all 386 measurements using the set-valued inverse method. We can note that the exact value of the parameters is included in the solution set.

We then apply the set-valued inverse method along with the GCONS outlier detection method (Algorithm 1) to identify the set of elastic parameters when there is random noise in the data. The measurement $\tilde{y}$ is created from $y_{Ref}$ by adding Gaussian white noise with standard deviation $\sigma$, and the information on the measurement $\tilde{y}$ is described in interval form: $[\tilde{y} - 2\sigma, \tilde{y} + 2\sigma]$. Figure 6a shows that the identified set (green color) obtained when taking all the measurements is empty, due to inconsistency within the measurements. To obtain a non-empty solution set, we use our proposed solution and Algorithm 1 with the GCONS threshold set to 0.1. We use a low GCONS threshold to ensure that a sufficiently high number of measurements will be included. Figure 6b shows the feasible set (yellow color) of the identified parameters using the GCONS method, with 55 measurements removed. We can note here again that the exact value of the parameters is included in the solution set.

We now compare the set-valued inverse method with the standard Bayesian inference method. We apply both methods to identify the elastic properties (Lamé parameters $\lambda$ and $\mu$) of a homogeneous 2D plate for the same 386 measurements. For the set-valued inverse method, information on the measurement $\tilde{y}$ is described in interval form, $[\tilde{y} - 2\sigma, \tilde{y} + 2\sigma]$ with $\sigma = 0.0020$, and prior information about the parameters is described with a discretization of the set $\lambda_p \times \mu_p$ with $\lambda_p$ = [0.72 · 10^5, 1. Roughly speaking, this means that the Bayesian model is not misspecified.
Figure 7 shows the feasible set (yellow color) of the identified parameters using the set-valued inverse method and the feasible set (red color) of the identified parameters using the Bayesian inference method. In the case of the Bayesian inference method, the feasible set (red color) corresponds to a credibility set having a probability of 90%. The results on this specific example indicate that both methods are consistent with each other, with the Bayesian approach delivering more precise inferences. This observation has also been made on other simulations using a well-specified Bayesian model.

Now, we compare the set-valued inverse method and the Bayesian inference method in terms of their sensitivity to outliers, i.e., how they perform when some data become aberrant and hence depart from the Bayesian assumptions. To do this, we use 8 sets of 100 experiments (each experiment with 386 measurements), such that the percentage of outlier measurements increases from one set to the next. In practice, we use the following scheme:

$$\tilde{y} = \tilde{y}_0 + \alpha \, \epsilon \, I \tag{17}$$

where $\tilde{y}_0$ are the noisy measurements, $\epsilon \sim N(0, \sigma^2)$ is the initial noise, $\alpha = 5$ is a multiplicative constant applied to $\epsilon$ when a measurement is an outlier, and $I \sim B(p_i)$ is a Bernoulli variable whose parameter $p_i$ depends on the experiment set and indicates the average percentage of outlier measurements. In particular, we used the values 0%, 3%, 5%, 7%, 9%, 11%, 13%, 15% for $p_i$ in our sets of experiments, going from no outliers to an average of 15%.

For each experiment from the 8 sets (thus for 800 experiments), we have performed the identification using our set-valued inverse method and the Bayesian inference method to check their sensitivity to outliers. For all experiments, we have chosen a GCONS threshold of 0.1. For each set of experiments, we have computed the average number of times that each method includes the true parameter values, denoted $A_c$ in Fig. 8.
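The contamination scheme of Eq. (17) can be sketched as below. This is one possible reading of the scheme, stated as an assumption: the same noise realization $\epsilon$ that produces $\tilde{y}_0$ is amplified by $\alpha$ for the measurements flagged by the Bernoulli variable. The function name, the array `y_ref` and the numerical values other than $\alpha = 5$ and the outlier rates are hypothetical.

```python
import numpy as np


def contaminate(y_ref, sigma, p_outlier, alpha=5.0, rng=None):
    """Return (y, is_outlier) following y = y0 + alpha * eps * I.

    y0 = y_ref + eps with eps ~ N(0, sigma^2), and I ~ Bernoulli(p_outlier).
    ASSUMPTION: the same eps is used in y0 and in the outlier term.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.normal(0.0, sigma, size=y_ref.shape)
    is_outlier = rng.random(y_ref.shape) < p_outlier  # Bernoulli draws
    y0 = y_ref + eps
    return y0 + alpha * eps * is_outlier, is_outlier


# Hypothetical reference data standing in for the 386 simulated displacements.
y_ref = np.linspace(0.01, 0.08, 386)
sigma = 0.05 * y_ref.mean()  # 5% of the average exact value, as in the text

for p in (0.0, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.15):
    y, flags = contaminate(y_ref, sigma, p, rng=np.random.default_rng(0))
    lower, upper = y - 2.0 * sigma, y + 2.0 * sigma  # interval measurements
    print(f"p = {p:.2f}: {int(flags.sum())} flagged measurements")
```

Under this reading, a flagged measurement deviates from the reference value by $(1 + \alpha)\epsilon$, that is, six times the initial noise, so it will often fall outside the $\pm 2\sigma$ interval built around it.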
From this figure, it can be observed that when the percentage of overly noisy data points per set increases, the $A_c$ value starts to decrease for the Bayesian inference method but not for the GCONS method. So, while the Bayesian approach suffers strongly from model misspecification, our method is robust to the presence of outliers, even in significant proportion. Hence, we can conclude that the two methods clearly follow different strategies and provide results that are qualitatively different in the presence of outliers.

In this paper, we have presented a new parameter identification strategy relying on set theory and on interval measurements. In this approach, we have used intervals to describe the uncertainty on measurements and parameters. In order to solve the inverse problem, we have proposed a discrete description of the sets related to the information about the parameters. We have introduced indicators of consistency of measurements and used them to propose an outlier detection method, i.e., the GCONS method. We applied this strategy to identify the elastic properties of a homogeneous isotropic material. The results showed that the identification strategy is not only helpful for obtaining a feasible set of the parameters but is also able to detect the outliers in the noisy measurements. We also compared our identification strategy with the Bayesian inference method in terms of sensitivity to outliers, and the results showed that the Bayesian inference method can give a false prediction of the parameters when the data are too noisy.

The application of the identification strategy considered in the current work concerns a relatively small number of measurements (at least for mechanical applications) and a 2D parameter identification. However, computational complexity in the case of very high dimensions is an important issue that remains to be investigated. The next step in this work is to apply the strategy to high-dimensional data as well as to high-dimensional parameter identification.
[1] Tomographic image reconstruction using filtered back projection (FBP) and algebraic reconstruction technique (ART)
[2] Acoustic source identification using multiple frequency information
[3] Identification of material properties of composite materials using nondestructive vibrational evaluation approaches: a review
[4] Numerical Methods for the Solution of Ill-Posed Problems
[5] Damage detection and parameter identification by finite element model updating. Revue Européenne de
[6] Introduction to the Bayesian approach applied to elastic constants identification
[7] Least Squares for Practitioners
[8] Impacts of outliers and mis-specification of priors on Bayesian fisheries-stock assessment
[9] Different methods are needed to propagate ignorance and variability
[10] Essay on uncertainties in elastic and viscoelastic structures: from AM Freudenthal's criticisms to modern convex modeling
[11] Identification of elastic properties in the belief function framework
[12] Quantification of margins and uncertainties: alternative representations of epistemic uncertainty
[13] Non-probabilistic finite element analysis for parametric uncertainty treatment in applied mechanics: recent advances
[14] Literature review of methods for representing uncertainty. 2013-03 of the Cahiers de la Sécurité Industrielle. Foundation for an Industrial Safety Culture
[15] Set inversion via interval analysis for nonlinear bounded-error estimation
[16] Digital imaging techniques in experimental stress analysis
[17] Applied Interval Analysis. Software Engineering/Programming and Operating Systems

Acknowledgements. The research reported in this paper has been supported by the project Labex MS2T, financed by the French Government through the program Investments for the future managed by the National Agency for Research (Reference ANR-11-IDEX-0004-02).