title: An understandable way to discover methods to model interval input–output samples
authors: Xu, Shan; Li, Shenggang; Liu, Heng; Garg, Harish; Jin, Xuqin; Zhao, Jingjing
date: 2021-07-26 journal: Comp DOI: 10.1007/s40314-021-01561-z

This paper shows an application of plausible reasoning methods (mainly specialization and analogy) in mathematical modeling. Our focus is on how a practitioner can analogously determine a more suitable scientific model to serve the goal at hand while solving the whole problem. Taking the modeling of interval samples as our problem, we exemplify (with attention paid to the motivation and to the course of discovery) how to discover, based on the corresponding classical methods, three linear regression models and two linear-like interpolation models relying on n-variable interval input, 1-variable interval output samples. The rationality of the recommended models is proved, and their applications are illustrated in detail by examples. Strategies for modeling further interval samples satisfactorily are also presented.

An interval number (Moore 1966) a is a pair [ȧ, ä] ∈ R² with ȧ ≤ ä, where R is the set of all real numbers with the ordinary order. In this paper, we write [a, a] as a and identify it with the corresponding real number a (therefore, interval numbers can be regarded as an extension of real numbers), and we call ⟨[ȧ, ä]⟩_c = 0.5(ȧ + ä) (resp., ⟨[ȧ, ä]⟩_r = 0.5(ä − ȧ)) the center or midpoint (resp., the radius) of the interval number [ȧ, ä] (thus [ȧ, ä] can also be written as ⟨⟨[ȧ, ä]⟩_c, ⟨[ȧ, ä]⟩_r⟩ or ⟨[ȧ, ä]⟩_c ± ⟨[ȧ, ä]⟩_r). We use IR to denote the set of all interval numbers. We define a ∨ b = max{a, b}, a ∧ b = min{a, b}, and ⟨a, b⟩ = [a ∧ b, a ∨ b] for any a, b ∈ R.
The four elementary operations on R have also been extended to IR (where a = [ȧ, ä], b = [ḃ, b̈] ∈ IR; see Moore 1966): [ȧ, ä] ⊕ [ḃ, b̈] = [ȧ + ḃ, ä + b̈]; [ȧ, ä] ⊖ [ḃ, b̈] = [ȧ − b̈, ä − ḃ]; [ȧ, ä] ⊗ [ḃ, b̈] = [ȧḃ ∧ ȧb̈ ∧ äḃ ∧ äb̈, ȧḃ ∨ ȧb̈ ∨ äḃ ∨ äb̈]; [ȧ, ä] ÷ [ḃ, b̈] = [ȧ, ä] ⊗ [1/b̈, 1/ḃ] (0 ∉ [ḃ, b̈]). Moreover, a total order (see Hu and Wang 2006; Xu and Yager 2006) ≤ can be defined on IR by putting [ȧ, ä] ≤ [ḃ, b̈] ⟺ ⟨[ȧ, ä]⟩_c < ⟨[ḃ, b̈]⟩_c, or ⟨[ȧ, ä]⟩_c = ⟨[ḃ, b̈]⟩_c but ä − ȧ ≥ b̈ − ḃ.

Interval data arise in many situations (such as managing rounding errors in numerical analysis, computer-assisted proofs, global optimization, and, in particular, modeling uncertainty) because the data involved cannot be expressed exactly by real numbers but can be captured by interval numbers. For instance, consider the following problems with interval numbers as input and output (briefly, interval input-output) sample sets [for more details, see also Table 4 in Inuiguchi and Mizoshita (2012)]:

(I) A pattern recognition problem involving interval samples. Astragali Radix is a plant of both medicinal and edible use that can regulate the body's immune function and benefit critically ill patients. Usually, Astragali Radixes are classified (based on test and measurement data) into 5 grades: 1 (the lowest grade), ..., 5 (the highest grade). Table 1 (taken from Zhang et al. 2020) presents some samples, where x_1 stands for the length (cm) of an Astragali Radix, x_2 for its head diameter (cm), x_3 for its tail diameter (cm), and y for its grade. For a given Astragali Radix t_0 = (35.8 ± 5.4, 1.7 ± 0.2, 0.8 ± 0.1) (whose grade cannot be determined immediately using the Pharmacopoeia of the People's Republic of China 1992), try to determine its grade.
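The interval operations and the center-based total order recalled above can be sketched in code. This is a minimal illustration following Moore's standard definitions; the class and function names are ours, not the paper's.

```python
# Minimal sketch of Moore-style interval arithmetic and the center-based
# total order recalled above. Class and helper names are our own.

class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    @property
    def center(self):               # 0.5 * (lo + hi)
        return 0.5 * (self.lo + self.hi)

    @property
    def radius(self):               # 0.5 * (hi - lo)
        return 0.5 * (self.hi - self.lo)

    def __add__(self, b):           # [lo + lo', hi + hi']
        return Interval(self.lo + b.lo, self.hi + b.hi)

    def __sub__(self, b):           # [lo - hi', hi - lo']
        return Interval(self.lo - b.hi, self.hi - b.lo)

    def __mul__(self, b):           # min/max of the four endpoint products
        p = (self.lo * b.lo, self.lo * b.hi, self.hi * b.lo, self.hi * b.hi)
        return Interval(min(p), max(p))

def leq(a, b):
    """Total order: compare centers first; ties broken by reversed width."""
    if a.center != b.center:
        return a.center < b.center
    return (a.hi - a.lo) >= (b.hi - b.lo)
```

For example, [1, 3] ⊕ [2, 5] = [3, 8] and [1, 3] ⊖ [2, 5] = [−4, 1] under these definitions.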
(II) A control problem involving interval samples: dx/dt = (f_1(x), f_2(x))^T + u(t), where x(t) = (x_1(t), x_2(t))^T ∈ U (a compact set in R²) represents the state at time t, each f_i(x) is a continuous function of two variables on U which is only observable (i.e., for each x we can only obtain an approximate value, in general an interval number), and u is the controller to be designed.

Just as one rounds a decimal up or down to an integer, practitioners can replace each interval number a in a sample set by its center ⟨a⟩_c and process the new sample with some known method. However, this is an unsatisfactory approach due to the loss of information. Another way to deal with such a practical problem is to use a real input-interval output model (i.e., real number input-interval number output, also called crisp input-interval output); for more details, see Hwang et al. (2006), Ishibuchi and Tanaka (1990), Jeng et al. (2003), Lee and Tanaka (1999), and the following Example 1.1 and Remark 1.2.

Table 1 (samples of Astragali Radix; columns x_1, x_2, x_3, y):
(…, 1.2 ± 0.1, 0.9 ± 0.1, 2 ± 0.1); (29.2 ± 3.8, 1.0 ± 0.1, 0.8 ± 0.1, 1 ± 0.1); (36.5 ± 9.0, 1.3 ± 0.2, 0.9 ± 0.1, 4 ± 0.1); (32.4 ± 5.5, 1.0 ± 0.1, 0.8 ± 0.1, 3 ± 0.1); (30.4 ± 5.4, 0.9 ± 0.1, 0.7 ± 0.1, 2 ± 0.1); (32.9 ± 6.1, 0.8 ± 0.1, 0.6 ± 0.1, 1 ± 0.1)

Example 1.1 Consider a real input-output (i.e., real number input-real number output) sample (an n-element set) S = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} from a continuous function f(x). Without loss of generality, assume S = {(−1, 0), (0, 1), (1, 0)} (a 3-element set). Firstly, we get the linear interpolation f_{1,3,0}(x). Secondly, we get the first real input-interval output model f_{1,3,ε}(x) = [f⁻_{1,3,ε}(x), f⁺_{1,3,ε}(x)] = f_{1,3,0}(x) ± ε (ε > 0). Similarly, we get f_{1,n,0}(x) and f_{1,n,ε}(x) = [f⁻_{1,n,ε}(x), f⁺_{1,n,ε}(x)] = f_{1,n,0}(x) ± ε (n > 3, ε > 0). Noticing f_{1,n,0}(x_i) ∈ f_{1,n,ε}(x_i) (i = 1, 2, ..., n), we have lim_{ε→0} f_{1,n,ε}(x_i) = [f_{1,n,0}(x_i), f_{1,n,0}(x_i)] = f_{1,n,0}(x_i) (i = 1, 2, ..., n), and lim_{n→+∞} f_{1,n,ε}(x) = [f_{1,n,0}(x), f_{1,n,0}(x)] = f_{1,n,0}(x) (x ∈ [a, b]) if a = x_1 < x_2 < ··· < x_n = b is an equally spaced partition. Thirdly, we get the second real input-interval output model.

Remark 1.2 An interval linear regression model for a real input-output sample can also be written as y = A_0 ⊕ A_1x_1 ⊕ ··· ⊕ A_nx_n, where A_i = ⟨a_i, c_i⟩ (i = 0, 1, ..., n) are the optimal interval coefficients. Moreover, the optimal interval coefficients form a solution of a quadratic programming problem [see Chukhrova and Johannssen (2019), Tanaka and Lee (1998)], where a = (a_0, a_1, ..., a_n)^T, c = (c_0, c_1, ..., c_n)^T, c^T is the transpose of c, (x_j^T, y_j) = (x_{j1}, x_{j2}, ..., x_{jn}; y_j) is the j-th sample, |x_j|^T = (|x_{j1}|, |x_{j2}|, ..., |x_{jn}|), and ξ > 0 (very small).

Thus, the main concern of this paper is how practitioners can model interval input-output samples. There is some work available in the literature on this topic (we refer to Boukezzoula et al. 2011, 2018, 2020; Chuang 2008; Hladík and Černý 2012); we extend the method presented by Hladík and Černý (2012) because readers can probably see from it not only the motivation behind the proposed method but also the course along which the method was formed [see Polya's famous book (Polya 1954) for more in this direction]. The present paper is a sequel of these works, exemplifying, in the light of Polya's idea, how to restore or approximate the true function (in the form of an interval input-output function) from interval input-output samples based on some commonly used methods. For problem (I), we first specialize the original problem by assuming that each interval datum has radius 0, thus getting a new sample set, and find a solution: determine the grade of t_0 using the classical linear regression function obtained from the new sample set. Then we investigate the generalization of this strategy in the case of the interval sample.
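The specialization step just described for problem (I), i.e., replacing every interval datum by its center and then applying a classical method, can be sketched for a single input variable as follows. This is our own illustration (helper names `center`, `fit_line_on_centers` are not from the paper), using the closed-form simple least squares fit.

```python
# Sketch of the specialization strategy: map each interval to its center,
# then fit a classical least-squares line on the resulting real sample.
# Intervals are modeled as (lo, hi) pairs; names are our own.

def center(iv):
    lo, hi = iv
    return 0.5 * (lo + hi)

def fit_line_on_centers(sample):
    """sample: list of (x_interval, y_interval); returns (a0, a1)."""
    xs = [center(x) for x, _ in sample]
    ys = [center(y) for _, y in sample]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
         sum((x - mx) ** 2 for x in xs)
    return my - a1 * mx, a1          # intercept a0, slope a1
```

The generalizations developed in Sect. 2 replace this single center fit by several fits (on centers, lower endpoints, upper endpoints, and radii).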
Analogously, for problem (II), we first use an interval input-output function (which can be written down explicitly) to approximate f (by observing the analogy between the case of an interval input-output sample and that of a real input-output sample) and then consider establishing the new system. To make our discussion precise, let us recall the three most commonly used metrics on the n-dimensional Euclidean space R^n (n ≥ 1), defined, for a = (a_1, ..., a_n) and b = (b_1, ..., b_n), by ρ̄_n(a, b) = max{|a_1 − b_1|, ..., |a_n − b_n|} (called the Chebyshev metric), ρ_n(a, b) = ((a_1 − b_1)² + ··· + (a_n − b_n)²)^{1/2} (called the Euclidean metric), and ρ̂_n(a, b) = |a_1 − b_1| + ··· + |a_n − b_n| (called the Manhattan metric or the city block metric).

The rest of the paper consists of four sections. Section 2 exemplifies how to discover, based on classical linear regression, methods to model interval input-output samples. Section 3 investigates, based on classical linear interpolation, the same problem as in Sect. 2. Section 4 exemplifies applications of the proposed models. Conclusions, discussion, and strategies for further generalization are given in Sect. 5.

In this section, the following interval input-output sample set, consisting of (n+1)-dimensional row vectors of interval numbers (each interval number containing the true datum), will be considered:

S = {(x_{1,1}, x_{2,1}, ..., x_{n,1}; y_1), (x_{1,2}, x_{2,2}, ..., x_{n,2}; y_2), ..., (x_{1,m}, x_{2,m}, ..., x_{n,m}; y_m)}.  (2)

We first present three linear regression models, each stemming from the corresponding linear regression model for real input-output samples. Then we prove that each model is a universal approximator (Ying 2015) of the corresponding linear regression model based on the true sample.

Remark 2.1 (1) If m = n + 1 = 2, (x_{1,1}, y_1) = (x_1, y_1), and (x_{1,2}, y_2) = (x_2, y_2) (x_1 ≠ x_2), then the linear regression function based on S is the classical one, f(x) = a_0 + a_1x with a_1 = (y_1 − y_2)/(x_1 − x_2) and a_0 = y_1 − a_1x_1.

(2) If m = n + 1 = 2, (x_{1,1}, y_1) = (⟨x_1, δ⟩, ⟨y_1, δ⟩), and (x_{1,2}, y_2) = (⟨x_2, δ⟩, ⟨y_2, δ⟩) (x_1 ≠ x_2, δ > 0), it is most natural for us to get two functions: one is the classical linear regression function based on the real input-output sample set {(x_1 − δ, y_1 − δ), (x_2 − δ, y_2 − δ)}, and the other is the classical linear regression function based on the real input-output sample set {(x_1 + δ, y_1 + δ), (x_2 + δ, y_2 + δ)}. It is then natural to take as the linear regression function based on the above interval input-output sample set S the function f̌_δ whose value at x = [ẋ, ẍ] ∈ IR is the interval spanned by these two classical fits evaluated at ẋ and ẍ, where a_1 = (y_1 − y_2)/(x_1 − x_2). It can easily be seen that both candidates converge to the classical fit as δ → 0. This confirms the rationality of taking the linear regression function based on the above interval input-output sample set S to be f̌_δ or f̂_δ.

(3) Notice that, in practical problems, the radii of the interval numbers in a data set are in general not all equal. So we should consider a slightly bigger generalization of (2): m = n + 1 = 2, (x_{1,1}, y_1) = (⟨x_1, δ_{1,1}⟩, ⟨y_1, δ_{1,2}⟩), and (x_{1,2}, y_2) = (⟨x_2, δ_{2,1}⟩, ⟨y_2, δ_{2,2}⟩) (x_1 ≠ x_2, δ_{i,j} > 0, i, j = 1, 2). Analogously to (2), we can get the linear regression function f̌_δ whose lower part is the classical linear regression function based on the real input-output sample set of lower endpoints and whose upper part is the classical linear regression function based on the real input-output sample set of upper endpoints. This further supports taking the linear regression function based on the above interval input-output sample set S to be f̌_δ or f̂_δ.

(4) If δ_{1,1} = δ_{1,2} = δ_{2,1} = δ_{2,2} does not hold, then a_1 = (y_1 − y_2)/(x_1 − x_2) need not hold (as can be seen from (1)-(3)), and the output should be f_c(x) + r, where r should be a continuous function of the variables δ_{1,1}, δ_{1,2}, δ_{2,1}, and δ_{2,2} satisfying r(0, 0, 0, 0) = 0. Thus we can take r to be the second-kind linear regression function (i.e., the linear regression function having no constant term and obtained by the classical least squares estimation method) r(δ) = ((δ_{1,1}δ_{1,2} + δ_{2,1}δ_{2,2})/(δ²_{1,1} + δ²_{2,1}))δ based on the real input-output sample set {(δ_{1,1}, δ_{1,2}), (δ_{2,1}, δ_{2,2})}.
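The second-kind linear regression function used in (4), i.e., least squares with no constant term, reduces for one input variable to a single normal equation. A small sketch (the function name is ours, not the paper's):

```python
# Least squares fit of y ~ beta * x with no constant term: the single normal
# equation gives beta = sum(x*y) / sum(x*x). For n input variables one would
# solve the full normal equations instead. A sketch, not the authors' code.

def no_intercept_slope(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

Applied to the two radius pairs (δ_{1,1}, δ_{1,2}) and (δ_{2,1}, δ_{2,2}), this reproduces the coefficient (δ_{1,1}δ_{1,2} + δ_{2,1}δ_{2,2})/(δ²_{1,1} + δ²_{2,1}) stated above.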
Therefore, we can also take the linear regression function based on the interval input-output sample set S to be f̃_δ.

Example 2.2 Consider the interval input-output sample set in Table 2. (1) Similar to Remark 2.1(2) and Remark 2.1(3), we get f_c(x) = 111.6918 + 0.0143x_1 + 7.1882x_2 (i.e., the classical linear regression function, or first-kind linear regression function, based on the sample set {(x^c_1, x^c_2, y^c) | (x_1, x_2, y) ∈ S}). We chose, by the classical least squares estimation method, a linear regression function r(δ_1, δ_2) = 0.1δ_1 + 4.4δ_2 from the set of all 2-variable linear functions without constant term (i.e., the second-kind linear regression function).³ As a result, we obtain the third model f̃(x) = f_c(x^c) ± r(x^r).

(2) Assume s_0 = ⟨1000, 600, 1200, 500, 300, 400, 1300, 1100, 1300, 300; 5, 7, 6, 6, 8, 7, 5, 4, 3, 9; 100, 75, 80, 70, 50, 65, 90, 100, 110, 60⟩ ∈ x_{1,1} × x_{1,2} × ··· × x_{1,10} × x_{2,1} × x_{2,2} × ··· × x_{2,10} × y_1 × y_2 × ··· × y_{10} is the true data. Then f(x) = 111.6918 + 0.0143x_1 + 7.1882x_2 is the linear regression function based on the real input-real output data set s_0. For each s = ⟨1000 + a, 600 + a, 1200 + a, 500 + a, 300 + a, 400 + a, 1300 + a, 1100 + a, 1300 + a, 300 + a; ...⟩, compute the distance ρ((111.6918, 0.0143, 7.1882), (β_a, β_b, β_c)). Then we have the computing results shown in Table 3 (comparison of distances between f and f_{abc} under the three commonly used metrics).

Supported by Remark 2.1 and Example 2.2, we have reason to propose the following Algorithm 2.3.

³ Similar to obtaining a linear regression function r̄(x) = β̄_0 + β̄_1x_1 + β̄_2x_2 + ··· + β̄_nx_n (by the classical least squares estimation method) from the set of all n-variable linear functions, we can also obtain a linear regression function r(x) = r(x_1, x_2, ..., x_n) = β_1x_1 + β_2x_2 + ··· + β_nx_n from the set of all n-variable linear functions without constant term by the classical least squares estimation method, i.e., by solving the linear system of equations ∂e/∂β_i = 0 (i = 1, 2, ..., n), where {(x_{1,1}, ..., x_{1,m}), ..., (x_{n,1}, ..., x_{n,m}), (y_1, ..., y_m)} is the real input-real output data set (having n + 1 elements and) consisting of m-dimensional row vectors of real numbers. Generally speaking, β_1 ≠ β̄_1, β_2 ≠ β̄_2, ..., β_n ≠ β̄_n. Explicit formulae for the β_i can easily be obtained from this linear system when n = 1 and when n = 2.

Algorithm 2.3 The linear regression functions based on the sample set (2) can be given in the following ways:

(1) Step 1 Compute the ordinary linear regression function f̲(t) = f̲(t_1, ..., t_n) = a̲_0 + a̲_1t_1 + ··· + a̲_nt_n (∀t ∈ R^n), based on the real input-output sample set S̲ = {(ẋ_{1,1}, ẋ_{2,1}, ..., ẋ_{n,1}; ẏ_1), (ẋ_{1,2}, ẋ_{2,2}, ..., ẋ_{n,2}; ẏ_2), ..., (ẋ_{1,m}, ẋ_{2,m}, ..., ẋ_{n,m}; ẏ_m)}, using the classical method.

Step 2 Compute the ordinary linear regression function f̄(t) = f̄(t_1, ..., t_n) = ā_0 + ā_1t_1 + ··· + ā_nt_n (∀t ∈ R^n), based on the real input-output sample set S̄ = {(ẍ_{1,1}, ẍ_{2,1}, ..., ẍ_{n,1}; ÿ_1), (ẍ_{1,2}, ẍ_{2,2}, ..., ẍ_{n,2}; ÿ_2), ..., (ẍ_{1,m}, ẍ_{2,m}, ..., ẍ_{n,m}; ÿ_m)}, using the classical method.

Step 3 Obtain the first linear regression function f̌(x) = a_0 ⊕ a_1x_1 ⊕ ··· ⊕ a_nx_n, with the interval coefficients a_0 = ⟨a̲_0, ā_0⟩, ..., a_n = ⟨a̲_n, ā_n⟩.

Step 4 Obtain the second linear regression function f̂(x) = [f̲(ẋ_1, ..., ẋ_n), f̄(ẍ_1, ..., ẍ_n)].

(2) Step 1 Compute the ordinary linear regression function f_c(t) = f_c(t_1, ..., t_n), based on the real input-output sample set of centers {(x^c_{1,j}, ..., x^c_{n,j}; y^c_j) | j = 1, ..., m}, using the classical method.

Step 2 Compute the ordinary linear regression function r(δ) = r(δ_1, ..., δ_n) = b_1δ_1 + ··· + b_nδ_n (∀δ ∈ R^n), based on the real input-output sample set of radii {(x^r_{1,j}, ..., x^r_{n,j}; y^r_j) | j = 1, ..., m}. Then f̃(x) = ⟨f_c(x^c_1, ..., x^c_n), r(x^r_1, ..., x^r_n)⟩ is the third linear regression function.

The rationality of Algorithm 2.3 is guaranteed by the following theorem.

Theorem 2.4 Let s = ((x_{1,1}, x_{2,1}, ..., x_{n,1}; y_1), (x_{1,2}, x_{2,2}, ..., x_{n,2}; y_2), ..., (x_{1,m}, x_{2,m}, ..., x_{n,m}; y_m)) be a real input-output datum from S = (x_{1,1} × x_{2,1} × ··· × x_{n,1} × y_1) × (x_{1,2} × x_{2,2} × ··· × x_{n,2} × y_2) × ··· × (x_{1,m} × x_{2,m} × ··· × x_{n,m} × y_m) (which is a rearrangement of the sample set in equality (2)), and let f_s(x) = β_0(s) + β_1(s)x_1 + ··· + β_n(s)x_n be the linear regression function based on the real input-output sample set s. Then:

(1) β_0(s), β_1(s), ..., β_n(s) are continuous functions from (R^{2m(n+1)}, ρ) to (R, ρ_1) (ρ ∈ {ρ̄_{2m(n+1)}, ρ_{2m(n+1)}, ρ̂_{2m(n+1)}}).

(2) The mapping g : s ↦ (β_0(s), β_1(s), ..., β_n(s)) is continuous.

(3) Let f(x) = β_0 + β_1x_1 + ··· + β_nx_n be the linear regression function based on the true sample t = ((x⁰_{1,1}, x⁰_{2,1}, ..., x⁰_{n,1}; y⁰_1), (x⁰_{1,2}, x⁰_{2,2}, ..., x⁰_{n,2}; y⁰_2), ..., (x⁰_{1,m}, x⁰_{2,m}, ..., x⁰_{n,m}; y⁰_m)) in S. Then, for each ε > 0, there exist a δ = (δ_0, δ_1, δ_2, ..., δ_n)^T > 0 (i.e., δ_0 > 0, δ_1 > 0, δ_2 > 0, ..., δ_n > 0) and a δ > 0 such that (where ρ ∈ {ρ̄_{n+1}, ρ_{n+1}, ρ̂_{n+1}}):

i) ρ(g(s), (β_0, β_1, ..., β_n)) < ε if S is a (δ_0, δ_1, ..., δ_n)-sample set, i.e., it satisfies max{y^r_1, y^r_2, ..., y^r_m} ≤ δ_0, max{x^r_{1,1}, x^r_{1,2}, ..., x^r_{1,m}} ≤ δ_1, max{x^r_{2,1}, ..., x^r_{2,m}} ≤ δ_2, ..., max{x^r_{n,1}, x^r_{n,2}, ..., x^r_{n,m}} ≤ δ_n.

ii) ρ(g(s), (β_0, β_1, ..., β_n)) < ε if S is a δ-sample set, i.e., it satisfies max{y^r_1, y^r_2, ..., y^r_m; x^r_{1,1}, x^r_{1,2}, ..., x^r_{1,m}; x^r_{2,1}, x^r_{2,2}, ..., x^r_{2,m}; ...; x^r_{n,1}, x^r_{n,2}, ..., x^r_{n,m}} ≤ δ.

(4) f̌ is a universal approximator of f, i.e., for each compact set U ⊆ R^{n+1} and each ε > 0, there exists a δ(U, ε) > 0 such that sup_{x∈U} ρ(f̌(x), f(x)) < ε for each a = (a_0, a_1, a_2, ..., a_n) ∈ a_0 × a_1 × a_2 × ··· × a_n and each δ(U, ε)-sample set S, where f̌ is the first linear regression function based on S.

(5) f̂ is a universal approximator of f, i.e., for each compact set U ⊆ R^{n+1} and each ε > 0, there exists a δ(U, ε) > 0 such that the analogous bound holds for the second linear regression function based on S.

(6) f̃ is a universal approximator of f, analogously, for the third linear regression function based on S.

Proof Step 1 For two k-variable polynomials P(x) and Q(x), the rational function P(x)/Q(x) is continuous on R^k − E_Q, where E_Q is the zero-point set of Q(x) (and thus a finite set). Moreover, the inequalities ρ̄_{k+1}(x, y) ≤ ρ_{k+1}(x, y) ≤ ρ̂_{k+1}(x, y) hold for any {x, y} ⊆ R^{k+1}.

Step 2 As ρ̄_{2m(n+1)}, ρ_{2m(n+1)}, and ρ̂_{2m(n+1)} induce the same topology (i.e., the Euclidean topology) on R^{2m(n+1)}, it can easily be seen from the computing formulae of classical linear regression and Step 1 that (1) is true. By (1), p_i ∘ g(s) = β_{i−1}(s) : (R^{2m(n+1)}, ρ_{2m(n+1)}) → (R, ρ_1) is a continuous function (where p_i : R^{n+1} → R is the i-th projection, i = 1, 2, ..., n + 1). Thus g is a continuous function, i.e., (2) is true (and consequently (3) is true). (4)-(6) follow from (3). □

Example 2.6 Consider the special interval input-output sample set S = {(x_{1,i}, x_{2,i}, y_i) | i = 1, 2, 3, 4, 5} in Table 5 (where δ_2 ≡ 2, r ≡ 5, and δ_1 is almost equal to 10). Then f̲(t) = 111.7393 + 0.0124t_1 − 11.4787t_2, f_c(t) = f_c(t_1, t_2) = 139.57317 + 0.0124t_1 − 11.4787t_2, and f̄(t) = f̄(t_1, t_2) = 167.407 + 0.0124t_1 − 11.4787t_2 (∀t = (t_1, t_2) ∈ R²). By Algorithm 2.3, f̂(t) = [111.7393 + 0.0124ṫ_1 − 11.4787ṫ_2, 167.407 + 0.0124ẗ_1 − 11.4787ẗ_2] (∀t = (⟨t_1, 10⟩, ⟨t_2, 2⟩) ∈ IR²). As r(δ_1, δ_2) = 2.5δ_2 = 5 (if δ_2 = 2), f̃(t) = ⟨0.0124t_1 − 11.4787t_2 + 139.5732, 5⟩ = f_c(t) ± 5 = f_c(t_1, t_2) ± 5 (t = (⟨t_1, 10⟩, ⟨t_2, 2⟩) ∈ IR²). This motivates the following easy-to-use corollary.

Corollary 2.7 For a special sample (or data set) S = ((⟨x_{1,1}, ε_1⟩, ⟨x_{2,1}, ε_2⟩, ..., ⟨x_{n,1}, ε_n⟩; ⟨y_1, ε_0⟩), (⟨x_{1,2}, ε_1⟩, ⟨x_{2,2}, ε_2⟩, ..., ⟨x_{n,2}, ε_n⟩; ⟨y_2, ε_0⟩), ..., (⟨x_{1,m}, ε_1⟩, ⟨x_{2,m}, ε_2⟩, ..., ⟨x_{n,m}, ε_n⟩; ⟨y_m, ε_0⟩)), the first two linear regression functions in Algorithm 2.3 are the same: each equals exactly ⟨f_c(t_1, ..., t_n), |o|⟩ = f_c(t_1, ..., t_n) ± |o|, which can be regarded as f̃(t) (because r(δ) in Algorithm 2.3 has infinitely many choices, including |o|), where t = (t_1, t_2, ..., t_n) and o = ε_0 − a_1ε_1 − a_2ε_2 − ··· − a_nε_n.

Proof For two column vectors x = (x_1, x_2, ..., x_m)^T and z = (z_1, z_2, ..., z_m)^T, we write x̄ = (1/m)(x_1 + x_2 + ··· + x_m) and x^T z = x_1z_1 + x_2z_2 + ··· + x_mz_m. Using x_1 (resp., x_2, ..., x_n, y) to denote the column vector (x_{1,1}, x_{1,2}, ..., x_{1,m})^T (resp., (x_{2,1}, x_{2,2}, ..., x_{2,m})^T, ..., (x_{n,1}, x_{n,2}, ..., x_{n,m})^T, (y_1, y_2, ..., y_m)^T) and ε to denote a column vector with constant coordinates ε, we get the same normal equations for the lower-endpoint and upper-endpoint samples (j = 1, 2, ..., n). This implies a̲_i = ā_i (i = 1, 2, ..., n), and the intercepts a̲_0, ā_0 differ from the intercept of f_c exactly by the shift o. □

In practice we can ask people to collect samples like that in Corollary 2.7 so as to use the simplified models there.

In this section, we first present two linear-like interpolation models for interval input-output samples, each stemming from the classical linear interpolation model for real input-output samples. Then we prove that each model is a universal approximator of the corresponding linear interpolation model.

Example 3.1 (1) The piecewise-linear functions l(x^{(k)})(x) determined by consecutive interpolation nodes are called basis functions relying on S. (2) Suppose S = S_1 × S_2 × ··· × S_n with S_i = {x_i^{(1)} < x_i^{(2)} < ··· < x_i^{(m_i)}} (i = 1, 2, ..., n) and m = min{m_1, m_2, ..., m_n} > 3. Then the n-variable linear-like interpolation function f̌(x_1, ..., x_n) is built from products l(x_1^{(k_1)})(x_1)l(x_2^{(k_2)})(x_2) ··· l(x_n^{(k_n)})(x_n), where l(x_1^{(k_1)})(x_1) is a basis function relying on the sample set S_1, l(x_2^{(k_2)})(x_2) is a basis function relying on the sample set S_2, ..., and l(x_n^{(k_n)})(x_n) is a basis function relying on the sample set S_n.

Motivated by Example 3.1, we have (as in Sect. 2) the following Algorithm 3.2.

Algorithm 3.2 Consider an interval-number sample set S (from an n-variable function f). The linear-like interpolation functions based on it can be given in the following ways:

(1) Step 1 Compute the n-variable linear-like interpolation function f̲(t) = f̲(t_1, ..., t_n), based on the real input-output sample set {((ẋ_1, ..., ẋ_n), p_1(f(x_0))) | x_0 = (x_1, ..., x_n) ∈ S}, using the method given in Example 3.1(2). Step 2 Similarly, compute the n-variable linear-like interpolation function f̄(t) = f̄(t_1, ..., t_n); the remaining steps parallel Algorithm 2.3. Here f̌_s(t_1, ..., t_n) denotes the n-variable linear-like interpolation function based on a real input-output sample set s, obtained by the method given in Example 3.1(2).

The rationality of Algorithm 3.2 is guaranteed by Theorem 3.3.

Proof Similar to that of Theorem 2.4. □
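The endpoint-wise construction shared by Algorithm 2.3 and Algorithm 3.2 (fit one classical object on the lower endpoints, one on the upper endpoints, then pair the results) can be illustrated for single-input regression. A hedged sketch with our own helper names (`ols`, `interval_regression`), which also exhibits the equal-radii phenomenon of Corollary 2.7 (the two fits share the same slope):

```python
# Endpoint-wise interval regression, single input variable: fit a classical
# least-squares line on lower endpoints and another on upper endpoints, then
# pair corresponding coefficients into interval coefficients. Names are ours.

def ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
         sum((x - mx) ** 2 for x in xs)
    return my - b1 * mx, b1

def interval_regression(sample):
    """sample: list of ((x_lo, x_hi), (y_lo, y_hi)); returns interval coeffs."""
    a0_lo, a1_lo = ols([x[0] for x, _ in sample], [y[0] for _, y in sample])
    a0_hi, a1_hi = ols([x[1] for x, _ in sample], [y[1] for _, y in sample])
    # interval coefficient a_i = <lower fit, upper fit> = [min, max]
    return ((min(a0_lo, a0_hi), max(a0_lo, a0_hi)),
            (min(a1_lo, a1_hi), max(a1_lo, a1_hi)))
```

With equal radii in each variable (as in Corollary 2.7), the slope interval degenerates to a point and only the intercept interval carries the width.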
In practical problems, if the set S of all interpolation nodes (related to the n-variable function f) is not an n-dimensional cube-like set, then we consider the new set S⁻ of all interpolation nodes (related to an n-variable function f⁻ which is an extension of f) and give the two n-variable linear-like interpolation functions (f̌⁻ and f̂⁻) of f⁻ using Algorithm 3.2 (and thus Theorem 3.3 holds for f⁻, and in particular for f), where S⁻ = S_1 × S_2 × ··· × S_n, p_i (i = 1, 2, ..., n) is the i-th projection, and the values of f⁻(s) (s ∈ S⁻ − S) will be given by experts using the arithmetic average operator, a weighted average operator, or other aggregation operators (Beliakov et al. 2007; Cubillo et al. 2015; Deschrijver and Kerre 2008).

(2) Consider the special interval input-output sample set S = {(x_{1,i}, x_{2,j}, y_{i,j}) | i, j = 1, 2, 3} shown in Table 6 (where δ_1 ≡ 10, δ_2 ≡ 0.1, and r takes almost the same value 1).

Step 1 Compute the basis functions l(x^{(k)}) and thence f_c piecewise. For instance, f_c(t) = 60(2 − t_1/300)(7 − t_2) + 70(2 − t_1/300)(t_2 − 6) + 70(t_1/300 − 1)(7 − t_2) + 80(t_1/300 − 1)(t_2 − 6) = (1/30)t_1 + 10t_2 − 10 if t ∈ [300, 600) × [6, 7), and f_c(t) = 60(3 − t_1/300)(6 − t_2) + 70(3 − t_1/300)(t_2 − 5) + 50(t_1/300 − 2)(6 − t_2) + 80(t_1/300 − 2)(t_2 − 5) on the adjacent cell. In a word,

f_c(t) =
  …, t ∈ [300, 600) × [5, 7),
  (1/30)t_1 + 10t_2 − 10, t ∈ [600, 900) × [5, 6),
  (1/30)t_1t_2 − (1/6)t_1 − 10t_2 + 110, t ∈ [600, 900) × [6, 7),
  −(7/30)t_1t_2 + (14/15)t_1 + 280t_2 − 1120, t ∈ [900, 1200) × [4, 5),
  −(1/30)t_1t_2 − (1/15)t_1 + 40t_2 + 80, t ∈ [900, 1200) × [5, 6),
  −(1/15)t_1t_2 + (2/15)t_1 + 80t_2 − 160, t ∈ [900, 1200) × [6, 7),
  (1/3)t_1t_2 − …

Step 2 Compute the basis functions on the lower endpoints and thence f̲:

f̲(t) =
  …, t ∈ [290, 590) × [4.9, 6.9),
  −(1/3000)t_1t_2 + (1049/30000)t_1 + (3059/300)t_2 − 31891/3000, t ∈ [590, 890) × [4.9, 5.9),
  (101/3000)t_1t_2 − (4969/30000)t_1 − (2959/300)t_2 + 32317/3000, t ∈ [590, 890) × [5.9, 6.9),
  (49/300)t_1t_2 − (637/1000)t_1 + (49/30)t_2 − 637/100, t ∈ [0, 290) × [3.9, 4.9),
  (1/30)t_1t_2 + (1/3)t_2, t ∈ [0, 290) × [4.9, 6.9),
  −(23/100)t_1t_2 + (1817/1000)t_1 − (23/10)t_2 + 1817/100, t ∈ [0, 290) × [6.9, 7.9],
  −(23/100)t_1t_2 + (897/1000)t_1 + (2737/10)t_2 − 106743/100, t ∈ [890, 1190) × [3.9, 4.9),
  −(33/1000)t_1t_2 − (683/10000)t_1 + (3927/100)t_2 + 81277/1000, t ∈ [890, 1190) × [4.9, 5.9),
  −(67/1000)t_1t_2 + (1323/10000)t_1 + (7973/100)t_2 − 157437/1000, t ∈ [890, 1190) × [5.9, 6.9),
  (33/100)t_1t_2 − (2607/1000)t_1 − (3927/10)t_2 + 310233/100, t ∈ [890, 1190) × [6.9, 7.9],
  0, otherwise.

Step 3 Compute f̄ analogously from the upper endpoints.

Step 4 f̂(t) = [f̲(ṫ_1, ṫ_2), f̄(ẗ_1, ẗ_2)] (∀t = (t_1, t_2) ∈ IR²). Using the classical least squares estimation method, we can obtain a linear regression function r(ε) = r(ε_1, ε_2) (it has no constant term) based on the sample set {(10, 0.1, 1) (7 times), (10, 0.1, 1.1), (10, 0.1, 1)}.

Step 5 If the true sample set S' = {(x'_{1,i}, x'_{2,j}, y'_{i,j}) | i, j = 1, 2, 3} consists of the centers of the interval numbers in S, then the computed results and the true sample are listed in Table 7, where y^c_{i,j} = f_c(x_{1,i}, x_{2,j}), y⁻_{i,j} = f̲(x_{1,i} − 10⁻², x_{2,j} − 10⁻⁵), and y⁺_{i,j} = f̄(x_{1,i} + 10⁻², x_{2,j} + 10⁻⁵).

Example 3.6 Consider the special interval input-output sample set S = {(300±10, 5±2, 50±5), (300±10, 7±2, 70±5), (600±10, 6±2, 70±5), (900±10, 5±2, 70±5), (900±10, 7±2, 90±5)} (from a function f) in Table 8 and its completion Table 9, where the unknown samples are determined using the arithmetic average operator (of course, we could also use another appropriate aggregation operator, a t-norm, a t-conorm, etc.): * = 0.5(⟨50, 1⟩ ⊕ ⟨70, 1⟩) = ⟨60, 1⟩, ** = 0.5(⟨50, 1⟩ ⊕ ⟨70, 1⟩) = ⟨60, 1⟩, and the two remaining marked entries equal 0.5(⟨70, 1⟩ ⊕ ⟨90, 1⟩) = ⟨80, 1⟩.

Let f_c, f̲, f̄, f̌, and f̃ be as in Example 3.5(2); then

f_c(t) =
  …, t ∈ [600, 900) × [5, 7),
  −(1/30)t_1t_2 + (4/15)t_1 − 60t_2 + 480, t ∈ [600, 900) × [7, 8],
  −(1/30)t_1t_2 − (1/15)t_1 + 40t_2 + 80, t ∈ [900, 1200) × [5, 7),
  (3/10)t_1t_2 − …

Example 4.1 Now we give detailed solutions to problem (I) of Sect. 1.

(1) Determining the grade by the linear regression model presented in Sect. 2. Firstly, compute the 3-variable linear regression functions: f̲(t) = −1.5253 − 0.0141t_1 + 1.4017t_2 + 3.5535t_3, f_c(t) = −1.514 − 0.0114t_1 + 0.1363t_2 + 4.6991t_3, f̄(t) = −0.8411 − 0.0259t_1 + 1.893t_2 + 1.917t_3, and r(δ) = 0.0075δ_1 + 0.1585δ_2 − 0.169δ_3. By Algorithm 2.3, f̌(t) = [−1.5253, −0.8411] ⊕ [−0.0266, −0.0141]t_1 ⊕ [1.4017, 1.893]t_2 ⊕ [1.917, 3.5535]t_3, f̃(t) = ⟨−1.514 − 0.0114t^c_1 + 0.1363t^c_2 + 4.6991t^c_3, 0.0075δ_1 + 0.1586δ_2 − 0.169δ_3⟩, and f̂(t) = [f̲(ṫ_1, ṫ_2, ṫ_3), f̄(ẗ_1, ẗ_2, ẗ_3)] (∀t = (t_1, t_2, t_3) ∈ IR³). Secondly, f̌(t_0) = [1.4635, 4.7158], f̂(t_0) = [2.983, 3.1963], and f̃(t_0) = ⟨2.0694, 0.0551⟩. Finally, the grade of t_0 is determined by experts based on these computation results.

(2) Determining the grade by the linear-like interpolation model presented in Sect. 3. Firstly, supplement Table 9 (which we will call Table 10). In Table 10, only 2³ samples are needed because, for the computation of each of f̌(t_0) and f̂(t_0), only the corresponding 2³ basis functions relying on the sample will be used. Secondly, compute f̌(t_0) or f̂(t_0) (and thus determine the grade). We omit the computation (including Table 10) here because it is tedious (unless a matching program is given).

(3) Determining the grade by some easy-to-think-of methods (which can also be discovered in a plausible manner).
For example, first take three Astragali Radixes t_1 = (⟨36.5, 9⟩, ⟨1.3, 0.2⟩, ⟨0.9, 0.1⟩), t_2 = (⟨36.7, 10.4⟩, ⟨1.7, 0.2⟩, ⟨1.2, 0.2⟩), and t_3 = (⟨32.4, 5.5⟩, ⟨1, 0.1⟩, ⟨0.8, 0.1⟩) (each being very similar to t_0 in one aspect) in Table 9. Then compute the distances d_1, d_2, d_3 between t_0 and t_1, t_2, t_3, respectively. Finally, as d_1 ≤ d_3 ≤ d_2, t_0 is most similar to t_1, and thus the grade of t_0 can be judged to be 4.

In a practical problem with an interval input-output sample and a given interval input t, how do we obtain the interval output? The suggested strategies are as follows: (1) If the sample is managed as carefully as in Examples 2.6 and 3.1 (i.e., the radii are the same within the data of each variable), then we can compute the output of t using just one of the five kinds of functions (i.e., the linear regression functions or the linear-like interpolation functions given in this paper; see Corollary 2.7 and Examples 3.5 and 3.6 for details); of course, we can also take the output of t to be an aggregation (for example, a simple weighted average) of these outputs. (2) In other cases, we can compute the output of t using one of the three kinds of linear regression functions given in this paper (see Example 4.1(1) for details); we can also compute the output of t as in the last part of Example 4.1(2). Of course, we can also take the output of t to be an appropriate aggregation of these outputs.

Consider the control problem dx/dt = (f_1(x), f_2(x))^T + u(t) in (II) of Sect. 1. Assume the sample obtained on y_1 = f_1(x) is as in Table 4 of Example 2.6, and the sample obtained on y_2 = f_2(x) is as in Table 7 of Example 3.6. Replacing (f_1(x), f_2(x)) by (f_{1,c}(x), f_{2,c}(x)) (resp., (f̲_1(x), f̲_2(x)), (f̄_1(x), f̄_2(x)); see Example 3.5(2)), we obtain three systems (3)-(5). For a suitable controller u, system (3) is asymptotically stable; similarly, systems (4) and (5) are also asymptotically stable. The idea and method used here can immediately be used to control chaotic fractional-order neural networks (Han et al. 2020).

Algorithm 3.2 (continued) Step 2 Similarly, compute the n-variable linear-like interpolation function f̄(t), based on the real input-output sample set {((ẍ_1, ẍ_2, ..., ẍ_n), p_2(f(x_0))) | x_0 = (x_1, x_2, ..., x_n) ∈ S}, using the method given in Example 3.1(2). Step 3 Obtain, based on Steps 1-2, the first n-variable linear-like interpolation function f̌(x) = [f̲(ẋ_1, ẋ_2, ..., ẋ_n), f̄(ẍ_1, ẍ_2, ..., ẍ_n)] (∀x = (x_1, x_2, ..., x_n) ∈ IR^n). (2) Step 1 Compute the n-variable linear-like interpolation function f_c(t), based on the real input-output sample set of centers {((x^c_1, ..., x^c_n), ⟨f(x_0)⟩_c) | x_0 = (x_1, ..., x_n) ∈ S}, using the method given in Example 3.1(2). Step 2 Compute the ordinary linear regression function (without the term b_0) r(δ) = r(δ_1, ..., δ_n) = b_1δ_1 + ··· + b_nδ_n (∀δ ∈ R^n), based on the real input-output sample set of radii. The rationality of Algorithm 3.2 is guaranteed by the vector analogue of Theorem 2.4 (Theorem 3.3).

References
- A Midpoint-Radius approach to regression with interval data
- From fuzzy regression to gradual regression: interval-based analysis and extensions
- A possibilistic regression based on gradual interval B-splines: application for hyperspectral imaging lake sediments
- Extended support vector interval regression networks for interval input-output data
- Fuzzy regression analysis: systematic review and bibliography
- Examples of aggregation operators on membership degrees of type-2 fuzzy sets
- Aggregation operation in interval-valued fuzzy and Atanassov's intuitionistic fuzzy set theory
- Composite learning sliding mode synchronization of chaotic fractional-order neural networks
- Interval regression by tolerance analysis approach
- A novel approach in uncertain programming part I: new arithmetic and order relation for interval numbers
- Support vector interval regression machine for crisp input and output data
- Qualitative and quantitative data envelopment analysis with interval data
- Several formulations of interval regression analysis
- Support vector interval regression networks for interval regression analysis
- Upper and lower approximation models in interval regression using regression quantile techniques
- Interval analysis
- Mathematics and plausible reasoning, Vol. I: induction and analogy in mathematics
- Interval regression analysis by quadratic programming approach
- Some geometric aggregation operators based on intuitionistic fuzzy sets
- Interval sets and three-way concept analysis in incomplete contexts
- A sufficient condition on a general class of interval type-2 Takagi-Sugeno fuzzy systems with linear rule consequent as universal approximators
- Analysis of correlation between commercial traits and chemical characteristics and absolute growth years of Astragali Radix

Intelligent doctors, together with Traditional Chinese Medicine (TCM for short), can fight diarrhea, colds, fever, and tonsillitis quite effectively (and even coronavirus disease 2019, COVID-19 for short, to a degree) with smaller side effects.
The present paper confirms that interval input-output samples (which have always been the major kind of samples or data in TCM) can be handled satisfactorily with the help of models (stemming from the classical ones) and practitioners' cooperation. Given an n-variable interval input-1-variable interval output sample set S, we have illustrated in detail how to establish, in Polya's pattern of discovery (mainly using specialization and analogy to explore solutions to problems), three linear regression models and two linear-like interpolation models relying on S. Each model is proved to be a universal approximator of the corresponding model based on the true samples under some easily satisfied conditions. As the computations only involve centers, left endpoints, or right endpoints of the interval numbers in the sample, off-the-shelf software can be utilized. Practitioners can optimize these models in the following ways: (1) Collect directly, or otherwise obtain, interval samples that have the same radius r, with r as small as possible [this can be realized by practitioners themselves directly or by experts indirectly using three-way decision theory (Yao 2017)]. (2) Take g_1(x) = w_1f̌(x) ⊕ w_2f̂(x) ⊕ w_3f̃(x) to be the new model in the case of the linear regression model, where {w_1, w_2, w_3} ⊆ (0, 1) satisfying w_1 + w_2 + w_3 ≈ 1 may be determined by experts; take g_2(x) = w_1f̌(x) ⊕ w_2f̂(x) to be the new model in the case of the linear-like interpolation model, where {w_1, w_2} ⊆ (0, 1) satisfying w_1 + w_2 ≈ 1 may be determined by experts. Notice that the models handling interval samples in this paper stem from the classical linear regression model or the classical linear interpolation model; practitioners can therefore similarly work out other new models handling interval samples based on other classical models for real-number samples.
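The expert-weighted combination of models in strategy (2) above can be sketched as follows, assuming (one natural reading, not spelled out in the text) that the weighted sum of interval outputs is taken endpoint-wise; the function name is ours.

```python
# Convex combination of interval outputs, computed endpoint-wise. Intervals
# are (lo, hi) pairs; weights are assumed to be expert-supplied values in
# (0, 1) summing to approximately 1.

def aggregate(intervals, weights):
    lo = sum(w * iv[0] for iv, w in zip(intervals, weights))
    hi = sum(w * iv[1] for iv, w in zip(intervals, weights))
    return (lo, hi)
```

For instance, equally weighting the outputs [1, 3] and [3, 5] yields the interval [2, 4].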
Theoretically, this work can be generalized further (e.g., by replacing the operations with demand-oriented aggregation-like, t-norm-like, or t-conorm-like operations to be coined by experts), if necessary. Finally, some new universal approximators can also be discovered and used to control chaotic fractional-order neural networks (Han et al. 2020) based on this research (which will be our future work).

Funding The work was supported by the National Natural Science Foundation of China (Grant Nos. 11771263, 61967001, 61807023) and the Fundamental Research Funds for the Central Universities (Grant Nos. GK202105007, GK201702011).

Conflict of interest The authors declare that they have no conflict of interest.

Human and animal rights statement This article does not contain any studies with human participants performed by any of the authors.