PII: 0888-613X(94)90009-4

Approximations Between Fuzzy Expert Systems and Neural Networks

Yoichi Hayashi
Ibaraki University, Hitachi-shi, Ibaraki 316, Japan

James J. Buckley
University of Alabama at Birmingham, Alabama

Address correspondence to Dr. Yoichi Hayashi, Department of Computer & Information Sciences, Ibaraki University, Hitachi-shi, Ibaraki 316, Japan.
A preliminary version of this paper was presented at the Second International Conference on Fuzzy Logic and Neural Networks (IIZUKA'92), Iizuka, Japan, July 17-22, 1992.
Received November 17, 1992; accepted August 15, 1993.
International Journal of Approximate Reasoning 1994; 10:63-73
© 1994 Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010. 0888-613X/94/$7.00

ABSTRACT

The fuzzy expert system we are concerned with in this paper is a rule-based fuzzy expert system using any method of approximate reasoning to evaluate the rules when given new data. In this paper we argue that: (1) any continuous fuzzy expert system may be approximated by a neural net; and (2) any continuous neural net (feedforward, multilayered) may be approximated by a fuzzy expert system. We show how to train the neural net and how to write down the rules in the fuzzy expert system.

KEYWORDS: neural networks, fuzzy expert systems, approximations

1. INTRODUCTION

The fuzzy expert system we are concerned with in this paper is a continuous rule-based fuzzy expert system using any method of approximate reasoning to evaluate the rules when given new data. There will be no uncertainties in the data or the rules, and there is no thresholding in the firing of a rule. That is, all rules fire given new data. The neural nets will be continuous (no thresholding within neurons), feedforward, multilayered, employing any learning algorithm [1].

In the next section we begin by showing that a three-layered neural net can be trained to approximate a given discrete, continuous, fuzzy expert system uniformly, to any degree of accuracy, over all inputs. The motivation for this part of our research is to build fast parallel computation (a neural net) for discrete fuzzy expert systems. In [2] we discussed approximating fuzzy expert systems by neural nets, but the discussion was based on an existence proof. That is, we used results stating that continuous functions can be approximated uniformly on compact sets by neural networks. In this study we show how to train a three-layered neural net, having a sufficient number of neurons in the hidden layer, to approximate a given continuous, discrete, fuzzy expert system. In [3] the authors also train neural nets to approximate fuzzy rules in a fuzzy expert system. However, they train a neural net to approximate one or two explicit fuzzy rules. Our results are more general in that our continuous, discrete, fuzzy expert system has any number of rules and uses any method of approximate reasoning to infer its final conclusion.

In the second part of the next section we explain how to construct a continuous, discrete, fuzzy expert system to approximate a given neural net uniformly, to any degree of accuracy, over all inputs. We are not suggesting replacing neural nets by fuzzy expert systems. This is an important theoretical result showing that fuzzy expert systems can be computationally equivalent to neural nets. In [2] we also discussed this approximation result, but there we presented more of an existence proof.
We showed that, given the neural net, we can build a discrete fuzzy expert system to approximate the net; in particular, we showed that there exists a suitable method of approximate reasoning. In this paper we present a more general discussion on the construction of the discrete fuzzy expert system.

We will use a bar over a symbol to represent a fuzzy set. So $\bar{A}, \bar{B}, \bar{C}, \ldots$ are all fuzzy sets. Also, all our fuzzy sets will be subsets of the real numbers. If $\bar{A}$ is a fuzzy set, then $\bar{A}(x)$ denotes its membership function evaluated at real number $x$. Similarly, $\bar{B}(x)$, $\bar{C}(x)$ are the membership functions for $\bar{B}$, $\bar{C}$, respectively.

2. APPROXIMATIONS

We will first discuss how to approximate a fuzzy expert system with a neural net and then use a fuzzy expert system to approximate a neural net.

2.1. Fuzzy Expert System

Consider a fuzzy expert system (abbreviated FES) having one block of rules

$$\mathcal{R}_r:\ \text{If } X = \bar{A}_i \text{ and } Y = \bar{B}_j, \text{ then } Z = \bar{C}_p, \tag{1}$$

for $1 \le r \le n$. For simplicity we have assumed that each rule has only two simple clauses in its antecedent. Also, we are using fuzzy sets directly in the rules instead of having their equivalent linguistic values. In equation (1) we connected the two clauses with an "and," but one can use "and" or "or" in the rules. We will assume that: (1) the interval $[a_1, b_1]$ contains the support of all the fuzzy sets that can be used for $X$; (2) $[a_2, b_2]$ contains the support of all the fuzzy sets that may be used for $Y$; and (3) $[a_3, b_3]$ contains the support of all the fuzzy sets that can be identified with variable $Z$.

The expert system will use some method of approximate reasoning to evaluate the rules when given data $X = \bar{A}'$ and $Y = \bar{B}'$.
One method of approximate reasoning involves: (1) choosing an implication operator; (2) picking a method of composing the data $X = \bar{A}'$, $Y = \bar{B}'$ with the information in a rule; and (3) deciding on how to combine the results of each rule into a final conclusion [4]. There are other methods, like first combining all the rules into one fuzzy relation [4], but we need not exactly specify any particular procedure of approximate reasoning in this paper. We now assume some method of approximate reasoning has been chosen, and we will denote this method by $\mathcal{AR}$. So, given the data $X = \bar{A}'$ and $Y = \bar{B}'$, the block of rules $\mathcal{R}_r$, $1 \le r \le n$, and $\mathcal{AR}$, the fuzzy expert system produces its conclusion $Z = \bar{C}'$.

Now assume we run this fuzzy expert system on some test data $X = \bar{A}'_k$, $Y = \bar{B}'_k$, $1 \le k \le K$. Let the corresponding conclusions be $Z = \bar{C}'_k$, $1 \le k \le K$. In a computer one usually uses discrete versions of the continuous fuzzy sets $\bar{A}_i$, $\bar{B}_j$, $\bar{C}_p$, $\bar{A}'$, $\bar{B}'$, $\bar{C}'$, etc. So let $\mathcal{D}$ be a discretization of the intervals $[a_i, b_i]$, $1 \le i \le 3$. In this paper we will choose the following discretization: (1) pick $x_i$ in $[a_1, b_1]$ as $x_0 = a_1$, $x_i = a_1 + i(b_1 - a_1)/N_1$, $1 \le i \le N_1$, for positive integer $N_1$; (2) choose $y_i$ in $[a_2, b_2]$ as $y_0 = a_2$, $y_i = a_2 + i(b_2 - a_2)/N_2$, $1 \le i \le N_2$, for positive integer $N_2$; and (3) let $z_i$ be in $[a_3, b_3]$ so that $z_0 = a_3$, $z_i = a_3 + i(b_3 - a_3)/N_3$, $1 \le i \le N_3$, $N_3$ a positive integer. $\mathcal{D}$ consists of $x_0, \ldots, x_{N_1}$, $y_0, \ldots, y_{N_2}$, $z_0, \ldots, z_{N_3}$. Let $M = N_1 + N_2$. Then we input the numbers $\bar{A}'_k(x_i)$, $0 \le i \le N_1$, for $X = \bar{A}'_k$ and the numbers $\bar{B}'_k(y_i)$, $0 \le i \le N_2$, for $Y = \bar{B}'_k$, into the fuzzy expert system and obtain the numbers $\bar{C}'_k(z_i)$, $0 \le i \le N_3$, for $Z = \bar{C}'_k$, $1 \le k \le K$.

We now construct, and train, a neural network that will compute (approximately) the same results as the fuzzy expert system for the inputs $\bar{A}'_k(x_i)$, $0 \le i \le N_1$, $\bar{B}'_k(y_i)$, $0 \le i \le N_2$, for $1 \le k \le K$. That is, the neural net computes the same as the fuzzy expert system, with respect to $\mathcal{D}$, for the test data $X = \bar{A}'_k$, $Y = \bar{B}'_k$, $1 \le k \le K$.

It is well known that neural nets are universal approximators. What this means is that given a continuous $F: R^d \to R$ there is a neural net that can uniformly approximate $F$, to any degree of accuracy, on compact subsets of $R^d$. See [5-17] for a survey of this literature. From the discussion above we see that a discrete fuzzy expert system FES will be a mapping from $[0, 1]^d$ into $[0, 1]^e$, for $d = M + 2$, $e = N_3 + 1$. We now argue that we would normally expect FES to be a continuous mapping. In [18] it is shown that Zadeh's compositional rule of inference is a continuous operation on discrete fuzzy sets when it is based on a continuous t-norm. We would therefore expect that the method of approximate reasoning used within the fuzzy expert system is also continuous. This then implies that the FES is a continuous mapping from $[0, 1]^d$ into $[0, 1]^e$. It can be shown that there is a neural net that can approximate FES, uniformly to any degree of accuracy, on the compact set $[0, 1]^d$.
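The discretization $\mathcal{D}$ and the resulting mapping from $[0, 1]^d$ into $[0, 1]^e$ can be illustrated with a minimal Python sketch. The paper deliberately leaves the method of approximate reasoning unspecified, so the choice made here (min implication, sup-min composition, max aggregation across rules) is only one continuous possibility picked for illustration. The names `grid`, `triangle`, `discretize` and `fes`, and the triangular membership functions, are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Membership = Callable[[float], float]
Rule = Tuple[List[float], List[float], List[float]]  # discretized (A_i, B_j, C_p)


def grid(a: float, b: float, n: int) -> List[float]:
    """Points a + i*(b - a)/n for i = 0..n, as in the discretization D."""
    return [a + i * (b - a) / n for i in range(n + 1)]


def triangle(left: float, peak: float, right: float) -> Membership:
    """Hypothetical triangular membership function (illustration only)."""
    def mu(x: float) -> float:
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


def discretize(mu: Membership, points: Sequence[float]) -> List[float]:
    """Membership values of a fuzzy set at the grid points."""
    return [mu(p) for p in points]


def fes(rules: Sequence[Rule], a_in: Sequence[float], b_in: Sequence[float]) -> List[float]:
    """Assumed reasoning: min implication, sup-min composition, max aggregation."""
    out = [0.0] * len(rules[0][2])
    for a, b, c in rules:
        match_x = max(min(u, v) for u, v in zip(a_in, a))  # sup-min match on X
        match_y = max(min(u, v) for u, v in zip(b_in, b))  # sup-min match on Y
        strength = min(match_x, match_y)                   # clauses joined by "and"
        out = [max(o, min(strength, ck)) for o, ck in zip(out, c)]
    return out


# Special case N1 = N2 = N3 = 10 on [0, 1]: d = M + 2 = 22 and e = N3 + 1 = 11.
xs, ys, zs = grid(0.0, 1.0, 10), grid(0.0, 1.0, 10), grid(0.0, 1.0, 10)
rule = (discretize(triangle(0.0, 0.5, 1.0), xs),
        discretize(triangle(0.0, 0.5, 1.0), ys),
        discretize(triangle(0.2, 0.6, 1.0), zs))
conclusion = fes([rule], rule[0], rule[1])
print(len(rule[0]) + len(rule[1]), len(conclusion))  # d and e
```

Every operation in `fes` is a minimum or a maximum of its arguments, so this particular choice gives a continuous mapping, in line with the continuity assumption made in the text.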
However, this argument is only an existence argument and it does not tell you how to construct and train the neural net. So, for the rest of this subsection we will discuss how to obtain a neural net that will approximate the given FES.

The neural net will have $M + 2$ input neurons, one hidden layer, and $N_3 + 1$ output neurons. There are different methods of specifying a sufficient number of neurons in the hidden layer ([19-21]), and we assume that one such method has been chosen so that the hidden layer has a sufficient number of neurons to learn the training set. Label the input neurons $I_1, I_2, \ldots, I_{M+2}$, and the output neurons $O_1, \ldots, O_{N_3+1}$. We now describe how to train the neural network. We input $\bar{A}'_k(x_0)$ to $I_1$, ..., $\bar{A}'_k(x_{N_1})$ to $I_{N_1+1}$, $\bar{B}'_k(y_0)$ to $I_{N_1+2}$, ..., $\bar{B}'_k(y_{N_2})$ to $I_{M+2}$, and the desired output is $\bar{C}'_k(z_0)$ from $O_1$, ..., $\bar{C}'_k(z_{N_3})$ from $O_{N_3+1}$. That is, the training set has $K$ pairs of inputs-outputs with $\{\bar{A}'_k(x_i) \mid 0 \le i \le N_1\} \cup \{\bar{B}'_k(y_i) \mid 0 \le i \le N_2\}$ the input set and $\{\bar{C}'_k(z_i) \mid 0 \le i \le N_3\}$ the output set. Figure 1 shows how the neural net will approximate the fuzzy expert system for the special case of $N_1 = N_2 = N_3 = 10$. Once trained, the neural net and the fuzzy expert system compute the same output, with respect to $\mathcal{D}$, for the test data $X = \bar{A}'_k$, $Y = \bar{B}'_k$, $1 \le k \le K$.
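The training procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_fes` is a hypothetical single-rule stand-in for a discretized FES, the grids use $N_1 = N_2 = N_3 = 4$ (so 10 input and 5 output neurons), the hidden layer size of 8 is picked by hand rather than by the bounds in [19-21], and plain per-pattern backpropagation with sigmoid units stands in for "any learning algorithm". Gradient descent only reduces the error here; it does not guarantee the exact fit assumed in the text.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[List[float], List[float]]


def toy_fes(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a discretized FES with a single rule."""
    rule_a, rule_b = [0.0, 0.5, 1.0, 0.5, 0.0], [0.0, 0.5, 1.0, 0.5, 0.0]
    rule_c = [1.0, 0.5, 0.0, 0.0, 0.0]
    strength = min(max(map(min, a, rule_a)), max(map(min, b, rule_b)))
    return [min(strength, c) for c in rule_c]


def training_pairs(fes: Callable, test_inputs: Sequence) -> List[Pair]:
    """Input: A'_k(x_i) followed by B'_k(y_i); target: C'_k(z_i) from the FES."""
    return [(list(a) + list(b), fes(a, b)) for a, b in test_inputs]


def sigmoid(s: float) -> float:
    return 1.0 / (1.0 + math.exp(-s))


class ThreeLayerNet:
    """Input layer, one sigmoid hidden layer, sigmoid output layer."""

    def __init__(self, n_in: int, n_hidden: int, n_out: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # The last weight in every row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x: Sequence[float]) -> Tuple[List[float], List[float]]:
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, o

    def train_step(self, x: Sequence[float], t: Sequence[float], lr: float) -> None:
        """One backpropagation update for the squared error on a single pattern."""
        h, o = self.forward(x)
        d_out = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(o, t)]
        d_hid = [hj * (1.0 - hj) * sum(d * row[j] for d, row in zip(d_out, self.w2))
                 for j, hj in enumerate(h)]
        for d, row in zip(d_out, self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d * v
        for d, row in zip(d_hid, self.w1):
            for i, v in enumerate(list(x) + [1.0]):
                row[i] -= lr * d * v


def mse(net: ThreeLayerNet, pairs: Sequence[Pair]) -> float:
    errs = [(o - t) ** 2 for x, target in pairs for o, t in zip(net.forward(x)[1], target)]
    return sum(errs) / len(errs)


# K = 25 test pairs (A'_k, B'_k) built from five shifted triangular shapes.
shapes = [[1.0, 0.5, 0.0, 0.0, 0.0], [0.5, 1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 0.5, 0.0],
          [0.0, 0.0, 0.5, 1.0, 0.5], [0.0, 0.0, 0.0, 0.5, 1.0]]
pairs = training_pairs(toy_fes, [(a, b) for a in shapes for b in shapes])
net = ThreeLayerNet(n_in=10, n_hidden=8, n_out=5)
loss_before = mse(net, pairs)
for _ in range(200):
    for x, t in pairs:
        net.train_step(x, t, lr=0.5)
loss_after = mse(net, pairs)
print(loss_before, loss_after)
```

The targets come from running the FES on the test inputs, so whatever method of approximate reasoning the FES uses is reflected in the training set.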
Now one wonders how the outputs from the two systems compare if we input new data $X = \bar{A}'$ and $Y = \bar{B}'$, where these fuzzy sets do not belong to the test data set. The result depends on how we picked the test data set. Suppose first that we chose $\bar{A}'_k = \bar{A}_i$, $\bar{B}'_k = \bar{B}_j$, $\bar{C}'_k = \bar{C}_p$, the fuzzy sets in the rules. That is, we do not run the fuzzy expert system to compute $Z = \bar{C}'_k$, but set the input pair to be the fuzzy sets in the antecedent of a rule and the output fuzzy set to be the fuzzy set in the rule's consequence. The test set has all the fuzzy sets in the rules and no other fuzzy sets. We then train the neural net only on the knowledge base (rules) of the fuzzy expert system, and the net will know nothing about $\mathcal{AR}$. So, the two systems can differ considerably for new data $X = \bar{A}'$ and $Y = \bar{B}'$. However, if the test data set has a number of pairs $\{\bar{A}'_k, \bar{B}'_k\}$, where these fuzzy sets do not belong to some rule's antecedent, and we use $\mathcal{AR}$ to compute $Z = \bar{C}'_k$, then the net trained on this information will incorporate some of $\mathcal{AR}$ into its weights. Then the two systems can produce similar results for new data $X = \bar{A}'$ and $Y = \bar{B}'$. So, one should pick the test data so that the input pairs $(\bar{A}'_k, \bar{B}'_k)$ broadly cover applications of the fuzzy expert system and then train the neural net on this information. Then the net will better approximate the FES on new data.

[Figure 1. Neural net approximating fuzzy expert system.]
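One way to make "broadly cover" concrete is to measure how far a new input can lie from its nearest training input. The sketch below is an illustration only and is not part of the paper: it handles a single input variable on $[0, 1]$ (pairs $(\bar{A}'_k, \bar{B}'_k)$ would be treated the same way after concatenation), and it assumes triangular fuzzy sets, random peaks and widths, and the sup-distance between discretized sets. The names `triangle_on_grid`, `broad_inputs` and `covering_radius` are hypothetical.

```python
import random
from typing import List, Sequence


def triangle_on_grid(peak: float, width: float, n: int) -> List[float]:
    """Discretized triangular fuzzy set on the grid i/n, i = 0..n (illustrative shape)."""
    return [max(0.0, 1.0 - abs(i / n - peak) / width) for i in range(n + 1)]


def broad_inputs(k: int, n: int, seed: int = 0) -> List[List[float]]:
    """k fuzzy inputs whose peaks and widths are spread over the whole interval."""
    rng = random.Random(seed)
    return [triangle_on_grid(rng.random(), rng.uniform(0.1, 0.5), n) for _ in range(k)]


def covering_radius(train: Sequence[Sequence[float]],
                    probes: Sequence[Sequence[float]]) -> float:
    """Largest sup-distance from a probe to its nearest training vector."""
    def dist(u: Sequence[float], v: Sequence[float]) -> float:
        return max(abs(a - b) for a, b in zip(u, v))
    return max(min(dist(p, t) for t in train) for p in probes)


# Training only on three rule antecedents versus adding 50 broadly spread inputs.
rule_antecedents = [triangle_on_grid(p, 0.25, 10) for p in (0.0, 0.5, 1.0)]
extra = broad_inputs(50, 10, seed=1)
probes = broad_inputs(200, 10, seed=2)  # stands in for "new data"
r_rules = covering_radius(rule_antecedents, probes)
r_broad = covering_radius(rule_antecedents + extra, probes)
print(r_rules, r_broad)
```

Adding test inputs can never increase this radius, so `r_broad` is at most `r_rules`. A smaller radius means every new input lies closer to some input on which the net was trained against the actual FES output.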
In the appendix we present a more formal (mathematical) argument on why you should not use only the rules in the FES as the training set for the neural network.

2.2. Neural Net

Consider a continuous neural net (NN) with $m$ input neurons, any number of hidden layers, and $n$ output neurons. Assume that all the input, and output, signals are bounded between zero and one. Therefore, NN is a continuous mapping from $[0, 1]^m$ into $[0, 1]^n$.

Recently, there have been a number of papers showing that certain types of fuzzy systems are universal approximators ([22-28]). All of these fuzzy systems resemble a fuzzy controller in that they have singleton (crisp) inputs and defuzzified (crisp) output. We may use these results to obtain a generalized fuzzy system (multiple outputs) that can approximate NN uniformly, to any degree of accuracy, over $[0, 1]^m$. However, we shall not pursue that idea in this paper; instead we will be interested in building a fuzzy expert system to approximate NN. This FES will have one block of rules. We will use a fuzzy relation to model the implication in each rule, use Zadeh's compositional rule of inference to evaluate each rule given new data, and finally combine the outputs from all the rules into one final conclusion. Input, and output, from the FES will be discrete versions of continuous fuzzy sets. The arguments that fuzzy systems are universal approximators are existence proofs; they do not show you how to build the approximating fuzzy system. Our method is more constructive in that we show how to construct the rules and we can specify the method of approximate reasoning to be used to evaluate the rules.
We first choose $w_j$, $1 \le j \le J$, uniformly spread around $[0, 1]^m$ and let $\mathrm{NN}(w_j) = q_j$, $1 \le j \le J$. We are using functional notation where $w_j \in [0, 1]^m$ is input to the neural net and $\mathrm{NN}(w_j)$ is its output, a vector $q_j$ in $[0, 1]^n$. The FES will have one block of rules

$$\mathcal{R}_j:\ \text{If } X = \bar{A}_j, \text{ then } Z = \bar{C}_j, \quad 1 \le j \le J. \tag{2}$$

The interval that contains all the fuzzy sets for $X$ is $[1, m]$, and for $Z$ it is $[1, n]$. The discretization of these intervals is: (1) $x_0 = 1, x_1 = 2, \ldots, x_{m-1} = m$; and (2) $z_0 = 1, z_1 = 2, \ldots, z_{n-1} = n$. We define $\bar{A}_j$ and $\bar{C}_j$ with respect to this discretization as follows: (1) $\bar{A}_j(i)$ = the $i$th component of $w_j$, $1 \le i \le m$; and (2) $\bar{C}_j(i)$ = the $i$th component of $q_j$, $1 \le i \le n$. The fuzzy sets $\bar{A}_j$ and $\bar{C}_j$ have no physical meaning, nor do they represent linguistic variables. They are defined to match the input ($w_j$)-output ($q_j$) pairs from NN so that the FES will be able to uniformly approximate the neural net.

Next we need to specify the method of approximate reasoning used to evaluate the rules given input $v$ in $[0, 1]^m$ on $X$. That is, data on $X$ will be $\bar{A}'$, a fuzzy subset of $[0, m]$, and the fuzzy expert system then concludes $Z = \bar{C}'$, a fuzzy subset of $[0, n]$. However, discrete versions of these fuzzy sets will be used, so that the input data will be $X = v$ in $[0, 1]^m$, where $\bar{A}'(i)$ = the $i$th component of $v$ for $1 \le i \le m$, and the output from the FES will be $Z = u$ in $[0, 1]^n$, where $\bar{C}'(i)$ = the $i$th component of $u$ for $1 \le i \le n$. In functional notation $\mathrm{FES}(v) = u$. The main requirement of the method of approximate reasoning is $\mathrm{FES}(w_j) = q_j$ for all $j$, or $\mathrm{FES}(w_j) = \mathrm{NN}(w_j)$ for all $j$.
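The construction of the rule block (2) from input-output samples of the net can be sketched as follows. This is a minimal illustration under stated assumptions: "uniformly spread" is read as a regular grid with the same number of points on every axis, and `toy_nn` is a hypothetical continuous net with $m = 2$ inputs and $n = 1$ output. The names `uniform_points` and `build_rules` are likewise not from the paper. The sketch stops at building the rules; it does not fix a method of approximate reasoning.

```python
import itertools
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Rule = Tuple[Vector, Vector]  # discretized (A_j, C_j)


def uniform_points(m: int, per_axis: int) -> List[Vector]:
    """Grid of per_axis**m points w_j spread uniformly over [0, 1]^m (per_axis >= 2)."""
    axis = [i / (per_axis - 1) for i in range(per_axis)]
    return [list(p) for p in itertools.product(axis, repeat=m)]


def build_rules(nn: Callable[[Sequence[float]], Vector],
                points: Sequence[Vector]) -> List[Rule]:
    """Rule j: A_j(i) = i-th component of w_j, C_j(i) = i-th component of q_j = NN(w_j)."""
    return [(list(w), list(nn(w))) for w in points]


def toy_nn(w: Sequence[float]) -> Vector:
    """Hypothetical continuous net: two inputs in [0, 1], one sigmoid output."""
    return [1.0 / (1.0 + math.exp(-(2.0 * w[0] - 3.0 * w[1] + 0.5)))]


points = uniform_points(m=2, per_axis=5)
rules = build_rules(toy_nn, points)
print(len(rules))  # J = 5**2 = 25 rules
```

Each rule stores one sampled pair $(w_j, q_j)$, which is all that the requirement $\mathrm{FES}(w_j) = \mathrm{NN}(w_j)$ refers to.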
What this means is that if $X = \bar{A}' = \bar{A}_j$ (discrete version $w_j$) in rule $\mathcal{R}_j$, then the final conclusion from the expert system is $Z = \bar{C}' = \bar{C}_j$ (discrete version $q_j$), for all rules. We first must guarantee that this will happen for each rule because the rules will all fire separately, and then we will combine their results to get the final output $Z = \bar{C}'$.

We first combine the data ($\bar{A}_j$ and $\bar{C}_j$) in each rule $\mathcal{R}_j$ into a discrete fuzzy relation $R_j$ on $[0, m] \times [0, n]$. Given input data $X = v$ in $[0, 1]^m$, each rule fires, producing conclusion $Z = v \circ R_j$, for some composition operator "$\circ$". Then the FES combines the results $v \circ R_j$ across all rules ($1 \le j \le J$) into its final conclusion (output) $Z = u$ in $[0, 1]^n$.

We need to choose the $R_j$ and "$\circ$" so that $w_j \circ R_j = q_j$ for all $j$. There are a number of different choices if the fuzzy sets are normalized (at least one component of $w_j$ is equal to one, for all $j$). However, our (discrete) fuzzy sets are not necessarily normalized, since $w_j = 0$ (all components zero) could be a choice for a $w_j$. But there are methods of approximate reasoning ([2]) with the property $w_j \circ R_j = q_j$ for all $j$, for any $w_j$ in $[0, 1]^m$. Then we have $w_j \circ R_j$ equal to $q_j$ for each rule $\mathcal{R}_j$.

All that is left to do is to specify how the FES combines the results into its final output. For any $v$ in $[0, 1]^m$, the FES averages the results $w_j \circ R_j$ for those $w_j$ nearest to $v$ in $[0, 1]^m$. Then given $\varepsilon > 0$, we choose the $w_j$, $1 \le j \le J$, uniformly spread around $[0, 1]^m$, construct the FES as described above, and obtain $|\mathrm{NN}(v) - \mathrm{FES}(v)| < \varepsilon$ [...]

[...] $\varepsilon > 0$ there is a $\delta_1 > 0$ so that $|\mathrm{FES}(v_1) - \mathrm{FES}(v_2)| < \varepsilon/2$ if $|v_1 - v_2| < \delta_1$, $v_1, v_2$ in $[0, 1]^d$. We are using the functional notation of $v_i$ input to FES producing $\mathrm{FES}(v_i)$ as (discrete) output, a vector in $[0, 1]^e$.
Let NN be a continuous, three-layered, feedforward neural net with $d$ input neurons, $e$ output neurons, and a sufficient number of neurons in the hidden layer so that it can learn a data set of size $L$. Suppose that $u_l$, $1 \le l \le L$, is a set of vectors spread around $[0, 1]^d$. Let $u'_l = \mathrm{FES}(u_l)$, a vector in $[0, 1]^e$, $1 \le l \le L$. We have assumed that NN can learn $(u_l, u'_l)$, $1 \le l \le L$, which means that the weights can be adjusted so that $\mathrm{NN}(u_l) = u'_l$, $1 \le l \le L$.

NN is a continuous mapping from $[0, 1]^d$ into $[0, 1]^e$, all signals are in the interval $[0, 1]$, so it is uniformly continuous. Therefore, there is a $\delta_2 > 0$ so that $|\mathrm{NN}(v_1) - \mathrm{NN}(v_2)| < \varepsilon/2$ if $|v_1 - v_2| < \delta_2$, $v_1, v_2$ in $[0, 1]^d$. Again we are using functional notation with $v_i$ input to the neural net and $\mathrm{NN}(v_i)$ its output vector in $[0, 1]^e$.

Define $\delta$ to be the minimum of $\delta_1$ and $\delta_2$. Choose $w_j$, $1 \le j \le J$, uniformly spread around $[0, 1]^d$ with the property that given any $v$ in $[0, 1]^d$ there is a $w_j$ so that $|v - w_j| < \delta$. Assume that $J \le L$ (see footnote 1). Let $\mathrm{FES}(w_j) = q_j$, $1 \le j \le J$. The learning data for NN will be $(w_j, q_j)$, $1 \le j \le J$. That is, train the NN so that $\mathrm{NN}(w_j) = q_j$ for all $j$.

We now argue that this NN will uniformly approximate the FES. Let $v$ be any vector in $[0, 1]^d$ and choose $w_j$ also in $[0, 1]^d$ so that $|v - w_j| < \delta$.
Then

$$|\mathrm{FES}(v) - \mathrm{NN}(v)| = |\mathrm{FES}(v) - \mathrm{FES}(w_j) + \mathrm{NN}(w_j) - \mathrm{NN}(v)| \le |\mathrm{FES}(v) - \mathrm{FES}(w_j)| + |\mathrm{NN}(w_j) - \mathrm{NN}(v)| < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

because: (1) $\mathrm{FES}(w_j) = \mathrm{NN}(w_j)$; (2) $|v - w_j| < \delta_1$; and (3) $|v - w_j| < \delta_2$. So, this neural net will compute, within $\varepsilon$, the same as the FES, across all possible inputs.

Now suppose you train the NN only on the rules (knowledge base). Let $s_r \in [0, 1]^d$ denote the discretization of $\bar{A}_i$ and $\bar{B}_j$ in the antecedent of rule $\mathcal{R}_r$, $1 \le r \le n$. Next let $t_r$ in $[0, 1]^e$ be the discretization of $\bar{C}_p$, the fuzzy set in the conclusion of rule $\mathcal{R}_r$, $1 \le r \le n$. Train the NN on the set $(s_r, t_r)$, $1 \le r \le n$, so that $\mathrm{NN}(s_r) = t_r$ for all $r$. Now pick any $v$ in $[0, 1]^d$. We would not expect $|\mathrm{FES}(v) - \mathrm{NN}(v)|$ to be small because: (1) the $s_r$, $1 \le r \le n$, do not necessarily form a uniform span of the input space $[0, 1]^d$; and (2) it may happen, depending on $\mathcal{AR}$, that $\mathrm{FES}(s_r) \ne t_r$ for some rules. That is, if $v$ is not near any $s_r$, then we would not expect $\mathrm{NN}(v)$ to be close to $\mathrm{FES}(v)$. Also, NN may differ from FES even on the training set. For these reasons we do not recommend training the neural net only on the rules of the fuzzy expert system.

Footnote 1: We know there is an NN (sufficient number of neurons in the hidden layer) that will approximate FES, uniformly over all inputs $v$ in $[0, 1]^d$, to any degree of accuracy $\varepsilon > 0$. So this NN will have $L$ large enough.

References

1. Rumelhart, D. E., Hinton, G. E., and Williams, R. J., Learning internal representations by error propagation, in Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Vol. 1 (D. E. Rumelhart and J. L. McClelland, Eds.), MIT Press, Cambridge, MA, 675-695, 1986.
2. Buckley, J. J., Hayashi, Y., and Czogala, E., On the equivalence of neural nets and fuzzy expert systems, Fuzzy Sets and Systems 53, 129-134, 1993.
3. Keller, J. M., and Tahani, H., Implementation of conjunctive and disjunctive fuzzy logic rules with neural networks, Int. J. Approx. Reasoning 6, 221-240, 1992.
4. Dubois, D., and Prade, H., Fuzzy sets in approximate reasoning, part I: inference and possibility distributions, Fuzzy Sets and Systems 40, 143-202, 1991.
5. Blum, E. K., and Li, L. K., Approximation theory and feedforward networks, Neural Networks 4, 511-515, 1991.
6. Cardaliaguet, P., and Euvrard, G., Approximation of a function and its derivative with a neural network, Neural Networks 5, 207-220, 1992.
7. Cotter, N. E., and Guillerm, T. J., The CMAC and a theorem of Kolmogorov, Neural Networks 5, 221-228, 1992.
8. Cybenko, G., Approximation by superpositions of a sigmoidal function, Math. of Control, Signals, and Systems 2, 303-314, 1989.
9. Geva, S., and Sitte, J., A constructive method for multivariate function approximation by multilayer perceptrons, IEEE Trans. on Neural Networks 3, 621-623, 1992.
10. Hornik, K., Approximation capabilities of multilayer feedforward networks, Neural Networks 4, 251-257, 1991.
11. Hornik, K., Stinchcombe, M., and White, H., Multilayer feedforward networks are universal approximators, Neural Networks 2, 359-366, 1989.
12. Ito, Y., Approximation of functions on compact sets by finite sums of a sigmoid function without scaling, Neural Networks 4, 817-826, 1991.
13. Ito, Y., Approximation of continuous functions on $R^d$ by linear combinations of shifted rotations of a sigmoid function with and without scaling, Neural Networks 5, 105-115, 1992.
14. Kreinovich, V. Y., Arbitrary nonlinearity is sufficient to represent all functions by neural networks: a theorem, Neural Networks 4, 381-383, 1991.
15. Kůrková, V., Kolmogorov's theorem is relevant, Neural Computation 3, 617-622, 1991.
16. Kůrková, V., Kolmogorov's theorem and multilayer neural networks, Neural Networks 5, 501-506, 1992.
17. Park, J., and Sandberg, I. W., Universal approximation using radial-basis-function networks, Neural Computation 3, 246-257, 1991.
18. Fullér, R., and Zimmermann, H.-J., On Zadeh's compositional rule of inference, Proc. Fourth IFSA World Congress, Vol. Artificial Intelligence, Brussels, 41-44, July 7-12, 1991.
19. Huang, S.-C., and Huang, Y.-F., Bounds on the number of hidden neurons in multilayer perceptrons, IEEE Trans. on Neural Networks 2, 47-55, 1991.
20. Mirchandani, G., On hidden nodes for neural nets, IEEE Trans. on Circuits and Systems 36, 661-664, 1989.
21. Sartori, M. A., and Antsaklis, P. J., A simple method to derive bounds on the size and to train multilayer neural networks, IEEE Trans. on Neural Networks 2, 467-471, 1991.
22. Buckley, J. J., Sugeno type controllers are universal controllers, Fuzzy Sets and Systems 53, 299-304, 1993.
23. Buckley, J. J., Universal fuzzy controllers, Automatica 28, 1245-1248, 1992.
24. Buckley, J. J., Controllable processes and the fuzzy controller, Fuzzy Sets and Systems 53, 27-32, 1993.
25. Buckley, J. J., Applicability of the fuzzy controller, in Advances in Fuzzy Systems: Applications and Theory (P.-Z. Wang and K. F. Loe, Eds.), World Scientific Press, Singapore, in press.
26. Kosko, B., Fuzzy systems as universal approximators, Proc. of IEEE Inter. Conf. on Fuzzy Systems, San Diego, 1153-1162, March 8-12, 1992.
27. Nguyen, H. T., and Kreinovich, V., On approximation of controls by fuzzy systems. Unpublished manuscript.
28. Wang, L.-X., Fuzzy systems and universal controllers, Proc. of IEEE Inter. Conf. on Fuzzy Systems, San Diego, 1163-1170, March 8-12, 1992.