Extracting expertise from experts: Methods for knowledge acquisition

Abstract: Knowledge acquisition is the biggest bottleneck in the development of expert systems. Fortunately, the process of translating expert knowledge to a form suitable for expert system development can benefit from methods developed by cognitive science to reveal human knowledge structures. There are two classes of these investigative methods, direct and indirect. We provide reviews, criteria for use, and literature sources for all principal methods. Direct methods discussed are: interviews, questionnaires, observation of task performance, protocol analysis, interruption analysis, closed curves, and inferential flow analysis. Indirect methods include: multidimensional scaling, hierarchical clustering, general weighted networks, ordered trees, and repertory grid analysis.

JUDITH REITMAN OLSON
Graduate School of Business Administration, The University of Michigan, Ann Arbor, Michigan, USA

HENRY H. RUETER
Vector Research Inc., Ann Arbor, Michigan, USA

1. Introduction

Expert systems are here to stay. Although they are out of fashion in the research laboratories, they are nonetheless growing in popularity inside business and industry. Armies of small expert systems are being built to solve routine, medium-difficult problems in areas such as diagnosis of engine failures, tax planning, and feasibility analysis of cases in union disputes about seniority. The literature on how to build expert systems is burgeoning. People flock to seminars and tutorials at professional conferences because they need training on how to build systems for very practical ends.

It has long been recognized that the system development process for expert systems and other artificial intelligence applications is different from that for standard software.
Whereas standard software development has a small fraction of the time spent in planning and a lot of coding, the majority of the time in expert system development is in planning - in deciding what knowledge should be encoded into the system.

The bottleneck in the development of expert systems is in extracting the knowledge from the expert, that is, in knowledge acquisition [1] and [2]. Even books about how to build expert systems have very little to offer about knowledge acquisition.

Expert system developers have relied on two standard methods for getting the knowledge out of the expert and into the system: 1) the developer, or knowledge engineer in this case, engages in intense interviews with the expert, or 2) the knowledge engineer becomes an expert him/herself, relying on introspection to articulate the requisite knowledge. The knowledge engineer then encodes the knowledge into a language of choice, encoding facts into OBJECT-ATTRIBUTE-VALUE triples and inferences into IF-THEN rules. When run, the inference engines grind away at these knowledge structures, working either with facts (e.g. the features that the particular case under diagnosis exhibits) forward to the goal, or working with the goal (e.g. one of several known diagnosis categories) backward to the facts.

Simple OBJECT-ATTRIBUTE-VALUE triples and forward and backward search strategies are a small subset of the knowledge structures and search strategies human experts have. Expertise is primarily a skill of recognition, of 'seeing' old patterns in the new problem. Chess experts, for example, have the same limited abilities as novices to hold information for analysis and look ahead only a limited number of steps. They excel because they have hundreds of thousands of chess configurations in their memories, and can quickly encode the current situation into constellations of previously-seen chess patterns.
The choice of candidate 'good moves' for the expert is thus restricted to a small set of known good moves that fit the patterns, whereas the novice has no such expert pattern knowledge to filter out bad candidates [3], [4] and [5].

There is also evidence to suggest that experts see more richly encoded patterns than novices do. They have organized the concepts in their knowledge bases with much more depth and with many more central associations than novices. For example, expert Algol programmers had much more structure in the relationships among concepts held in memory than novices did. And, experts' organizations were highly similar, whereas novices' structures were scattered, based on a variety of irrelevant prior associations [6].

Expert Systems, August 1987, Vol. 4, No. 3.

Not only do experts have information organized in a highly structured manner, but they use a variety of kinds of knowledge structures. Some things are stored in simple lists, e.g. the months of the year and the days of the week. Some information fits a stored table better, information such as calendar appointments and the periodic table. Some information is stored as a flow diagram, such as a decision tree, for example, representing the routing of telephone messages to the people who can handle them. There is information stored in hierarchies of relationships, nested categories or clusters, such as animal taxonomies. Networks store richly connected language associations. Such information as room arrangements or maps may be stored as physical space. And, some information may be stored about a device's internal components and how they are causally related as a physical model, commonly referred to as a mental model. Experts may hold what they know about objects in a myriad of different representations, each suitable for a particular kind of reasoning or retrieval.
Not only do experts see problem situations differently, they may also search differently through intermediate problem states. Recent work in medical reasoning, for example, has shown that experts primarily work forward through the problem (beginning with a 'good' representation), and that novices work backward from the goal. Furthermore, from work on planning, we see that experts also plan ahead loosely, filling in details when the situation dictates, and that they move back and forth from abstract to detailed entities when evaluating tactics and strategies [7] and [8].

Typically the knowledge engineer has only OBJECT-ATTRIBUTE-VALUE triples and IF-THEN rules to use to encode what the expert knows. If in acquiring the expert's knowledge, the knowledge engineer focuses only on knowledge that can be easily encoded in these forms, then significant expertise will be missed. The process of translating expert knowledge to this form will benefit greatly from a knowledge of 1) the fact that many different expert structures and processes are possible, and 2) which tools are appropriate to uncovering these various structures.

The purpose of this paper is to convey to the knowledge engineer the myriad of methods used in cognitive science research for revealing expert knowledge structures and processes. Some of these methods may be useful in developing expert systems. The ultimate goal, however, is to alert the knowledge engineer to the fact that simple, intense, time-consuming interviews of experts leading to the coding of OBJECT-ATTRIBUTE-VALUE triples and IF-THEN rules may be both misleading and costly as well as difficult. Experts have rich structures and reasoning abilities. Truly representing the expert's expertise depends on knowledge of these structures and abilities.

2. Methods for knowledge acquisition

There are two classes of methods for revealing what experts know. 'Direct methods' ask the expert to report on knowledge he/she can directly articulate.
This set of methods includes interviews, questionnaires, simple observation, thinking-out-loud protocols, interruption analysis, drawing closed curves, and inferential flow analysis. In contrast, 'indirect methods' do not rely on the expert's abilities to articulate the information that is used; they collect other behaviors, such as recall or scaling responses, from which the analyst can make inferences about what the expert must have known in order to respond the way he/she did. 'Indirect methods' include multidimensional scaling, hierarchical clustering, general weighted networks, ordered trees from recall, and repertory grid analysis.

2.1 Direct methods

2.1.1 Interviews. Interviews are the most common method for eliciting knowledge from the expert. In conversation, the expert reveals the objects he/she thinks about, how they are related or organized, and the processes he/she goes through in making a judgment, solving a problem, or designing a solution. There are simple guidelines that can be followed to make interviewing efficient.

1. Enlist the expert's cooperation. Interviewing an unwilling expert dooms the project to failure. There are a number of reasons why an expert may not be cooperative. First, the expert may know that he/she performs the task in simple, intuitive ways that, if revealed, would reduce the esteem others hold for him/her. Second, the expert may not know how he/she performs that task and may be reluctant to express the uncertainty, thinking that experts are expected to be rational and articulate. And third, the expert may believe that if his/her expertise can be captured in a computer, his/her job will be eliminated. On the other hand, the expert may be flattered to think that the company is willing to invest the time and money to clone the expertise, to allow many more problems to be solved with the knowledge this one person has gained.

2.
Ask free-form questions at the start, narrowing in specificity as the interview process progresses. The goal at first is to discover the vocabulary the expert uses and to allow him/her to begin to articulate the inferences drawn and the relationships seen. Asking, "How do you do your task?" can be followed with, "Recall the last case ..." The interviewer should note the order in which the expert addresses topics and the relative importance of the way evidence is weighed.

3. Do not impose your own understanding on the expert. Allow the expert to talk, even if the current topic seems tangential to the main purpose. Do not interrupt. Ask about what you do not understand, but do not impose your own, naive bias on what the expert is saying.

4. Limit the sessions to coherent tasks, recognizing fatigue and attentional limits. The interviewer should be aware of the effort that goes into answering questions about expertise, and make sessions tolerably long. However, breaking in the middle of a thought or problem is unnatural and disturbing. It is almost impossible to 'take up where we left off' if major questions are not settled on a problem or task at the end of the previous session.

The interview questions should center on the expert's knowledge of objects, relationships, and inferences. Good questions to ask might be:

"What kinds of things do you like to know about when you begin to ponder the problem?"
"What facts or hypotheses do you try to establish when thinking about a problem?"
"What are the factors that influence how you reason about a problem?"
"What type of values can this object have? What range of values is permissible?"
"Does this factor depend on other factors? If so, which ones?"
"Is this factor needed for solving all problems
in the domain or for just some?" (These questions are from [9].)

It is probably best to begin knowledge acquisition with an interview, to identify a set of objects and their relationships (noting especially whether the expert is thinking of these objects in special relationships like lists, tables, physical spaces, etc). However, the specificity and completeness of what can be obtained soon reaches a limit in the free form style. To alleviate this problem, often the interviewer can mix in questions of other styles, such as focusing on a particular case in detail, called the 'critical incident technique'. Focusing on a case elicits particular descriptions, rules and objects, which can be examined for their generality in later sessions. By asking for 'symptoms' and 'characteristics' one elicits features, while asking for 'evidence' elicits inferences.

2.1.2 Questionnaires. Interviews have a distinct advantage in that they can elicit unforeseen information. Interviews are free form; experts can generate information in the order they wish, in the detail they wish. Experts are in control. Interviews, however, are very time-consuming. Questionnaires, on the other hand, have the advantage of being a very efficient way to gather information. Furthermore, the expert can fill out questionnaires in a leisurely and relaxed atmosphere. Questionnaires can be particularly useful in discovering the objects of the domain, in uncovering relationships, and perhaps in determining uncertainties, if the expert system attaches uncertainty to its conclusions.

The questionnaires suggested for use here are not the kind used in survey research. Instead, they consist of cards or pieces of paper on which are printed some standard, but open-ended questions. Figures 1 and 2 illustrate cards used to elicit variables (in Figure 1 the object 'sales' is defined) and relationships (in Figure 2 the relationship between 'sales', 'quota', and 'base'

Figure 1.
Questionnaire card for variable elicitation [9]

Figure 2. Questionnaire card for relationship elicitation [9]

is drawn.)

Figure 3 illustrates the use of questionnaires to elicit uncertainties about particular inferences an expert has reported. Questionnaires are particularly appropriate for eliciting uncertainty information, since normal verbal responses from people are not very reliable in revealing this kind of information. People are not good at estimating probabilities; they overestimate low ones and underestimate high ones [10]. Consequently, simply asking for probabilities in an interview is not effective.

Eliciting probability estimates using pre-formatted response scales can yield much more accurate estimates. There are two preferred formats: 1) the bar on which the expert indicates a point to reflect uncertainty, and 2) a five-point verbal scale on which the expert marks the word most closely associated with his/her certainty. The five-point scale is one taken from Meister's compilation of verbal scales possessing equal spacing and reliability of measurement [11].

Figure 3. Response scale for uncertainty elicitation [9]

2.1.3 Observation of task performance. Often the best way to discover how an expert makes a judgment, diagnosis, or design decision is to watch the expert work at a real problem. In this situation, the knowledge engineer has several ways of discovering the objects, relationships, and inferences that the expert is using. The first decision that must be made is how to record the expert's performance. One possibility is simply to watch, take notes, and try to follow the expert's thinking process on the fly. A second possibility is to videotape the process for later review with the expert.
In choosing between these methods, remember that the first method suffers from time pressure and observer bias, while the second relies on the expert's less than perfect ability to recall the reasons underlying his/her performance.

2.1.4 Protocol analysis. A close cousin of simple observation is protocol analysis. As in observation, above, the expert engages in normal task behavior with particular typical problems. In addition to video-recording the session and annotating the behaviors after the fact, the knowledge engineer asks the expert to 'think out loud' while performing the task. The expert is to answer the questions, "What are your goals?" "What are your methods?" "What are you seeing?" Inference is then drawn from a transcript of this session about the objects, their relationships, and the inferences the expert was drawing moment by moment.

The advantage of this method over the annotated silent task performance of the observation method is that there is no delay between the act of thinking of something and reporting it. But protocol analysis is not appropriate for all kinds of tasks. Ericsson and Simon [12] carefully detail the kinds of tasks for which a thinking-out-loud protocol might be an acceptable, useful kind of data. To summarize, those tasks for which verbalization is a natural part of thinking are those for which we can take thinking-out-loud as data. That is, if verbal information is produced while someone makes inferences to him/herself, or in identifying salient features of the objects in the situation, then the information from the protocols is acceptable data. However, there are other kinds of tasks, those for which some idiosyncratic language is used in the process (e.g. in composing music, composers often have a special language for the parts of the piece they are writing or the section of the style they are instantiating currently), for which the process of thinking-out-loud and explaining might be distorted or even wrong.
And, of course, there are tasks for which there is no natural verbalization; perceptual-motor tasks are examples of these. Verbalization of perceptual-motor tasks makes someone attend to aspects not normally attended to, and the attention required to report on the process usurps resources normally devoted to the task itself.

Once obtained, protocols must be analyzed. The goal in obtaining a protocol lies in identifying the kinds of objects the expert sees, the relationships that exist between them, and the kinds of inferences drawn from the relationships seen. For example, in the protocol in Figure 4, the problem solver is trying to find a solution to the 'cryptarithmetic problem' DONALD+GERALD=ROBERT. In cryptarithmetic, each letter can be mapped into one digit such that the arithmetic operations apply. The problem solver has been given the fact that D=5. In the protocol, the analyst looks for the change in focus of attention: it moves from working forwards ("each D is 5; therefore, T is zero ... Now do I have any other Ts? No, but I have another D"). Later it reveals some working backwards ("... Since R is going to be an odd number and D is 5, G has to be an even number"). One sees a shift in goals and subgoals, the kinds of things the problem solver is paying attention to, and the kinds of inferences made [13].

Figure 4. Cryptarithmetic problem [13]

2.1.5 Interruption analysis. One way of preserving the natural thought process of the problem solver is to let him/her proceed without thinking aloud. But, when the process gets to a point where the observer can no longer understand the expert's thought processes, the observer interrupts. At that point, the observer asks in detail why the expert did what he/she did, trying to capture at the moment the focus of attention and the kinds of inferences drawn for the features noticed [14].
This process can be very instructive about the process observed, but, of course, once a process has been interrupted, there is very little chance that it can be restarted. This procedure is likely to give most of its value after the expert system has been coded in its prototype stage, and the expert's performance is being compared to that of the system.

2.1.6 Drawing closed curves. The previous methods attempt to reveal the contents of the thought processes during the solution of existing problems. They highlight the vocabulary the expert uses to identify the objects and their relationships; they highlight the kinds of inferences drawn. These methods are free of assumptions about the form of the relationships among the items, be they lists or tables or networks or physical space. In contrast, the method of drawing closed curves is a specialized method for indicating the relationships among those objects that can be assumed to be encoded in a physical space representation.

In the method of drawing closed curves, the expert is asked to indicate which of a collection of physical objects 'go together', to draw the related objects in a closed curve. This technique is applicable to any spatial representation, such as a typeset formula, an x-ray or CAT scan, or a position on a game board.

For example, a Go expert drew closed curves around the stones on a position in the chess-like game of Go [5]. Figure 5 illustrates several aspects of his responses. Four positions are displayed, A-D. Inside each stone is a number which represents the ordinal position in which that stone was placed on the board in a recall task. Note that the order matches the closed curves to a remarkable degree; all stones of a chunk are recalled before moving on to another chunk. Furthermore, on the right side of the Figure are shown three successive recall trials of the same position.
As above, the numbers indicate the order in which the stones were placed in the recall trial. It is noteworthy that groups of stones, indicated by closed curves on a separate occasion, are recalled consistently together before other groups are recalled. This regularity of behavior suggests the validity and reliability of the information contained in the originally drawn closed curves.

2.1.7 Inferential flow analysis. A variant on the interview is the method called inferential flow analysis. In this method, answers to particular questions about causal relations are used to build up a causal network among concepts or objects in the domain of expertise. Salter [15] used this technique to uncover laymen's models of economics.

This technique begins with a list of some of the key objects in the domain of expertise. In Salter's case, this list contained such items as business borrowing, personal savings rate, productivity, etc. Salter then asked an interviewee a series of pointed questions about the relationship between two of the objects. For example, he would ask, "What is the relationship between savings rate and inflation?" Answers revealed the linkages among items between these two key objects and the direction of the relationships. For example, the person might respond, "If inflation goes up, savings rates will go down, because savings interest rates are lower than the amount one can save by buying now instead of later." These two items are then linked in an inverse relationship, in which purchasing is an intervening variable.

Responses to a set of questions should uncover some consistency in the relationships between intervening variables. Each time an item is mentioned in an answer it is linked with the other items in the answer, with the links labelled positive or negative as indicated. Linked items are joined into one all-inclusive network of relations.
At the first mention of a link, a standard weight is given to the link, e.g. 0.50. With each succeeding mention the link is raised in strength by some proportion of the remaining strength between the current value and 1.00. The resulting relations are displayed as a network, such as the one displayed in Figure 6.

Although this technique appears somewhat ad hoc, the resulting networks have been shown to be both stable and consistent with other sets of behaviors [15]. This technique is simple to apply and powerful as a tool for displaying to the expert aspects of the expertise he/she has uncovered to that point. This display can be used effectively as a stimulus for further interviews.

Figure 5. Closed curves drawn by Go expert [5]

Figure 6. Inferential flow network

2.2 Indirect methods

All of the previously described methods ask the expert directly what he/she knows. They rely on the availability of the information to both introspection and articulation. Of course, it is not always the case that the expert has access to the details of his/her knowledge or mental processing. In fact, it is not uncommon for experts to perceive complex relationships or come to sound conclusions without knowing exactly how they did it. In these cases, indirect knowledge elicitation methods are required.

In all the following methods, experts are not asked to express their knowledge directly. Instead, they are given a variety of other tasks, e.g. to rate how similar these two objects are, or to recall all these objects several times from several starting points. From the results, the analyst then infers underlying structure among the objects rated or recalled. All the indirect methods discussed here have been validated in experimental studies that have convincingly demonstrated their psychological validity.
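The link-weighting rule described for inferential flow analysis in section 2.1.7 can be sketched in a few lines. This is only a minimal sketch under stated assumptions: the text says each repeated mention raises a link by "some proportion" of the remaining strength, so the proportion of 0.5 used here is an assumption, and the function names and the list of mentions are hypothetical illustrations rather than Salter's data or procedure.

```python
def strengthen(weight, proportion=0.5):
    """Move a link weight part of the way from its current value toward 1.00."""
    return weight + proportion * (1.0 - weight)


def build_network(mentions, initial=0.5, proportion=0.5):
    """Accumulate link weights from (item_a, item_b, sign) mentions.

    The first mention of a link gives it the standard weight `initial`;
    every later mention strengthens it. `proportion` is an assumed value.
    """
    links = {}
    for a, b, sign in mentions:
        first, second = sorted((a, b))   # treat A-B and B-A as the same link
        key = (first, second, sign)
        if key in links:
            links[key] = strengthen(links[key], proportion)
        else:
            links[key] = initial
    return links


# Hypothetical mentions extracted from an interviewee's answers.
mentions = [
    ("inflation", "savings rate", "-"),
    ("inflation", "purchasing", "+"),
    ("savings rate", "inflation", "-"),
    ("inflation", "savings rate", "-"),
]
network = build_network(mentions)
for (a, b, sign), weight in sorted(network.items()):
    print(f"{a} --({sign})-- {b}: {weight:.3f}")
```

With these invented mentions, the inflation/savings rate link is mentioned three times and climbs from 0.50 to 0.75 to 0.875, while the link mentioned once stays at the standard weight.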
These different techniques make different assumptions about the form of the underlying representation, whether it is physical space, lists, networks, or tables, etc. It is important to use only those methods for which the assumptions fit the analyst's best guess as to what form the expert's underlying representation is in. This information can be gleaned from initial interviews with the expert as well as from careful questioning and noting of object names and/or notations the expert makes.

Figure 7. Similarity judgment matrix

2.2.1 Multidimensional scaling. Multidimensional scaling (MDS) is a technique that should be used only on data that are assumed to have come from stored representations of physical n-dimensional space [16]. The subject provides similarity judgments on all pairs of objects or concepts in the domain of inquiry. These judgments are assumed to be symmetric and graded; i.e. A is as similar to B as B is to A, and the similarities are assumed to take on a variety of continuous values, not just 0 or 1. The scaling technique produces a layout of the items in space.

All objects of a particular target domain are paired with all other objects in a set of queries to the expert: "How similar are A and B?" These similarity judgments are arrayed in a half-matrix such as that shown in Figure 7. The matrix in Figure 7 is part of a matrix that compares pairs of common farm and wild animals. This matrix is then the input to an analysis program which searches for the best placement of these objects in a space of user-specified dimension. Each dimensional solution has a 'stress' associated with it, a measure of the deviation from a perfect fit. The analyst looks for solutions with low 'stress'. Those using fewer dimensions are then plotted. (For higher dimensions, some of the more illuminating two-dimensional projections can be drawn).
The analyst then examines the plots to judge the 'best' placement of the axes and a plausible labelling for them. In Figure 8, the solution to a fuller similarity matrix of the form in Figure 7, the two axes might be recognized as 'size' (the abscissa) and 'ferocity' (the ordinate).

Figure 8. Multidimensional scaling solution in two dimensions

This technique is good for producing a diagram that the expert may then inspect and describe in more detail. It can reveal interesting clusters of objects, neighbor relations, and outliers. One difficulty with this technique, however, is the tedium of collecting the required pair-wise similarity judgments; for n objects, n(n-1)/2 judgments are required, a number that quickly grows to the hundreds and thousands with more than a few objects. Furthermore, it is difficult for the analyst to find the dimensionality with the best 'stress' value, and then perceive the best placement of the axes and the axes' names. Using the technique is straightforward; interpreting the results is not.

2.2.2 Johnson hierarchical clustering. Like MDS, hierarchical clustering begins with a half-matrix of similarity judgments. The assumptions for this technique, however, are in direct contradiction to those for multidimensional scaling. Whereas multidimensional scaling assumes symmetric distances and graded properties, hierarchical clustering assumes merely that an item is or is not a member of a cluster. Judgments are assumed to be a function of the number of nested clusters two items have in common, or the 'height' at which two items become members of the same superordinate category. Items cannot at once satisfy assumptions for both multidimensional scaling and hierarchical clustering [17].

Johnson hierarchical clustering is a fairly simple, straightforward algorithm that starts with the half-matrix of distances and ends with a hierarchical representation of the items.
In broad strokes, pairs of items that are the closest in the matrix are joined into a single cluster, and a new matrix is drawn with this cluster serving as a new 'item'. This new matrix is examined again for that pair of 'items' that is closest together. These are joined as the next new 'item', and a new matrix is drawn. Each time a new matrix is drawn, interitem distances among unclustered items are copied from the original matrix to the new; distances between items and clusters are calculated as either the minimum distance of all cluster items to the item, the maximum, or the average.

For example, using the matrix in Figure 7, the items that are the closest are COW-SHEEP-GOAT. The value in the re-written matrix for the distance between this cluster and PIG, for example, using the minimum joining algorithm, assigns '2' to the distance, the minimum of 2, 3 and 2 (for PIG-GOAT, PIG-COW, and PIG-SHEEP, respectively).

Figure 9 shows the full rewritten matrix with the COW-SHEEP-GOAT cluster now serving as an 'item'. In this matrix, PIG joins COW-SHEEP-GOAT (at 2) as does DOG to RABBIT. This matrix is rewritten into Figure 10. In Figure 10, HORSE joins the ((COW-GOAT-SHEEP)-PIG) cluster at 3, and the two clusters join at 6. The completed hierarchy is shown in Figure 11.

Figure 9. Similarity matrix of Fig. 7 with Cow-Sheep-Goat cluster (rows and columns: Cow-Sheep-Goat, Pig, Horse, Dog, Rabbit)

An advantage of hierarchical clustering is that it can be done with paper and pencil. Unfortunately, it begins with a distance matrix that is just as tedious to collect as that for multidimensional scaling. Furthermore, without some theoretical justification for choosing a particular joining algorithm (the minimum, maximum, or average) one must choose arbitrarily; unfortunately, different algorithms can produce remarkably different hierarchies. In this sense, the analysis is somewhat subjective.
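The joining procedure described above can be sketched as follows. This is a minimal sketch, not Johnson's published algorithm verbatim: it merges one pair per step (so a three-way tie such as COW-SHEEP-GOAT is absorbed in two steps at the same height) and recomputes cluster distances from the original matrix using the chosen joining rule. The function name `johnson_cluster` is hypothetical, and so is most of the distance matrix: only the three PIG distances (2, 3, 2) and the joining heights 2, 3 and 6 come from the text; the remaining values are invented to be consistent with them, because Figure 7 is not reproduced here.

```python
from itertools import combinations
from statistics import mean

ANIMALS = ["COW", "SHEEP", "GOAT", "PIG", "HORSE", "DOG", "RABBIT"]

# Hypothetical half-matrix of distances (see the caveat above).
PAIRS = [
    ("COW", "SHEEP", 1), ("COW", "GOAT", 1), ("SHEEP", "GOAT", 1),
    ("PIG", "COW", 3), ("PIG", "SHEEP", 2), ("PIG", "GOAT", 2),
    ("HORSE", "COW", 3), ("HORSE", "SHEEP", 3), ("HORSE", "GOAT", 4),
    ("HORSE", "PIG", 4),
    ("DOG", "COW", 7), ("DOG", "SHEEP", 6), ("DOG", "GOAT", 6),
    ("DOG", "PIG", 7), ("DOG", "HORSE", 6),
    ("RABBIT", "COW", 8), ("RABBIT", "SHEEP", 7), ("RABBIT", "GOAT", 7),
    ("RABBIT", "PIG", 7), ("RABBIT", "HORSE", 8), ("DOG", "RABBIT", 2),
]
DIST = {frozenset((a, b)): d for a, b, d in PAIRS}


def johnson_cluster(items, dist, linkage=min):
    """Repeatedly join the two closest clusters.

    `linkage` is the joining rule applied to the original item-to-item
    distances between two clusters: min, max, or statistics.mean.
    Returns a list of (height, merged_cluster) in joining order.
    """
    clusters = [frozenset([item]) for item in items]
    merges = []

    def between(c1, c2):
        return linkage([dist[frozenset((a, b))] for a in c1 for b in c2])

    while len(clusters) > 1:
        # Ties are broken alphabetically so the result is deterministic.
        height, c1, c2 = min(
            ((between(a, b), a, b) for a, b in combinations(clusters, 2)),
            key=lambda t: (t[0], sorted(t[1]), sorted(t[2])),
        )
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        merges.append((height, merged))
    return merges


min_heights = [h for h, _ in johnson_cluster(ANIMALS, DIST, min)]
max_heights = [h for h, _ in johnson_cluster(ANIMALS, DIST, max)]
avg_merges = johnson_cluster(ANIMALS, DIST, mean)
print("minimum rule:", min_heights)
print("maximum rule:", max_heights)
```

With the minimum rule and this invented matrix, the joining heights are 1, 1, 2, 2, 3 and 6, matching the heights quoted in the text. With the maximum rule the same data give 1, 1, 2, 3, 4 and 8, a small illustration of how much the choice of joining rule can change the result.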
Footnote: Unfortunately, some people routinely do a multidimensional scaling solution in two dimensions (arbitrarily) and then indicate the nested clusters found in Johnson hierarchical clustering by drawing closed curves around chunk elements. This cannot be done if one adheres to the underlying assumptions of both techniques.

Figure 10. Similarity matrix of Fig. 9 after further clustering (rows and columns: CSG-Pig, Horse, Dog-Rabbit)

Figure 11. Final cluster hierarchy for animal example

2.2.3 General weighted networks. As in the preceding two techniques, the expert gives symmetric distance judgments on all possible pairs of objects. These distances are assumed to arise from the expert traversing a network of associations, a network in which there is a single primary path between every two items, and, for some of them, a differently encoded, secondary path between them as well. A recent investigation using networks is [19].

From the distance matrix, a minimal connected network (MCN) is first formed. This network is formed by connecting the most closely linked items, such as COW-SHEEP in Figure 7. In Figure 12, the solid lines show the resulting network for these items.

Figure 12. MCN and MEN for animals example

Secondly, additional links are added to this tree and the resulting structure is called the minimal elaborated network (MEN). Here, we add a link if and only if it is shorter than the links currently in the network between those two nodes. The dashed lines in Figure 12 are those additional links appropriate to the MEN.

These two structures are then examined for 1) dominating concepts - those that have a large number of connections to many other nodes, and 2) members of cycles - those items that are fully linked into circles.
In Figure 12, SHEEP is a dominating concept, the one with the most primary links; HORSE is somewhat less dominating because, although it has many links, most are from the elaborated network.

Figures 13 and 14 show the results of an exploration of the MCN and MEN for expert and novice pilots, rating a set of terms having to do with ‘split plane concepts’ [20]. Figure 13 illustrates the network for an expert; Figure 14 shows the network for the same concepts held by a novice. Several things were noted in the study: Experts’ structures were simpler than students’. Elaborated links connected integrated larger conceptual structures. Experts could easily identify link relations, using such terms as ‘Affects’, ‘Is-a’, ‘Desirable’, ‘Acceptable’, etc.

The fact that the experts were so clearly different from the novices suggests that this technique can reveal significant aspects of expertise, aspects that clearly should be encoded into an expert system.

Figure 13. MCN and MEN for expert fighter pilots [20]

Figure 14. MCN and MEN for novice fighter pilots [20]

2 SHEEP is taken as the most central item in this ‘tied’ cluster because it is, on average, closer to all other items than either GOAT or COW.

2.2.4 Ordered trees from recall. Ordered trees come from work by Reitman and Rueter [21] in their exploration of how memory organizations differ among experts and novices in a particular domain. Unlike the indirect methods described above, ordered trees begin not with a distance matrix but with recall trials. The technique assumes that objects belong to a cluster or not, similar to the assumption of hierarchical clustering. Unlike hierarchical clustering, however, this technique is built upon a model of how the data are produced by the subject: it assumes that people recall all items from a stored cluster before recalling items from another cluster.
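One consequence of this recall assumption is that a stored cluster should show up as an unbroken run of items in every recall order. The sketch below illustrates only that first step (finding candidate chunks); it does not build the lattice or the ordered tree, and it is not the Reitman-Rueter program. The four recall orders are hypothetical, not those of Figure 15, and the function names are illustrative.

```python
def contiguous_sets(order):
    """All sets of two or more items that sit next to each other in one recall order."""
    n = len(order)
    return {
        frozenset(order[i:j])
        for i in range(n)
        for j in range(i + 2, n + 1)
    }


def chunks(trials):
    """Sets of items that are recalled together (as an unbroken run) on every trial."""
    common = contiguous_sets(trials[0])
    for order in trials[1:]:
        common &= contiguous_sets(order)
    # Smallest chunks first, so nesting is easy to read off.
    return sorted(common, key=lambda c: (len(c), sorted(c)))


# Hypothetical recall orders for the seven animal names.
TRIALS = [
    ["COW", "SHEEP", "GOAT", "PIG", "HORSE", "DOG", "RABBIT"],
    ["DOG", "RABBIT", "PIG", "SHEEP", "GOAT", "COW", "HORSE"],
    ["GOAT", "COW", "SHEEP", "PIG", "HORSE", "RABBIT", "DOG"],
    ["RABBIT", "DOG", "HORSE", "SHEEP", "COW", "GOAT", "PIG"],
]

for chunk in chunks(TRIALS):
    print(sorted(chunk))
```

For these made-up orders the surviving chunks are nested, so they can be drawn directly as a tree; in real data, overlapping chunks and consistent orderings within a chunk need the fuller analysis that the technique provides.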
This assumption builds on data from people recalling from known (learned) organizations. Regularities found over a set of recall orders are assumed to reflect organization in memory.

Figure 15 illustrates four orders of the seven animal names used in previous examples. The expert/subject is asked to recall object names ten to twenty times; to encourage variety, on some of the trials he/she is told which item to begin with. These recall trials are then examined for regularities. All sets of items that are recalled together are identified as chunks, and the chunks are written into a lattice (an ordered inclusion covering relationship). This lattice is then redrawn into an ordered tree structure, such as that in Figure 16, where arrows (either uni-directional or bi-directional) are drawn over the chunk elements that were recalled consistently in a particular order (or, in the case of a bi-directional arrow, one order and its reverse). This analysis can be done by hand, but because it is tedious and open to perceptual error, a computer analysis is best. A program can also perform certain advanced analyses, such as calculating an index of organization and looking for ‘outlier’ trials, without which the tree reveals significantly more structure.

Figure 15. Four animal recall trials with low-level chunks marked

Figure 16. Higher-order animal recall chunks and final ordered tree

This technique has been used in a variety of studies of expert-novice differences. In [6], for example, novice, intermediate, and expert Algol-W programmers were asked to recall Algol keywords many times from many starting points while their performance orders were recorded. Experts differed remarkably from the novices. Experts showed much more organization.
The similarity among the expert structures was also far greater than that among the novices. In [21], furthermore, the pauses between recalls of successive items were accounted for by the number of chunk boundaries crossed in the inferred memory organization. A variety of studies have used this technique to reveal organization in different domains of expertise, all showing a convergence among experts in their organization of the concepts in memory.

Figures 17 and 18 show the ordered trees for one expert and one novice, respectively, in the study of Algol knowledge. The expert clearly understands the function of these special words, grouping the words concerning loops (WHILE-DO, FOR-STEP), the logic items (AND-OR, TRUE-FALSE), and the string representation items (STRING-BITS, LONG-SHORT, REAL). The novice, on the other hand, grouped the short words (AND-OR-OF-FOR), grouped the conditional words (THEN-WHILE, ELSE-IF) in an order not standard to programming, and clustered words into a small scenario having to do with “long and short bits of string” (BITS-STRING-LONG-SHORT). Close examination of the resulting ordered trees can reveal aspects of what the expert ‘sees’ in a situation in his/her domain of expertise.

Figure 17. Ordered tree for expert Algol programmer [6]

2.2.5 Repertory grid analysis. This technique, the last to be presented in this review, is the most complete. It includes an initial dialog with the expert, a rating session, and analyses that cluster both the objects and the dimensions on which the items were rated. Essentially, it is a free-form recall and rating session in which the analyst makes inferences about the relationships among objects and the relatedness of the dimensions the expert pays attention to.

Repertory grid analysis is a technique whose origins are in personal construct theory in clinical psychology [22].
Used in the clinical setting, it was intended to reveal to patients the kinds of attributes they normally or abnormally attend to in their emotional lives. Boose [23] and [24] has adapted it for the explicit development of rules in expert systems.

The initial session begins with an open interview of the expert, asking him/her to name some objects in the domain of expertise. After a small set is generated, the analyst picks three of these objects and asks, “What trait distinguishes any two of these objects from the third?” The expert names a dimension in whatever vernacular is natural. He/she then indicates which objects are ‘high’ on this trait, and which are ‘low’. The analyst records the dimension and assigns a scale value (e.g. 1-3) to the three objects. The analyst then picks three other objects and asks the same question about what trait distinguishes two of these from the third. This process of asking for salient dimensions continues with a significant number of triples, enough so that the analyst is satisfied that he/she has uncovered the major dimensions of similarity and dissimilarity.

Having collected a ‘grid’ with objects at the top and dimensions along the left border, the analyst then asks the expert to fill in all the missing values. That is, all objects are then rated on all dimensions. Figure 19 is an example grid that rates quality of students on nine different elicited dimensions, where each student is rated on a three-point scale for each of the dimensions that the expert generated (taken from [25]).

Figure 18. Ordered tree for novice Algol programmer [6]

Figure 19. Rating grid for seven students [25]

Two further analyses use this grid: clustering of its objects (in this case, students), and clustering of its dimensions. Johnson hierarchical clustering (described above) is used, so a distance matrix is required.
For clustering objects, the distance metric is straightforward: for each pair of objects, sum the absolute differences between the scores the two objects were given on each dimension. Thus, for objects E1 and E2 in our example, the distance is:

2+1+2+2+2+2+2+2+2 = 17

Figure 20 illustrates the distance matrix that arises from this calculation on the ratings in Figure 19. The resulting hierarchy, using the minimum joining method, is shown in Figure 21. This display should be given to the expert for further comment and analysis.

Figure 20. Between-object (student) distance matrix

Figure 21. Student hierarchy

The second analysis done on the original grid examines the similarity of the dimensions. Here the definition of distances is not so straightforward. Since no judgment was ever made as to which end of a scale received a ‘3’ and which a ‘1’, there may be cases where very similar dimensions are highly correlated but were assigned opposite scale values. Therefore, part of the calculation of the distance matrix involves ‘flipping’ appropriate dimensions. This is done in the following manner: first, the full distance matrix is calculated as the actual (not absolute) difference between the ratings of all objects on two dimensions. Above the diagonal are the distances between two dimensions the way they are written; below the diagonal, the second dimension (the ‘row’ dimension) is reversed in scale. That is, comparing C1 and C2 across all objects produces a score of

(C1-C2) = 2+1+2+1+2+1+1 = 10

Next, the C2 values would be ‘flipped’, each scale value 3 translated as a 1, each 1 as a 3. Consequently, comparing C1 with C2’ (flipped) would result in a second score for the same pair of dimensions.

Figure 22 is the full matrix of the distances among all pairs and ‘flipped’ pairs of dimensions from the grid in Figure 19. Since Johnson hierarchical clustering cannot use asymmetric distances, a symmetric half-matrix is needed.
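The two distance calculations can be sketched as follows. The grid below is hypothetical (it is not the grid of Figure 19): the two rating scales were chosen only so that the per-student differences reproduce the 2+1+2+1+2+1+1 = 10 example. The sketch also assumes absolute differences for the between-dimension score, which matches the all-positive terms of the worked sum but is an interpretation of the text, and it assumes that flipping maps a rating r to (scale maximum + 1 - r). Function names are illustrative.

```python
# Hypothetical grid: dimension -> {object: rating on a 1-3 scale}.
GRID = {
    "C1": {"E1": 3, "E2": 2, "E3": 3, "E4": 2, "E5": 3, "E6": 3, "E7": 2},
    "C2": {"E1": 1, "E2": 1, "E3": 1, "E4": 1, "E5": 1, "E6": 2, "E7": 1},
}


def object_distance(grid, a, b):
    """Sum over all dimensions of the absolute rating difference between two objects."""
    return sum(abs(ratings[a] - ratings[b]) for ratings in grid.values())


def dimension_distances(grid, c1, c2, scale_max=3):
    """Distance between two rating scales, as written and with c2 reversed.

    Assumes absolute differences and a flip of r -> scale_max + 1 - r.
    """
    r1, r2 = grid[c1], grid[c2]
    as_written = sum(abs(r1[obj] - r2[obj]) for obj in r1)
    flipped = sum(abs(r1[obj] - (scale_max + 1 - r2[obj])) for obj in r1)
    return as_written, flipped


print(object_distance(GRID, "E1", "E2"))
print(dimension_distances(GRID, "C1", "C2"))
```

Computing `dimension_distances` for every pair of dimensions would fill the two halves of an asymmetric matrix of the kind described above, one half for the scales as written and one for the flipped comparison.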
To translate this into a half-matrix, we select the highest of the two relevant cells, (C1-C2) and (C1-C2’). This resulting half-matrix is shown in Figure 23; the hierarchical clustering solution is shown in Figure 24.

Figure 22. Asymmetric between-concept (rating scale) distances

Figure 23. Symmetric between-concept (rating scale) distances

Figure 24. Rating scale hierarchy

Boose’s expertise transfer system further examines these clustered dimensions to find those that are highly correlated, those that imply one another, and those that are super-ordinate to each other. Rules are obtained in a second interview with the expert, wherein the traits or dimensions elicited in the first phase are reviewed and named. The program then generates production rules. Some rules reflect the correlation between dimensions; some combine dimensional values to predict an end category. Boose [24] reports that several hundred prototype systems have been created using this technique. It seems to be particularly applicable to classification-type problems, where features of a new object (case) are observed and the object is sorted into one of a known set of categories.

The system has several advantages: it produces similarity matrices with a procedure that is much less tedious than directly rating the similarity of all pairs. Furthermore, it has been used to combine the expertise from two different experts in the same domain, and it has been used to combine two experts in different aspects of the same general domain. Individual repertory grids can also be used as a basis for discussing or negotiating disagreements among experts. In this situation, individual grids are generated by each expert; verbal discussion ensues from both experts viewing both grids and clustering solutions. Similarly, grids could be used as an aid in transferring expertise from an expert tutor to a novice.
The novice grid and the expert grid could be compared, and mismatches could determine the focus of additional instruction.

3. General discussion

Experts have stored rich representations of facts, objects and their attributes, as well as a set of inference rules that connect constellations of facts for use in problem-solving situations. The methods collected in this review differentially illuminate these various kinds of knowledge. Table 1 categorizes the twelve methods according to whether they illuminate objects, their relationships, or inference rules. Clearly, in the open, free-form format of the direct techniques, with the exception of drawing closed curves and inferential flow analysis, the knowledge engineer has a chance of finding any kind of information. These techniques are not specifically designed to elicit one particular kind of information over another. Drawing closed curves explicitly illuminates the relationships among objects in the problem space; inferential flow analysis, as indicated in its name, displays the inference chains experts may use to reach conclusions.

All of the direct techniques, however, suffer from the fact that experts cannot always say what they know or how they solve a particular problem. If knowledge engineers confine their knowledge acquisition techniques to these, they run the risk of excluding important kinds of information from their expert systems.

The indirect techniques, however, are more limited in what they can reveal. All of the indirect techniques illuminate particular aspects of the relationships among the objects in the domain of expertise. The repertory grid analysis can also produce inferences based on the correlations among attributes of objects.

The indirect techniques, however, involve assumptions about the underlying form of the representation of objects and their relations.
Table 2 aligns the indirect techniques with each of the forms listed in the introduction: lists, tables, categorical hierarchies, inferential flows (decision trees), networks, physical space, and physical models. The two direct techniques, drawing closed curves and inferential flow analysis, are included here because they similarly make assumptions about the format of the underlying representation. For each technique, the primary assumed form is indicated with an ‘X’, and those forms that can additionally be revealed are marked with a ‘+’. An additional category of form, ‘undefined OBJECT-ATTRIBUTE-VALUE’, is included to account for the primary data elicited in repertory grid analysis.

All of these techniques add richness to the base of information about expertise that can be elicited from interviews. Care should be taken, however, with the amount of faith that is placed in the validity of the displays of knowledge each of these techniques produces. For example, protocol analysis, if applied to a task that is not normally verbalized, may distort the problem solving process and reveal what the expert thinks he/she should report rather than what he/she normally uses. The indirect techniques are more resistant to these distortions because they do not suggest to the expert/subject the ‘right’ way to respond.

Table 1. Kinds of information methods can reveal

Method              Objects   Relationships   Inference rules
Interviews             X            X                X
Questionnaires         X            X                X
Observation            X            X                X
Protocol Analysis      X            X                X
Interruption           X            X                X
Closed Curves                       X
Inferential Flow                                     X
MDS                                 X
Hierarchies                         X
GWN                                 X
Ordered Tree                        X
Repertory Grid         X            X                X

Table 2. Kinds of structures the methods can show
Forms (columns): Lists, Tables, Hierarchies, Flow, Networks, Physical Space, Physical Model
Techniques (rows): MDS, Hierarchies, GWN, Ordered Tree, Rep. Grid, Closed Curves, Infer. Flow

Since indirect techniques are based on assumptions and underlying theories, they can be abused to the extent that their basic assumptions are not met by the data. Multidimensional scaling, for example, can be applied to situations for which there is no reason to assume an underlying n-dimensional space. Johnson hierarchical clustering is too often used to indicate the ‘clusters’ of points on a multidimensional scaling solution. This co-representation is a direct violation of the underlying assumptions; the data cannot simultaneously satisfy both sets of assumptions. General weighted networks assume an underlying network; ordered trees assume a tree structure in which some clusters have prescribed orders of recall imposed on them. Repertory grid analysis assumes that objects are stored with their values on bi-polar dimensions, and connected in networks of dimensional values.

Only one technique makes explicit assumptions about how the expert produces the data. Ordered trees assume that items are stored in nested clusters and that all items of a cluster are recalled before all items of any other cluster. The remainder of the techniques rely loosely on the ability of the subject/expert to make a single-valued ‘similarity’ judgment from whatever the stored form is.

Just as a statistician makes judgments about the suitability of a data set to the assumptions of a proposed analysis, the knowledge engineer must make judgments of the suitability of a method for knowledge elicitation to the kinds of knowledge the expert is assumed to possess. There are a number of ways these techniques can be misapplied for scientific discovery of mental organizations. However, if used as exploratory analyses of the way specific experts may store and use knowledge, these techniques can bring a great deal of information to the knowledge engineer. Using these techniques, knowledge engineers can hope to uncover more of what experts know than is currently accessible through interviewing or introspection.

4. References

[1] P. Harmon and D.
King, Expert Systems: Artificial Intelligence in Business, John Wiley and Sons, New York, 1985.
[2] D.A. Waterman, A Guide to Expert Systems, Addison-Wesley, Reading, Massachusetts, 1986.
[3] W.G. Chase and H.A. Simon, ‘The mind’s eye in chess’, in W.G. Chase (ed.) Visual Information Processing, Academic Press, New York, 1973.
[4] A.D. deGroot, Thought and choice in chess, Mouton Co., Paris, 1965.
[5] J.S. Reitman, ‘Skilled perception in Go: Deducing memory structures from inter-response times’, Cognitive Psychology, 8, 1976, pp. 336-356.
[6] K.B. McKeithen, J.S. Reitman, H.H. Rueter and S.C. Hirtle, ‘Knowledge organization and skill differences in computer programmers’, Cognitive Psychology, 13, 1981, pp. 307-325.
[7] V.L. Patel and G.J. Groen, ‘Knowledge based solution strategies in medical reasoning’, Cognitive Science, 10, 1, 1986.
[8] B. Hayes-Roth and F. Hayes-Roth, ‘A cognitive model of planning’, Cognitive Science, 3, 1979, pp. 275-310.
[9] C.W. Holsapple and A.G. Whinston, Manager’s Guide to Expert Systems, Dow-Jones-Irwin, New York, 1986.
[10] C.H. Coombs, R.M. Dawes and A. Tversky, Mathematical Psychology: An Elementary Introduction, Prentice-Hall, Englewood Cliffs, New Jersey, 1970.
[11] D. Meister, Behavioral Analysis and Measurement Methods, John Wiley and Sons, New York, 1985.
[12] K.A. Ericsson and H.A. Simon, Protocol analysis: Verbal reports as data, MIT Press, Cambridge, Massachusetts, 1984.
[13] A. Newell and H.A. Simon, Human Problem Solving, Prentice-Hall, Englewood Cliffs, New Jersey, 1972.
[14] R. Schweickert, A.M. Burton, N.K. Taylor, E.N. Corlett, N.R. Shadbolt and A. Hedgecock, A comparison of knowledge elicitation techniques for expert systems: A case study in lighting for industrial inspection, Technical Report, Department of Psychology, Purdue University, 1986.
[15] W.J. Salter, ‘Tacit theories of economics’, Proceedings of the 5th Annual Conference of the Cognitive Science Society, Rochester, New York, 1983.
[16] R.N. Shepard, A.K. Romney and S.B. Nerlove, Multidimensional Scaling: Theory and Applications in the Behavioral Sciences, Volume I, Seminar Press, New York, 1972.
[17] E.W. Holman, ‘The relation between hierarchical and Euclidean models for psychological distances’, Psychometrika, 37, 4, 1972, pp. 417-423.
[18] S.C. Johnson, ‘Hierarchical clustering schemes’, Psychometrika, 32, 1967, pp. 241-254.
[19] R.W. Schvaneveldt, F.T. Durso and D.W. Dearholt, Pathfinder: Scaling with network structures, Memoranda in Computer and Cognitive Science, Report No. MCCS-85-9, Computer Research Laboratory, New Mexico State University, 1985.
[20] R.W. Schvaneveldt, M. Anderson, T.J. Breen, N.M. Cooke, T.E. Goldsmith, F.T. Durso, R.G. Tucker and J.C. DeMaio, Structures of memory for critical flight information, Air Force Human Resource Laboratory, Technical Report 81-46, 1982.
[21] J.S. Reitman and H.H. Rueter, ‘Organization revealed by recall orders and confirmed by pauses’, Cognitive Psychology, 12, 4, 1980, pp. 554-581.
[22] G.A. Kelly, The Psychology of Personal Constructs, Norton, New York, 1955.
[23] J.H. Boose, ‘A knowledge acquisition program for expert systems based on personal construct psychology’, International Journal of Man Machine Studies, 23, 1985, pp. 495-525.
[24] J.H. Boose, Expertise Transfer for Expert System Design, Elsevier, New York, 1986.
[25] A. Hart, Knowledge Acquisition for Expert Systems, McGraw-Hill, New York, 1986.

About the authors

Judith Reitman Olson

Judith Olson is an Associate Professor of Computer and Information Systems at the Business School at the University of Michigan, and an Adjunct Associate Professor in the Michigan Psychology Department.
She is the Director of the Human-Computer Interaction Laboratory, a collection of researchers from computer science, computer and information systems, organizational behavior, industrial and operations engineering, technical communication, and psychology. Her work focuses on the design of interfaces for multi-media document preparation and mailing, the analysis of office work with the goal of suggesting computer-based support or automation, and knowledge acquisition for expert systems.

Henry H. Rueter

Dr Rueter is a Senior Systems Analyst at Vector Research, Inc. in Ann Arbor, Michigan, and a Visiting Scholar at the Human-Computer Interaction Laboratory at the University of Michigan. His current research interests include training and expert systems. One of the methods described in this paper, the Reitman-Rueter ordered tree analysis, was the subject of Dr Rueter’s PhD dissertation at Michigan. Prior to his current position, Dr Rueter received the MS degree in mathematics and the MA degree in psychology from Georgia State University. His undergraduate degree was a BS in chemistry from the University of Georgia.