THE EPISTEMOLOGY OF A RULE-BASED EXPERT SYSTEM: A FRAMEWORK FOR EXPLANATION

William J. Clancey
Department of Computer Science
Stanford University, Stanford, CA 94305

Report No. STAN-CS-81-896 (Tech. Report #4), November 1981. Report date on the documentation page: January 1982. Unclassified.
To appear in the AI Journal.

ERRATUM, page 20: the node that is labeled "P M N S>100" should read "P M N S + BNDS < 1000".

Contract No. N00014-79-C-0302, effective March 15, 1979. Expiration Date: March 14, 1979
Total Amount of Contract -- $396,325
Principal Investigator, Bruce G. Buchanan (415) 497-0935
Associate Investigator, William J. Clancey (415) 497-1997

Sponsored by: Office of Naval Research, Personnel and Training Research Programs (Code 458), Psychological Sciences Division, Arlington, VA 22217. Contract Authority No. NR 154-436
Scientific Officers: Dr. Marshall Farr and Dr. Henry Halff
Monitoring agency: ONR Representative, Mr. Robin Simpson, Durand Aeronautics Building, Room 165

The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Office of Naval Research or the U.S. Government.

Approved for public release; distribution unlimited. Reproduction in whole or in part is permitted for any purpose of the United States Government.

Abstract

Production rules are a popular representation for encoding heuristic knowledge in programs for scientific and medical problem solving. However, experience with one of these programs, MYCIN, indicates that the representation has serious limitations: people other than the original rule authors find it difficult to modify the rule set, and the rules are unsuitable for use in other settings, such as for application to teaching. These problems are rooted in fundamental limitations in MYCIN's original rule representation: the view that expert knowledge can be encoded as a uniform, weakly-structured set of if/then associations is found to be wanting. To illustrate these problems, this paper examines MYCIN's rules from the perspective of a teacher trying to justify them and to convey a problem-solving approach.
We discover that individual rules play different roles, have different kinds of justifications, and are constructed using different rationales for the ordering and choice of premise clauses. This design knowledge, consisting of structural and strategic concepts which lie outside the representation, is shown to be procedurally embedded in the rules. Moreover, because the data/hypothesis associations are themselves a proceduralized form of underlying disease models, they can only be supported by appealing to this deeper level of knowledge. Making explicit this structural, strategic and support knowledge enhances the ability to understand and modify the system.

1 Introduction

Production rules are a popular representation for capturing heuristics, "rules of thumb," in expert systems. These artificial intelligence programs are designed to provide expert-level consultative advice in scientific (CRYSALIS--[1]) (R1--[22]) and medical problem solving (EXPERT--[23]) (VM--[19]). MYCIN [24] is generally acknowledged to be one of the forerunners of this research. Shortliffe's and Buchanan's original design goal was to use a simple, uniform formalism, which was easy to modify, to encode heuristics. They cited the importance of providing reasonable explanations, as well as good advice, for the program to be acceptable to users. This approach was in stark contrast to the common Bayesian programs, which did not seek to capture an expert's reasoning steps and so were not easily understood by clients [26]. MYCIN set the standards for much subsequent research.

The success of MYCIN as a problem solver in infectious disease diagnosis, as shown in several formal evaluations [33] [34], suggested that the program's knowledge base might be a suitable source of subject material for teaching students.
This use of MYCIN was consistent with the design goals that the program's explanations be educational to naive users, and that the representation be flexible enough to allow for use of the rules outside of the consultative setting. In theory, the rules acquired from human experts would be understandable and useful to students. The GUIDON program [6] [7] [8] was developed to push these assumptions by using the rules in a tutorial interaction with medical students.

In attempting to "transfer back" the experts' knowledge through GUIDON, we find that the expert's approach and understanding of rules have not been represented. GUIDON cannot justify the rules because MYCIN does not have an encoding of how the concepts in a rule fit together. GUIDON cannot fully articulate MYCIN's problem-solving approach because the structure of the search space and the strategy for traversing it are implicit in the ordering of rule concepts. Thus, the seemingly straightforward task of converting a knowledge-based system into a computer-aided instruction program has led to a detailed re-examination of the rule base and the foundations upon which rules are constructed, an epistemological study.

In building MYCIN, rule authors did not recognize a need to record the structured way in which they were fitting rule parts together. The rules are more than simple associations between data and hypotheses. Sometimes clause order counts for everything (and the order can mean different things), and some rules are present for effect, to control the invocation of others. The uniformity of the representation obscures these various functions of clauses and rules. In looking beyond the surface of the rule representation to make explicit the intent of the rule authors, this paper has a purpose similar to Woods' "What's in a Link?" [32] and Brachman's "What's in a Concept?" [2]. We ask, "What's in a Rule?"
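The claim that clause order can count for everything is easy to see in miniature. The following is a minimal sketch, not MYCIN's interpreter: the `Rule` record, the goal names, and the function `question_order` are all hypothetical, and the code tracks only the order in which a backward chainer would reach askable goals, ignoring truth values and certainty factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A uniform if/then record (hypothetical; not MYCIN's actual format)."""
    name: str
    premise: tuple   # goal names, in the order the author wrote them
    conclusion: str  # goal this rule concludes about


def question_order(goal, rules):
    """Return the order in which askable goals are reached when backward
    chaining on `goal`. A goal with no concluding rule is treated as a
    question for the user."""
    asked, seen = [], set()

    def pursue(g):
        if g in seen:            # each goal is pursued at most once
            return
        seen.add(g)
        concluding = [r for r in rules if r.conclusion == g]
        if not concluding:
            asked.append(g)      # nothing concludes it, so ask the user
            return
        for rule in concluding:          # rules tried in listed order
            for clause in rule.premise:  # clauses pursued in written order
                pursue(clause)

    pursue(goal)
    return asked


# Two records with the same clauses; only the written order differs.
steroids_rule = Rule("R-a", ("infection", "kind-of-meningitis", "steroids"), "organism")
reordered = Rule("R-b", ("steroids", "infection", "kind-of-meningitis"), "organism")

print(question_order("organism", [steroids_rule]))
print(question_order("organism", [reordered]))
```

Both records look equally well formed, yet they produce different questioning behavior. Nothing in the record says why one order was chosen, which is the kind of design knowledge the rest of the paper tries to make explicit.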
The demands of tutoring provide a "forcing function" for articulating the structure of the rule base and the limitations of the program's explanation behavior. These insights have implications for generating and explaining consultative advice, and modifying the system. For the moment, consider that many, if not most or all, of the issues of tutoring must be addressed in providing explanations to naive users. Consider how an expert can violate a rule in difficult, "non-standard" situations, because he can reason about the rule's justification. Finally, consider the difficulties of changing a program whose design is not fully documented.

In building GUIDON, we thought that we were simply being "applications engineers" by making use of MYCIN's explanation facility for a tutorial setting. It was completely surprising to find out how little the facility could accomplish for a student. Without a crisp characterization of what we expected an "explanation" to convey, the program was of questionable value. On the positive side, the study of knowledge that these shortcomings motivated led to a radical change in our conception of MYCIN's rules, and supplied a new epistemological framework for building expert systems, the subject of this paper.

Subsequent sections provide a review of MYCIN's explanatory capability (Section 2), and an overview of an epistemological framework for enhancing that capability (Section 3). The following two sections examine in detail the problems of justifying a rule and explaining an approach, thereby elucidating the support and strategic aspects of the epistemological framework. In Section 6, implications for performance of a consultation system and modifiability are considered briefly. Finally, in Section 7 the framework is used to analyze a variety of other expert systems.

2 MYCIN's Explanation Capability

As already mentioned, MYCIN was designed so that it could explain its reasoning to the user.
This means that it can describe its reasoning steps: how a request for data is related to a goal, how one goal leads to another, and how a goal is achieved. Figure 1 shows a consultation excerpt in which the program demonstrates a line of reasoning (chained goals) that leads backwards from the current question:

    Is the patient a compromised host? (rule 343)
      -> Is the patient immunosuppressed? (rule 395)
      -> Is the patient receiving steroids? (rule 543)
      -> Is e.coli causing the meningitis?

Figure 2 illustrates how, in the questioning session after the consultation, one can inquire further about the program's intermediate reasoning steps, including why it didn't ask about something. These are the explanation capabilities that we sought to exploit in a teaching program.

MYCIN's explanations are entirely in terms of its rules and goals. The question "WHY?" means, "How is this information useful?," translated internally as "In what rule does this goal appear, and what goal does the rule conclude about?" Davis, who developed the explanation facility (Figure 1), pointed out that MYCIN did not have the knowledge to respond to other interpretations of a "WHY" question [11]. He mentioned specifically the lack of rule justifications and planning knowledge, which we have studied and address in this paper.

    23) Is J.Smith a compromised host?
    ** WHY
    [i.e. WHY is it important to determine whether J.Smith is a compromised host?]
    [3.0] This will aid in determining whether J.Smith is immunosuppressed.
          If
            [3.1] J.Smith is not a compromised host
          then
            it is definite (1.0) that J.Smith is not immunosuppressed
          [RULE343]
    [back to question 23...]
    ** WHY
    [i.e. WHY is it important to determine whether J.Smith is immunosuppressed?]
    [4.0] ... in order to determine whether J.Smith has received corticosteroids.
          If
            [4.1] J.Smith is not immunosuppressed
          then
            it is definite (1.0) that J.Smith has not received corticosteroids.
          [RULE395]
    [back to question 23...]
    ** WHY
    [i.e. WHY is it important to determine whether J.Smith has received corticosteroids?]
    [5.0] ... in order to determine the organisms causing the infection.
          It has already been established that
            [5.1] the infection which requires therapy is meningitis, and
            [5.2] only circumstantial evidence is available for this case, and
            [5.3] the type of meningitis is bacterial.
          Therefore, if
            [5.4] J.Smith is receiving corticosteroids
          then
            there is evidence that the organisms causing the infection are:
            e.coli (.4), klebsiella-pneumoniae (.2), or pseudomonas-aeruginosa (.1)
          [RULE543]
    [back to question 23...]
    ** NO

Figure 1. Consultation excerpt showing explanation capability (User input in BOLD ITALICS, shown here after the "**" prompt; all other text is printed by MYCIN.)

    ** DID YOU CONSIDER WHETHER THE PATIENT WAS RECEIVING STEROIDS?
    RULE543 could have used: whether J.Smith has received corticosteroids.
    Clause 4 of RULE543 (the patient has received corticosteroids) was already known to be false.
    ** HOW DID YOU KNOW THAT THE PATIENT HAD NOT RECEIVED STEROIDS?
    RULE395 was used to conclude that J.Smith has not received corticosteroids.
    The last question asked before the conclusion was made was 23.
    ** WHAT WAS QUESTION 23?
    Question 23 was asked in order to find out whether J.Smith is a compromised host in an effort to execute RULE343.

Figure 2. Excerpt from question/answer session (User input appears in BOLD ITALICS, shown here after the "**" prompt.)

In order to illustrate other meanings for the question "Why?" in MYCIN, we depict the rule set as a network of goals, rules and hypotheses (Figure 3). At the top level are all of the system's goals that it might want to pursue to solve a problem (diagnostic and therapeutic decisions). Examples of goals are: "What is the shape of the organism?" "What organism is causing the meningitis?"
At the second level are hypotheses or possible choices for each of the goals. Examples of hypotheses are: "The organism is a rod." "E.coli is causing the meningitis." At the third level are the rules that support each hypothesis (this is a graph because a rule might support more than one hypothesis). At the fourth level appear the premises of these rules--specific hypotheses that must be believed for the rule to apply (multiple rules might predicate the same hypothesis). For example, for rule 543 to apply, it must be the case that the infection is meningitis, that the meningitis be caused by bacteria, that the patient be receiving steroids, and so on.

    Level                                                                link to next level
    1  Goal         What cause of inf?    What therapy?
                                                                         "more specific"
    2  Hypothesis   e.coli    cryptococcus
                                                                         "concluded by"
    3  Rule         Rule543   Rule535
                                                                         "predicates"
    4  Hypothesis   meningitis   bacterial   steroids   alcoholic
                                                                         "more general"
    5  Goal         What infection?   What kind of meningitis?   Steroids?   Alcoholic?

Figure 3. Rule set shown as a network linking hypotheses and goals

A key aspect of MYCIN's interpreter is that, when confronted with a hypothesis in a rule premise that it needs to confirm, it considers all related hypotheses by pursuing the more general goal. For example, attempting to apply rule 543, the program will consider all rules that conclude about the infection, rather than just those that conclude that the infection is meningitis. Similarly, it will consider all rules that conclude about the kind of meningitis (viral? fungal? tb? bacterial?), rather than just those that hypothesize that the meningitis is bacterial.¹ These new goals deriving from rules can now be seen conceptually as level 1 goals and the process recurs.

¹ This is not inefficient, given the program's exhaustive search strategy and the fact that the other hypotheses will be referenced by other rules. Note also that some hypotheses, such as "the patient is receiving steroids," are not generalized, but represented as goals directly. Whether a hypothesis is represented as a "yes/no parameter" or as a "value" of a "multi-valued parameter" (such as "kind of meningitis") is a rule-author decision, deriving from a pattern of hypotheses that he wishes to collapse for clarity into a more general goal. By this process of abstraction, a single "multi-valued parameter" dealing with kinds of surgery would replace individual "yes/no parameters" that specified "cardiac surgery," "neurosurgery," etc. These organizational decisions have no bearing on system performance, so the knowledge base is somewhat inconsistent in how these choices are made.

The links in Figure 3 and their ordering are points of flexibility in the rule representation. For example, the rule author defines each goal and its specific hypotheses (levels 1-2 and 4-5). Less trivially, it is the rule author's choice to define rules that link hypotheses to one another (rules on level 3 link levels 2 and 4). We call the rationale behind this link the justification of the rule. GUIDON cannot teach rule justifications because they are not represented in MYCIN. Section 4 examines the nature of rule justifications and how a tutoring system can provide them.

Next, by the rule author's ordering of hypotheses in a rule's premise, he will affect the order in which goals are pursued (level 5). The rationale for this choice again lies outside of the rule network. Thus, the program cannot explain why it pursues meningitis (goal 5.1 in Figure 1) before determining that the infection is bacterial (goal 5.3). Section 5 examines how this ordering constitutes a strategy and how it can be made explicit.

The order in which rules for a goal are tried (level 3) also affects the order in which hypotheses and hence subgoals are pursued (level 5). For example, rule 535 considers whether the patient is an alcoholic, so if this rule is tried before rule 543, alcoholism will be considered before steroids. As these goals lead to asking questions of the user, the ordering of questions is determined by rule order as well as by clause order. Here there is no implicit author rationale, for rule order lies outside of his choice; it is fixed, but completely arbitrary. As pointed out above, MYCIN does not decide to pursue the hypothesis "bacterial meningitis" before "viral meningitis"--it simply picks up the bag of rules that make some conclusion about "kind of meningitis" and tries them in numeric order. Hence, rule order is the answer to the question "Why is one hypothesis (for a given goal) considered before another?" And rule order is often the answer to "Why is one question asked before another?" Focusing on a hypothesis and choosing a question to confirm a hypothesis are not necessarily arbitrary in human reasoning. This raises serious questions about using MYCIN for interpreting a student's behavior and teaching him how to reason, which are also discussed in Section 5.

To summarize, we have used a rule network as a device for illustrating aspects of MYCIN's behavior which it cannot explain. We are especially interested in making explicit the knowledge that lies behind the behavior that is not arbitrary, but which cannot be explained because it is implicit in rule design, particularly the nature of the rule link between hypotheses and the ordering of hypotheses in rule premises. To do this, we will need some sort of framework for characterizing the knowledge involved, since the rule link itself is not sufficient. An epistemological framework for understanding MYCIN's rules is presented in the next section.

3 An Epistemological Framework for Rule-based Systems

The framework presented in this section stems from an extensive study of MYCIN's rules.
It is the basic framework that we have used for understanding physicians' explanations of their reasoning, as well as being a foundation for re-representing the knowledge in MYCIN's rules. As an illustration, we will consider in detail the "steroids rule" (Figure 4).²

    RULE543
    If:   1) The infection which requires therapy is meningitis,
          2) Only circumstantial evidence is available for this case,
          3) The type of the infection is bacterial,
          4) The patient is receiving corticosteroids,
    Then: There is evidence that the organisms which might be causing the infection are
          e.coli (.4), klebsiella-pneumoniae (.2), or pseudomonas-aeruginosa (.1)

Figure 4. The steroids rule.

² The English form of rules stated in this paper has been simplified for readability. Sometimes clauses are omitted. Medical examples are for purposes of illustration only.

Figure 5 shows how this diagnostic heuristic is justified and incorporated in a problem-solving approach by relating it to strategic, structural, and support knowledge. Recalling Figure 3, we use the term strategy to refer to a plan by which goals and hypotheses are ordered in problem solving. A decision to determine "cause of the infection" before "therapy to administer" is a strategic decision. Similarly, a decision to pursue the hypothesis "e.coli is causing meningitis" before "cryptococcus is causing meningitis" is strategic. And recalling an earlier example, deliberately deciding to ask the user about steroids before alcoholism would be a strategic decision. These decisions all lie above the plane of goals and hypotheses, and as shown in Section 5, they can often be stated in domain-independent terms. "Consider differential-broadening factors" is an example of a domain-independent statement of strategy.

In order to make contact with the knowledge of the domain, a level of structural knowledge is necessary.
Structural knowledge consists of abstractions that are used to index the domain knowledge. For example, one can classify causes of disease into common and unusual causes, as there are common and unusual causes of bacterial meningitis. These concepts provide a handle by which a strategy can be applied, a means of referencing the domain-specific knowledge. For example, a strategy might specify to consider common causes of a disease; the structural knowledge about bacterial meningitis allows this strategy to be instantiated in that context. This conception of structural knowledge follows directly from Davis' [12] technique of content-directed invocation of knowledge sources. A handle is a means of indirect reference and is the key to abstracting reasoning in domain-independent terms. The discussion here (particularly Section 7) elaborates upon the nature of handles and their role in the explanation of reasoning.

The structural knowledge we will be considering is used to index two kinds of hypotheses (as the term was used in Section 2): problem features, which describe the problem at hand (for example, whether or not the patient is receiving steroids is a problem feature), and diagnoses, which characterize the cause (disorder or disease) of the observed problem features (for example, acute meningitis is a diagnosis). In general, problem features appear in the premise of diagnostic rules and diagnoses appear in the conclusion. Thus, organizations of problem features and diagnoses provide two ways of indexing rule associations: one can use a strategy that brings certain diagnoses to mind, and consider rules that support those hypotheses; or one can use a strategy that brings certain problem features to mind, gather that information, and draw conclusions (apply rules) in a data-directed way.
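The two ways of indexing rule associations can be sketched as follows. This is an illustration under stated assumptions, not a description of MYCIN or GUIDON: the `Rule` record, the child-to-parent dictionaries, the category labels, and the placeholder `RULE-X` are all hypothetical, with the steroids rule abstracted to a single problem feature and a single diagnosis class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    features: tuple   # problem features tested in the premise
    diagnoses: tuple  # diagnoses supported by the conclusion


# Child -> parent links (hypothetical labels, loosely following the steroids example).
FEATURE_PARENT = {
    "receiving-steroids": "compromised-host-risk-factor",
}
DIAGNOSIS_PARENT = {
    "gram-negative-rods": "unusual-causes-of-bacterial-meningitis",
    "unusual-causes-of-bacterial-meningitis": "bacterial-meningitis",
}

RULES = (
    Rule("RULE543", ("receiving-steroids",), ("gram-negative-rods",)),
    Rule("RULE-X", ("hypothetical-feature",), ("hypothetical-diagnosis",)),
)


def falls_under(item, category, parent):
    """True if `item` is `category` or one of its descendants."""
    while item is not None:
        if item == category:
            return True
        item = parent.get(item)
    return False


def data_directed(rules, feature_class):
    """Rules brought to mind by a class of problem features (premise side)."""
    return [r.name for r in rules
            if any(falls_under(f, feature_class, FEATURE_PARENT) for f in r.features)]


def hypothesis_directed(rules, diagnosis_class):
    """Rules that support some diagnosis in a class (conclusion side)."""
    return [r.name for r in rules
            if any(falls_under(d, diagnosis_class, DIAGNOSIS_PARENT) for d in r.diagnoses)]


print(data_directed(RULES, "compromised-host-risk-factor"))
print(hypothesis_directed(RULES, "unusual-causes-of-bacterial-meningitis"))
```

A strategy stated in domain-independent terms, such as "consider unusual causes," only needs the category name as a handle; the hierarchies supply the domain-specific rules it refers to.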
Figure 5 shows how a rule model or generalized rule,³ as a form of structural knowledge, enables both data-directed consideration of the steroids rule (via compromised host risk factors) and hypothesis-directed consideration (via unusual causes of meningitis). Illustrated are partial hierarchies of problem features (compromised host factors) and diagnoses (kinds of infections, meningitis, etc.)--a typical form of structural knowledge. The specific organisms of the steroids rule are replaced by the set "gram-negative rods," a key hierarchical concept we use for understanding this rule.

Finally, the justification of the steroids rule, a link between the problem feature hypothesis "patient is receiving steroids" and the diagnostic hypothesis "gram-negative rod organisms are causing acute bacterial infectious meningitis," is based on a causal argument about steroids impairing the body's ability to control organisms that normally reside in the body. While this support knowledge is characteristically low-level or narrow in contrast with the strategic justification for considering compromised host risk factors, it still makes interesting contact with structural terms, such as the mention of enterobacteriaceae (kinds of gram-negative rod organisms). In the next section, we will consider the nature of rule justifications in more detail, illustrating how structural knowledge enables us to make sense of a rule by tying it to the underlying causal process.

³ Davis' rule models [13], generated automatically, capture patterns, but they do not restate rules more abstractly as we intend here.
    (STRATEGY)        ESTABLISH HYPOTHESIS SPACE:
                      CONSIDER DIFFERENTIAL-BROADENING FACTOR

    (RULE MODEL)      IN BACTERIAL MENINGITIS, COMPROMISED HOST RISK FACTORS
                      SUGGEST UNUSUAL ORGANISMS

    (STRUCTURE)       Diagnosis hierarchy (nodes in reading order):
                        ANY-DISORDER, INFECTION, MENINGITIS, ACUTE, CHRONIC,
                        BACTERIAL, VIRAL, UNUSUAL-CAUSES, SKIN ORGS
                      Problem feature hierarchy (nodes in reading order):
                        COMPROMISED HOST, CURRENT MEDICATIONS

    (INFERENCE RULE)  if STEROIDS then GRAM-NEGATIVE ROD ORGS

    (SUPPORT)         STEROIDS IMPAIR IMMUNO-RESPONSE, MAKING PATIENT SUSCEPTIBLE TO
                      INFECTION BY ENTEROBACTERIACEAE, NORMALLY FOUND IN THE BODY.

Figure 5. Knowledge for indexing, justifying, and invoking a MYCIN rule

4 Justifying a Rule

Here we consider the logical bases for rules: What kinds of arguments justify the rules and what is their relation to a mechanistic model of the domain? We use the terms "explain" and "justify" synonymously, though the sense of "making clear what is not understood" (explain) is intended more than "vindicating, showing to be right or lawful" (justify).

4.1 Different Kinds of Justifications

There are four kinds of justifications for MYCIN's rules: identification, causal, world fact, and definition. In order to explain a rule, it is first necessary to know what kind of justification it is based upon.

Rules that use identifying properties of an object to classify it are called identification rules. Most of MYCIN's rules that use laboratory observations of an unknown are like this: "If the organism is a gram-negative, anaerobic rod, its genus may be bacteroides (.6)." Thus, an identification rule is based on the properties of a class.

Rules whose premise and action are related by a causal argument are called causal rules. The causality can go in either direction in MYCIN rules: "symptom caused by disease" or (more commonly) "prior problem causes disease".
Szolovits suggests that it is possible to subdivide causal rules according to the scientific understanding of the causal link: empirical association (a correlation for which the process is not understood), complication (direction of causality is known, but the conditions of the process are not understood), and mechanistic (process is well-modeled) [28]. The CFs (certainty factors) in MYCIN's causal rules generally represent a mixture of probabilistic and cost/benefit judgment.

Rules that are based on empirical, common sense knowledge about the world are called world fact rules. An example is, "If the patient is male, then the patient is not pregnant or breast feeding." Other examples are based on social patterns of behavior, such as the fact that a young male might be a military recruit (and so living in a crowded environment where disease spreads readily).

Domain fact rules link hypotheses on the basis of domain definitions. An example is "If a drug was administered orally and it is poorly absorbed in the GI tract, then the drug was not administered adequately." (By definition, to be administered adequately a drug must be present in the body at high enough dosage levels.) By using domain fact rules, the program can relate problem features to one another, reducing the amount of information it has to request from the user.

In summary, a rule link captures class properties, social and domain facts, and probabilistic and cost/benefit judgments. When a definition, property or world fact is involved, simply saying this provides a reasonable explanation. But causal rules, with their connection to an underlying process of disease, require much more, so we will concentrate on them.

4.2 Levels of explanation -- what's not in a rule?

In this section we will consider the problem of justifying a causal rule, the tetracycline rule, "If the patient is less than 8 years old, don't prescribe tetracycline."
This rule simply states one of the things that MYCIN needs to know to properly prescribe drugs for youngsters. The rule does not mention the underlying causal process (chelation--drug deposition in developing bones) and the social ramifications (blackened permanent teeth) upon which it is based. From this example, it should be clear that the justifications of MYCIN's rules lie outside of the rule base. In other words, the record of inference steps that ties premise to action has been left out. A few issues need to be raised here: Did the expert really leave out steps of reasoning? What is a justification for? And what is a good justification?

Frequently we refer to rules like MYCIN's as "compiled knowledge." However, when we ask physicians to justify rules that they believe and follow, they very often can't explain why they are correct, or rationales are so long in coming and so tentative that it is clear we are not being told reasoning steps that are consciously followed. Leaps from data to conclusion are justified because the intermediate steps (like the process of chelation and the social ramifications) generally remain the same from problem to problem. There is no need to step through this knowledge--to express it conditionally in rules. Thus, for the most part, MYCIN's rules are not compiled in the sense that they represent a deliberate composition of reasoning steps by the rule authors. They are compiled in the sense that they can be expanded (decomposed).

If an expert does not think about the reasoning steps that justify a rule, why does a student need to be told about them? One simple reason why a student needs a justification for a rule is so that he can remember the rule. A justification can even serve as a memory aid (mnemonic) without being an accurate description of the underlying phenomena.
For example, medical students have long been told to think in terms of "bacteria eating glucose," from which they can remember that low CSF glucose is a sign of a bacterial (as opposed to fungal or viral) meningitis. The interpretative rule is learned by analogy to a familiar association (glucose is a food and bacteria are analogous to larger organisms that eat food). This explanation has been discredited by biological research, but it is still a useful mnemonic.

Given that an accurate causal argument is usually expected, how is a satisfying explanation constructed? To see the difficulty here, observe that, in expanding a rule, there is seemingly no limit to the details that might be included. Imagine expanding the tetracycline rule by introducing three intermediate concepts:

tetracycline in youngster => chelation of the drug in growing bones => teeth discoloration => undesirable body change => don't administer tetracycline

The choice of intermediate concepts is arbitrary, of course. For example, there is no mention of how the chelation occurs. What are the conditions? What molecules or ions are involved? There are levels of detail in a causal explanation. To explain a rule, we not only need to know the intermediate steps, we need to decide which steps in the reasoning need to be explained. Purpose (how deep an understanding is desirable) and prior knowledge are obviously important.

Conceptually, the support knowledge for a causal rule is a tree of rules, where each node is a reasoning step that can theoretically be justified in terms of finer-grained steps. The important thing to remember is that MYCIN is a flat system of rules. It can only state its immediate reasoning steps, and cannot explain them on any level of detail.

4.3 Problem features, the hypothesis taxonomy, and rule generalizations

A tree of rules seems unwieldy.
Surely most teachers cannot expand upon every reasoning step down to the level of the most detailed physical knowledge known to man. (The explanation tree for the tetracycline rule quickly gets into chemical bonding theory.) Explaining a rule (or understanding one) does not require that every detail of causality be considered. Instead, a relatively high level of explanation is generally satisfying -- you probably felt satisfied by the explanation that tetracycline causes teeth discoloration. This level of "satisfaction" has something to do with the student's prior knowledge.

For an explanation to be satisfying, it must make contact with already known concepts. We can characterize explanations by studying the kinds of intermediate concepts they use. For example, it is significant that most contraindication rules, reasons for not giving antibiotics, will refer to "undesirable body changes." This pattern is illustrated hierarchically below. (The first level gives types of undesirable changes; the second level gives causes of these types.)

undesirable body changes
    types:  photosensitivity, diarrhea, nausea, teeth discoloration, ...
    causes: tetracycline -> teeth discoloration; drug x -> (another type)

Figure 7. Problem feature hierarchy for contraindication rules

Notice that this figure contains the last step of the expanded tetracycline rule, and a leap from tetracycline to this step. The pattern connecting drugs to the idea of undesirable body changes forms the basis of an expectation for explanations: we will be satisfied if a particular explanation connects to this pattern. In other words, given an effect that we can interpret as an undesirable body change, we will understand why a drug causing that effect should not be given. We might want to know how the effect occurs, but here again, we will rest easy on islands of familiarity, just as we don't feel compelled to ask
why people don't want black teeth.

To summarize, key concepts in rule explanations are abstractions that connect to a pattern of reasoning we have encountered before, premises that we readily accept. This suggests that one way to explain a rule, to make contact with a familiar reasoning pattern, is to generalize the rule. We can see this more clearly from the viewpoint of diagnosis, which makes rich use of hierarchical abstractions.

Consider the rule fragment, "if a complete blood count is available and the white blood count is less than 2.5 units, then the following bacteria might be causing infection: e.coli (.75), pseudomonas-aeruginosa (.5), klebsiella-pneumoniae (.6)." How can we explain this rule? First, we generalize the rule (Figure 8).

Level 3:  compromised host condition   ==>  bacteria normally found in the body cause infection
Level 2:  immunosuppression condition  ==>  gram-negative rods and enterobacteriaceae
Level 1:  leukopenia                   ==>  E.coli, Pseudomonas, and Klebsiella
Data:     WBC < 2.5, PMNS + BNDS < 1000 (components of CBC data)
Link labels shown in the figure: "causes" (from pregnancy and steroids), "subtype", "subset", "evidence", "is a", "component of"

Figure 8. Generalizations of the leukopenia rule

The premise concepts in the rules on levels 1 through 3 are problem features (recall discussion in Section 3), organized hierarchically by different kinds of relations. Generally, a physician speaks loosely about the connections -- referring to "leukopenia" both as a cause of immunosuppression as well as a kind of immunosuppression -- probably because the various causes are thought of hierarchically. The similarity of this part of the diagram to Figure 7 should be evident.

The relationships among CBC, WBC and leukopenia reveal some interesting facts about how MYCIN's rules are constructed. WBC is one component of a "complete blood count" (CBC). If the CBC is not available, it makes no sense to ask for any of the components.
Thus, the CBC clause in the leukopenia rule is an example of a screening clause. Another example of a screening clause is the age clause in the rule fragment, "if ... age is greater than 17 and the patient is an alcoholic, then ...." Here the relation is a social fact; if the patient is not an adult, we assume that he is not an alcoholic. The third relation we observe is subtype, as in "if ... the patient has undergone surgery and the patient has undergone neurosurgery, then ...."

All screening relations can be expressed as rules, and some are, such as "If the patient has not undergone surgery, then he has not undergone cardiac surgery" (stated negatively, as is procedurally useful). MYCIN's rule set is inconsistent in this respect; to be economical and make the relationship between clauses explicit, all screening clauses should be expressed as rules. Indeed, the age/alcoholic relation suggests that some of the relations are uncertain and should be modified by certainty factors. But even when rules explicitly link problem features (as a rule could state "if no CBC was taken, then WBC is not available"), the kind of relation is not represented (WBC is a component of a CBC) because MYCIN's rule language does not allow the link to be labeled in this way.

Finally, when one problem feature serves as a redefinition of another, such as the relation between leukopenia and WBC, the more abstract problem feature tends to be left out altogether ("leukopenia" is not a MYCIN parameter; the rule mentions WBC directly). For purposes of explanation, we argue that problem features, their relations, and the nature of the link should be explicit.

Returning to Figure 8, the action concepts, diagnostic hypotheses, are part of a large hierarchy of causes that the problem solver will cite in the final diagnosis.
The links in this diagnosis space generally specify refinement of cause, though in our example they strictly designate subclasses. Generally, problem features are abstractions of patient states indicated by the observable symptoms, while the diagnosis space is made up of abstractions of causal processes that produce the symptoms. Paralleling our observations about rule problem features, we note that the relations among diagnostic hypotheses are not represented in MYCIN -- nowhere in the knowledge base does it explicitly state that e.coli is a bacterium.

Now suppose that the knowledge in Figure 8 were available; how would this help us to explain the leukopenia rule? The idea is that we first restate the rule on a higher level. We point out that a low WBC indicates leukopenia, which is a form of immunosuppression, thus tying the rule to the familiar pattern that implicates gram-negative rods and enterobacteriaceae. This is directly analogous to pointing out that tetracycline causes teeth discoloration, which is a form of undesirable body change, suggesting that the drug should not be given.

By rerepresenting Figure 8 linearly, we see that it is an expansion of the original rule:

WBC < 2.5 => leukopenia => immunosuppression => compromised host => infection by organisms found in body => gram-negative rods & enterobacteriaceae => e.coli, pseudomonas, and klebsiella.

The expansion marches up the problem feature hierarchy and then back down the hierarchy of diagnoses. The links of this expansion involve causality composed with identification, subtype and subset relations. By the hierarchical relationships, a rule on one level "explains" the rule below it. For example, the rule on level 3 provides the detail that links immunosuppression to the gram-negative rods. By generalizing we have made a connection to familiar concepts.

Tabular rules provide an interesting special case.
The CSF protein rule (Figure 9) appears to be quite formidable.

RULE500
If:   1) The infection which requires therapy is meningitis,
      2) A lumbar puncture has been performed on the patient, and
      3) The CSF protein is known
Then: The type of the infection is as follows:
      If the CSF protein is:
      a) less than 41 then: not bacterial (.5), viral (.7), not fungal (.6), not tb (.5);
      b) between 41 and 100 then: bacterial (.A), viral (.4), fungal (.1);
      c) between 100 and 200 then: bacterial (.3), fungal (.3), tb (.3);
      d) between 200 and 300 then: bacterial (.4), not viral (.5), fungal (.4), tb (.4);
      e) greater or equal to 300 then: bacterial (.4), not viral (.6), fungal (.4), tb (.4);

Figure 9. The CSF protein rule

Graphing this rule (Figure 10), we find a relatively simple relation that an expert stated as, "If the protein value is less than 40, I think of viral; if it is more than 100, I think of bacterial, fungal or TB." This is the first level of generalization, the principle that is implicit in the rule. The second level elicited from the expert is, "If the protein value is low, I think of an acute process; if it is high, I think of a severe or long term process." (Footnote 4: Bacterial meningitis is a severe, acute (short term) problem, while fungal and TB meningitis are problems of long (chronic) duration.) Then, a level higher, "an infection in the meninges stimulates protein production." So in moving up abstraction hierarchies on both the premise and action side of the rule (acute and chronic are subtypes of infection), we arrive at a mnemonic, just like "bacteria eat glucose." Abstractions of both the observations and the conclusions are important for understanding the rule.

[Plot: belief (CF, positive and negative) in each TYPE of MENINGITIS given the CSF PROTEIN, plotted against CSF protein value. Key: B = BACTERIAL, T = TUBERCULOSIS, V = VIRAL, F = FUNGAL]

Figure 10.
Graph of the conclusions made by rule 500 (Figure 9)

We might be surprised that explanations of rules provide levels of detail by referring to more general concepts. We are accustomed to the fact that principled theoretical explanations of, say, chemical phenomena, refer to atomic properties, finer-grained levels of causality. Why should a rule explanation refer to concepts like "compromised host" or "organisms normally found in the body"? The reason is that in trying to understand a rule like the steroids rule, we are first trying to relate it to our understanding of what an infection is at a high, almost metaphorical level. In fact, there are lower level "molecular" details of the mechanism that could be explained, for example, how steroids actually change the immunological system. But our first goal as understanders, our focus, is at the top level -- to link the problem feature (steroids) to the global process of meningitis infection. We ask, "What makes it happen? What role do steroids play in the infectious meningitis process?"

The concept of "compromised host" is a label for a poorly understood causal pattern that has value because we can relate it to our understanding of the infectious process. It enables us to relate the steroids or WBC evidence to the familiar metaphor in which infection is a war that is fought by the body against invading organisms. (If a patient is compromised, his defenses are down; he is vulnerable to attack.)

In general, causal rules argue that some kind of process has occurred. We expect a top-level explanation of a causal rule to relate the premise of the rule to our most general idea of the process being explained. This provides a constraint for how the rule should be generalized, the subject of the next section.
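The linear expansion of the leukopenia rule described in this section can be sketched as a walk over labeled links. This is a minimal illustration only, not MYCIN's representation: the dictionary layout, the function name `expand`, and the assignment of a relation label to each individual link are assumptions (the text says only that the links involve causality composed with identification, subtype and subset relations).

```python
# Labeled links for the leukopenia expansion. The concept names come from
# the linear restatement of Figure 8; the label attached to each link is an
# assumption made for illustration.
LINKS = {
    "WBC < 2.5": ("identification", "leukopenia"),
    "leukopenia": ("subtype", "immunosuppression"),
    "immunosuppression": ("subtype", "compromised host"),
    "compromised host": ("causes", "infection by organisms found in body"),
    "infection by organisms found in body":
        ("subset", "gram-negative rods and enterobacteriaceae"),
    "gram-negative rods and enterobacteriaceae":
        ("subset", "e.coli, pseudomonas, and klebsiella"),
}


def expand(concept, links=LINKS):
    """Follow labeled links from a datum to the most specific conclusion.

    Returns a list of (source, relation, target) steps. A concept with no
    outgoing link yields an empty expansion.
    """
    steps = []
    seen = set()
    while concept in links and concept not in seen:
        seen.add(concept)  # guard against accidental cycles
        relation, target = links[concept]
        steps.append((concept, relation, target))
        concept = target
    return steps


if __name__ == "__main__":
    for source, relation, target in expand("WBC < 2.5"):
        print(f"{source} --[{relation}]--> {target}")
```

Because each link carries its own label, an explanation routine could stop at whichever level makes contact with what the student already knows, which is the point of generalizing the rule.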
4.4 Tying an Explanation to a Causal Model

MYCIN's diagnostic rules are arguments that a process has occurred in a particular way, namely, that an infectious process has transpired in the patient along certain lines. There are many kinds of infections, which have different characteristics, but bacterial infections tend to follow the same script: entry of an organism into the body, passage of the organism to the site of infection, reproduction of the organism, and causing of observable symptoms. An explanation of a rule that concludes that an organism is causing an infection must demonstrate that this generic process has occurred. In short, this is the level of abstraction that the explanation must connect to.

A program was written to demonstrate this idea. The data parameters in MYCIN's 40 diagnostic rules for bacterial meningitis are restated as one or more of the steps of the infectious process script. This restatement is then printed as the explanation of the rule. For example, the program's explanation of the alcoholic rule (alcoholism => diplococcus meningitis) is:

    The fact that the patient is an alcoholic allows access of organisms from the throat and mouth to lungs (by reaspiration of secretions).
    The fact that the patient is an alcoholic means that the patient is a compromised host, and so susceptible to infection.

Words in italics (of the first sentence) constitute the pattern of "portal and passage." We find that the premise of a rule generally supplies evidence for only a single step of the causal process: the other steps must be inferred by default. For example, the alcoholic rule argues for passage of the diplococcus to the lungs. The person reading this explanation must know that diplococcus is normally found in the mouth and throat of any person and that it proceeds from the lungs to the meninges by the blood.
The organism finds conditions favorable for growth because the patient is compromised, as stated in the explanation. In contrast, the leukopenia rule only argues for the patient being a compromised host, so the organisms are the default organisms, those already in the body which can proceed to the site of infection. (Footnote 5: As might be expected, alcoholism also causes infection by the gram-negative rods and enterobacteriaceae. We have omitted these for simplicity. However, this example illustrates that a MYCIN rule can have multiple conclusions reached by different causal paths.)

These explanations say which steps are enabled by the data. They place the patient on the path of an infection, so to speak, and leave it to the understander to fill in the other steps with knowledge of how the body normally works. This is why the physician generally refers to the premise data as "predisposing factors." (MYCIN's heuristics are generally of the form "predisposing factor causes disease.") They argue that prior steps in a causal process have occurred. It is to the level of these prior steps, general concepts that explain many rules, that the rule should be related in order to be understood.

The process of explanation is a bit more complicated in that a causal relation may exist between clauses in the rule. We have already seen that one clause may screen another on the basis of world facts, multi-component test relations, and the subtype relation. The program described here knows these relations and "subtracts off" screening clauses from the rule. Moreover, as discussed in Section 5, some clauses describe the context in which the rule applies (the role of the first 3 clauses in the steroids rule, Figure 4). These, too, are made explicit for the explanation program, and subtracted off. In the vast majority of MYCIN rules, only one premise clause remains, and this is related to the process of infection in the way described above.
When more than one clause remains after screening and contextual clauses have been removed, our study shows that a causal connection exists between the remaining clauses. We can always isolate one piece of evidence that the rule is about (for example, WBC in the leukopenia rule); we call this the key factor of the rule. We call the remaining clauses restriction clauses and observe three kinds of relations between a restriction clause and a key factor:

-- A confirmed diagnosis explains a symptom. (Example: a petechial rash would normally be evidence for neisseria, but if the patient has leukemia, it may be the disease causing the rash. Therefore the rule states, "if the patient has a petechial rash (the key factor) and does not have leukemia (the restriction clause), then neisseria may be causing the meningitis.")

-- Two symptoms in combination suggest a different diagnosis than one taken alone. (Example: When both purpuric and petechial rashes occur, then a virus is a more likely cause than neisseria. Therefore, the petechial rule also includes the restriction clause "the patient does not have a purpuric rash.")

-- Weak circumstantial evidence is made irrelevant by strong circumstantial evidence. (Example: a head injury so strongly predisposes a patient to infection by skin organisms that the age of the patient, a weak circumstantial factor, is made irrelevant.)

Restriction clauses are easy to detect when examining the rule set because they are usually stated negatively ("some problem feature or diagnosis does not apply"). These examples are suggestive of the causal reasoning about problem features that we find in diagnosis when the causality of the system is better understood, as in electronic troubleshooting.

In summary, to explain a causal rule, a teacher must know the purposes of the clauses and connect the rule to abstractions in the relevant process script.
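The procedure of this section (subtract off context and screening clauses, then restate what remains as steps of the infectious process script) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the explanation program itself: the `Clause` type, the role labels, and the `SCRIPT_STEPS` table are hypothetical, and the clause wording is taken from the alcoholic rule (RULE535, Figure 13).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    text: str
    role: str  # hypothetical labels: "context", "screen", "key", "restriction"


# Premise of the alcoholic rule, with each clause's purpose made explicit.
ALCOHOLIC_RULE = [
    Clause("the infection which requires therapy is meningitis", "context"),
    Clause("only circumstantial evidence is available for this case", "context"),
    Clause("the type of meningitis is bacterial", "context"),
    Clause("the age of the patient is greater than 17 years", "screen"),
    Clause("the patient is an alcoholic", "key"),
]

# Assumed table restating a key factor as steps of the process script.
SCRIPT_STEPS = {
    "the patient is an alcoholic": [
        "allows access of organisms from the throat and mouth to lungs",
        "means that the patient is a compromised host, "
        "and so susceptible to infection",
    ],
}


def subtract_off(clauses):
    """Remove context and screening clauses, leaving key factor and restrictions."""
    return [c for c in clauses if c.role not in ("context", "screen")]


def explain(clauses):
    """Restate each remaining clause as the script steps it enables."""
    sentences = []
    for clause in subtract_off(clauses):
        for step in SCRIPT_STEPS.get(clause.text, []):
            sentences.append(f"The fact that {clause.text} {step}.")
    return sentences


if __name__ == "__main__":
    for sentence in explain(ALCOHOLIC_RULE):
        print(sentence)
```

The sketch only works because each clause carries a role label; in MYCIN's rules that label is absent, which is why the clause purposes had to be supplied to the explanation program separately.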
4.5 The Relation of Medical Heuristics to Principles

It might be argued that we have to go to so much trouble to explain MYCIN's rules because they are written on the wrong level. Now that we have a "theory" for which intermediate parameters to include ("portal," "pathway," etc.), why don't we simply rewrite the rules?

The medical knowledge we are trying to codify is really on two levels of detail: 1) principles or generalizations, and 2) empiric details or specializations. MYCIN's rules are empiric. Cleaning them up by representing problem feature relationships explicitly would give us the same set of rules at a higher level. But what would happen if process concepts were incorporated in completely new reasoning steps, for example, if the rule set related problem features to hypotheses about the pathway the organism took through the body? It turns out that reasoning backwards in terms of a causal model is not always appropriate. As we discovered when explaining the rules, not all of the causal steps of the process can be directly confirmed; we can only assume that they have occurred. For example, rather than providing diagnostic clues, the concept of "portal of entry and passage" is very often deduced from the diagnosis itself.

According to this view, principles are good for summarizing arguments, and good to fall back on when you've lost grasp on the problem, but they don't drive the process of medical reasoning. Specifically, (1) if a symptom needs to be explained (is highly unusual), we ask what could cause it ("Strep-viridans? It is normally found in the mouth. How did it get to the heart? Has the patient had dental work recently?"); (2) to "prove" that the diagnosis is correct (after it has been constructed), we use a causal argument ("He has pneumonia; the bacteria obviously got into the blood from the lungs"). Thus, causal knowledge can be used to provide feedback that everything fits.
It may be difficult or impossible to expect a set of diagnostic rules to serve both as concise, "clincher" methods for efficiently getting to the right data, and still represent a model of disease. Put another way, a student may need the model if he is to understand new associations between disease and manifestations, but he will be an inefficient problem solver if he always attempts to directly convert that model to a subgoal structure for solving ordinary problems. Szolovits [28] points out that these "first principles," used by a student, are "compiled out" of an expert's reasoning.

In meningitis diagnosis, the problem is to manage a broad, if not incoherent, hypothesis set, rather than to pursue a single causal path. The underlying theory recedes to the background, and the expert tends to approach his problem simply in terms of weak associations between observed data and bottom-line conclusions. This may have promoted a rule-writing style that discouraged introducing intermediate concepts, even where they might have been appropriate, for example, the concept of "leukopenia" described above.

5 Teaching Problem-Solving Strategy

A strategy is an approach for solving a problem, a plan for ordering methods so that a goal is reached. It is well-accepted that strategic knowledge must be conveyed in teaching diagnostic problem solving:

    Without explicit awareness of the largely tacit planning and strategic knowledge inherent in each domain, it is difficult for a person to "make sense of" many sequences of behavior as described by a story, a set of instructions, a problem solution, a complex system, etc.... The teacher should articulate for that domain the higher-order planning knowledge and strategic knowledge for formulating and revising hypotheses about what something means. [3]

Strategic knowledge is general, much like the principles of mechanism we discussed earlier; both relate to processes that have structure.
Thus, it is not sufficient to merely show a student MYCIN's solution, the surface structure of the program; we must explain why the rules are invoked in a particular order.

Here it is clear how teaching how to do something differs from merely explaining how something was done: we want the student to be able to replicate what he observes, to go off and solve similar problems on his own. This is why providing mnemonics is useful when justifying a rule. Regarding strategy, we must again address human foibles and preference: we must teach a strategy that a human can follow.

The main points of this section are:

-- MYCIN's strategy is different from a physician's strategy.

-- MYCIN's strategic knowledge is embedded in the rules, indistinguishable from screening and problem feature clauses.

-- A domain-independent representation of strategy is useful for teaching and for purposes of economy.

5.1 Surface and Deep Structure of MYCIN

A person trying to understand a MYCIN consultation observes that pieces of data are requested by the program (Figure 11). Conceptually, these questions are terminals hanging below an AND (rule) node in a subgoal tree (one portion corresponding to Figure 11 is shown in Figure 12). Observe that this example, considering the order of questions asked by the program, is a special case of the rule network shown earlier (Figure 3) -- here we show only the hypotheses (Figure 3, level 4) that lead directly to a question.

Following the terminology of Brown, et al. [3], a rule node is a method for achieving a goal (e.g., "organisms that might be causing the infection") by asking questions or pursuing a set of subgoals to achieve (the premise of a particular rule). Therefore, the tree of rules and subgoals is part of a deep-structured trace that Brown, et al. postulate is constructed when the understander makes sense of the surface problem solution.
31) Has Pt538 ever undergone any type of surgery?
** YES
32) Does Pt538 have a history of NEUROSURGERY?
** NO
33) Does Pt538 live in a crowded environment?
** NO
34) Do you suspect recent alcoholic history in Pt538?
** NO
35) Is meningitis a hospital-acquired infection?
** YES
36) Is Pt538's clinical history consistent with EPIGLOTTITIS?
** NO
37) Is Pt538's clinical history consistent with OTITIS-MEDIA?
** NO
38) Has Pt538 ever undergone splenectomy?
** NO
39) Is Pt538 a burn patient?
** YES

Figure 11. MYCIN consultation excerpt

GOAL        HYPOTHESIS    METHOD       QUESTION
COVERFOR    E.COLI        (Rule511)    Q32 NEUROSURGERY
            NEISSERIA     (Rule533)    Q33 CROWD
            D.PNEU        (Rule535)    Q34 ALCOHOLIC
                          (Rule559)    Q38 SPLENECTOMY
            H.INFL        (Rle55)      Q35 NOSOCOMIAL
                          (Rule395)    Q36 EPIGLOTTITIS
                                       Q37 OTITIS-MEDIA
            PSEUDOMO.     (Rule578)    Q39 BURN

Figure 12. Portion of the AND/OR tree corresponding to the questions shown in Figure 11 (reorganized according to the hypothesis each rule supports).

It is not sufficient for a student to know all of the possible methods he can bring to bear on a problem. For example, a student who knows what kinds of algebraic transformations can be used to solve for "x" in "x**2 - 8 = 1" could only proceed to apply the methods randomly if he didn't have a plan for solving the problem (that is, have schemas for kinds of problems that can be tackled using different approaches or lines of reasoning). A plan sets up a rational sequence of applications of methods that might get you closer to the solution (though this is not guaranteed).

The hypothetico-deductive strategy used in medical problem solving constitutes a plan for focusing on hypotheses and selecting confirmatory questions [17]. However, the methods selected in Figure 12 (rules 511 through 578) have been applied in a fixed, arbitrary order -- not planned by the rule author.
MYCIN has no "deep structure" plan at this level; the program is simply applying rules (methods) exhaustively. This lack of similarity to human reasoning severely limits the usefulness of the system for teaching problem-solving.

However, MYCIN does have a problem solving strategy above the level of rule application, namely the control knowledge that causes it to pursue a goal at a certain point in the diagnosis. We can see this by examining how rules interact in backward chaining. Figure 13 shows the goal rule and a rule that it indirectly invokes.

RULE092 (The Goal Rule)
If:   1) Gather information about cultures taken from the patient and therapy he is receiving,
      2) Determine if the organisms growing on cultures require therapy
      3) Consider circumstantial evidence for additional organisms that therapy should cover
Then: Determine the best therapy recommendation

RULE535 (The Alcoholic Rule)
If:   1) The infection which requires therapy is meningitis,
      2) Only circumstantial evidence is available for this case,
      3) The type of meningitis is bacterial,
      4) The age of the patient is greater than 17 years, and
      5) The patient is an alcoholic.
Then: There is evidence that the organisms which might be causing the infection are diplococcus-pneumoniae (.3) or e.coli (.2)

Figure 13. The goal rule and alcoholic rule.

In order to evaluate the third clause of the goal rule, MYCIN tries each of the "cover for" rules; the alcoholic rule is one of these (see also Figure 12). We call the goal rule a task rule to distinguish it from inference rules. Clause order counts here; this is a procedure, not a logical conjunction. MYCIN has a few other task rules, but procedural knowledge appears in almost every rule. The first three clauses of the alcoholic rule, the context clauses, really control the order in which goals are pursued, just as in a task rule.
We can represent this hidden structure of goals by a tree which we call the inference structure of the rule base (produced by "hanging" the rule set from the goal rule). Figure 14 illustrates part of MYCIN's inference structure.

REGIMEN (main goal)
    TREATFOR (rule92, clause 2)
        WHAT-INF? (2)
            INFECTION? (1)
        SIGNIFICANT? (4)
            CONTAMINANT? (3)
        IDENTITY? (5)
    COVERFOR (rule92, clause 3) (9)
        MENINGITIS? (7)
            INFECTION? (6)
        BACTERIAL? (8)

Figure 14. Portion of MYCIN's inference structure (Numbers give the order in which non-place-holder goals are achieved by the depth-first interpreter.)

The first three goals on the third level appear in a single task rule, while the last two goals (meningitis? and bacterial?) correspond to the first and third clauses of the 40 "cover for" rules similar to the alcoholic rule.

The program's strategy comes to light when we list these goals in the order in which they are actually achieved. (By "achieved" we mean that all rules that conclude about that goal have been tried.) For example, under the depth-first control of the interpreter, all "bacterial?" rules must be applied before any "cover for" rules are completed, so "bacterial?" is achieved before "cover for." This gives us the ordering (numbers correspond to the numbering in Figure 14):

1. Is there an infection?
2. Is it bacteremia, cystitis, or meningitis?
3. Any contaminated cultures?
4. Any good cultures with significant growth?
5. Is the organism identity known?
6. Is there an infection? (already done in step 1)
7. Does the patient have meningitis? (already done in step 2)
8. Is it bacterial?
9. Are there specific bacteria to cover for?

(We leave out the goals "regimen" and "treatfor" because they are just place holders for task rules, like subroutine names.) MYCIN's diagnostic plan is in two parts, and both proceed by top-down refinement.
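The ordering above is what a post-order (depth-first) walk of the inference structure produces, since a goal is "achieved" only after all of its subgoals. The following is a small sketch of that idea, not MYCIN's interpreter: the nested-tuple tree is an assumed reading of Figure 14, and the names `INFERENCE_STRUCTURE`, `PLACE_HOLDERS` and `achievement_order` are hypothetical.

```python
# Goal tree as (goal, [subgoals]); the shape is an assumed reading of
# Figure 14, with subgoals listed in the order the clauses are evaluated.
INFERENCE_STRUCTURE = (
    "REGIMEN", [
        ("TREATFOR", [
            ("WHAT-INF?", [("INFECTION?", [])]),
            ("SIGNIFICANT?", [("CONTAMINANT?", [])]),
            ("IDENTITY?", []),
        ]),
        ("COVERFOR", [
            ("MENINGITIS?", [("INFECTION?", [])]),
            ("BACTERIAL?", []),
        ]),
    ],
)

# Goals that merely name task rules and are left out of the listing.
PLACE_HOLDERS = {"REGIMEN", "TREATFOR"}


def achievement_order(node):
    """Return goals in the order a depth-first interpreter achieves them.

    A goal is appended only after all of its subgoals, i.e. post-order.
    """
    goal, subgoals = node
    order = []
    for subgoal in subgoals:
        order.extend(achievement_order(subgoal))
    if goal not in PLACE_HOLDERS:
        order.append(goal)
    return order


if __name__ == "__main__":
    for step, goal in enumerate(achievement_order(INFERENCE_STRUCTURE), 1):
        print(step, goal)
```

The repeated INFECTION? goal appears twice in the result, matching steps 1 and 6 of the list; a real interpreter would find it already achieved the second time.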
This demonstrates that a combination of structural knowledge (the diagnosis space taxonomy -- infection, meningitis, bacterial, diplococcus...) and strategic knowledge (traversing the taxonomy from the top down) is procedurally embedded in the rules. In other words, we could write a program that interpreted an explicit, declarative representation of the diagnosis taxonomy and a domain-independent form of the strategy to bring about the same effect.

At this level, MYCIN's diagnostic strategy is not a complete model of how physicians think, but it could be useful to a student. As the quote from Brown et al. would indicate, and has been confirmed in GUIDON research, teachers do articulate both the structure of the problem space and the nature of the search strategy to students. This means that we need to explicitly represent the fact that the diagnosis space is hierarchical, and represent strategy in a domain-independent form. If the strategy is not in domain-independent form, it can be taught by examples, but not explained.

5.2 Representing Strategic Knowledge in Meta-rules

How might we represent domain-independent strategic knowledge in a rule-based system? In the context of the MYCIN system, Davis pursued the representation of strategic knowledge by using meta-rules to order and prune methods. These meta-rules are invoked just before the object-level rules are applied to achieve a goal. An example of an infectious disease meta-rule is shown in Figure 15. Observe that this is a strategy for pursuing a goal. In particular, this meta-rule might be associated with the goal "identity of the organism." It will be invoked to order the rules for every subgoal in the search tree below this goal; in this simple way, the rule sets up a line of reasoning.
This mechanism causes some goals to be pursued before others, orders the questions asked by the system, and hence changes the surface structure of the consultation.

META-RULE001
If:   1) the infection is pelvic-abscess, and
      2) there are rules which mention in their premise enterobacteriaceae, and
      3) there are rules which mention in their premise grampos-rods,
Then: There is suggestive evidence (.4) that the former should be done before the latter.

Figure 15. A MYCIN meta-rule

While meta-rules like this can capture and implement strategic knowledge about a domain, they have their deficiencies. Like the MYCIN performance rules we have examined, Davis's domain-dependent examples of meta-rules leave out knowledge important for explanation. Not only do they, like the object-level rules, leave out the domain-specific support knowledge that justifies the rules, they leave out the domain-independent strategic principles that GUIDON should teach. In short, meta-rules provide the mechanism for controlling the use of rules, but not the domain-independent language for making the strategy explicit.

The implicit strategic principle that lies behind meta-rule001 is that common (frequent) causes of a disorder should be considered first. The structural knowledge that ties this strategy to the object-level diagnostic rules is an explicit partitioning of the diagnosis space taxonomy, indicating that the group of organisms called the enterobacteriaceae are more likely than gram-positive rod organisms to cause pelvic infections. This is what we want to teach the student.

One can imagine, for different infection types, different common causes, requiring a different meta-rule for each infection. But if all meta-rules are as specific as meta-rule001, principles will be compiled into many rules redundantly and the teaching points will be lost.
What does a domain-independent meta-rule look like, and how is it interfaced with the object-level rules? To explore this question, we have reconfigured the MYCIN rule base into a new system, called NEOMYCIN [9]. Briefly, meta-rules are organized hierarchically (again!) into tasks, such as "group and refine the hypothesis space." These rules manage a changing hypothesis list by applying different kinds of knowledge sources, as appropriate. Knowledge sources are essentially the object-level rules, indexed in the diagnosis space taxonomy by a domain-independent structural language.

For example, one meta-rule for achieving the task of pursuing a hypothesis is "If there are unusual causes, then pursue them." [Footnote 6] Suppose that the current hypothesis is "bacterial meningitis." The program will use the structural label "unusual causes" to retrieve the nodes "gram-negative rods", "enterobacteriaceae", and "listeria", add them to the hypothesis list, and pursue them in turn. (When there are no "unusual causes" indicated, the meta-rule simply does not apply.) Pursuing gram-negative rods, the program will find that leukopenia is a relevant factor, but first ask if the patient is a compromised host (Figure 8), modelling a physician's efficient casting of wider questions.

Footnote 6: This rule appears after the rule for considering common causes, and the ordering is marked as strategically significant. Domain-independent meta-rules have justifications, organization, and strategies for using them. Their justification refers to properties of the search space and the processor's capabilities.

Other terms in the structural language used by NEOMYCIN's domain-independent meta-rules are: (disease) process features, such as extent and location; the enabling step of a causal process; subtype; cause; trigger association; problem feature screen; and structural properties of the taxonomy, such as sibling.
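A minimal sketch of how such a meta-rule might look in code follows. It is an illustration under stated assumptions, not NEOMYCIN's implementation: the table `STRUCTURE`, the list `PURSUE_RELATIONS`, and the function `pursue_hypothesis` are hypothetical names, and only the "unusual causes" entry for bacterial meningitis comes from the text. The point is that the meta-rule names a structural relation, never a particular disease.

```python
# Structural knowledge: (hypothesis, relation) -> related taxonomy nodes.
# Only this one entry is taken from the text; the table format is assumed.
STRUCTURE = {
    ("bacterial meningitis", "unusual causes"): [
        "gram-negative rods",
        "enterobacteriaceae",
        "listeria",
    ],
}

# Meta-rules for the task "pursue a hypothesis", in strategic order:
# common causes are considered before unusual ones.
PURSUE_RELATIONS = ["common causes", "unusual causes"]


def pursue_hypothesis(hypothesis, hypothesis_list):
    """Apply each meta-rule in order; return the relations that applied."""
    applied = []
    for relation in PURSUE_RELATIONS:
        nodes = STRUCTURE.get((hypothesis, relation), [])
        if not nodes:
            continue  # nothing indicated: the meta-rule does not apply
        for node in nodes:
            if node not in hypothesis_list:
                hypothesis_list.append(node)
        applied.append(relation)
    return applied


if __name__ == "__main__":
    hypotheses = ["bacterial meningitis"]
    print(pursue_hypothesis("bacterial meningitis", hypotheses))
    print(hypotheses)
```

Adding a new infection type then means adding entries to the structural table, while the meta-rules themselves stay unchanged.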
In effect, the layer of "structural knowledge" allows us to separate out what the heuristic is from how it will be used. How domain-specific heuristics (like MYCIN's rules) should be properly integrated with procedural, strategic knowledge is an issue at the heart of the "declarative/procedural controversy" [30]. We conclude here that, for purposes of teaching, the hierarchies of problem features and the diagnosis space should be represented explicitly, providing a useful means for indexing the heuristics by both their premise and action (Figure 5). A structural language of cause, class, and process connects this domain-specific knowledge to domain-independent meta-rules, the strategy for problem solving.

5.3 Self-Referencing Rules

Self-referencing rules provide an interesting special example of how problem-solving strategies can be embedded in MYCIN's rules. A rule is self-referencing if the goal concluded by the action is also mentioned in the premise. An example is the "aerobicity rule" (Figure 16). [Footnote 7]

Footnote 7: Aerobicity refers to whether an organism can grow in the presence of oxygen. A facultative organism can grow with or without it; a non-aerobic organism cannot grow with oxygen present; and an obligate-aerob is aerobic only in a certain stage of growth.

Rule986
If:   1) The aerobicity of the organism is not known, and
      2) The culture was obtained more than 2 days ago,
Then: There is evidence that the aerobicity of the organism is obligate-aerob (.5) or facultative (.5)

Figure 16. The self-referencing aerobicity rule.

This rule is tried only after all of the non-self-referencing rules have been applied. The conclusion of these rules is held fixed, then the self-referencing rules are tried. The effect is to reconsider or reflect upon a tentative conclusion. When the original conclusion is changed by the self-referencing rules, this is a form of non-monotonic reasoning [31].
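The two-pass control described above can be sketched as follows. This is a simplified illustration, not MYCIN's interpreter: rules are plain dictionaries, conclusions are merged without certainty-factor arithmetic, and `growth_rule` with its value 0.8 is a hypothetical ordinary rule invented only to show the contrast. The aerobicity rule follows Figure 16.

```python
def is_self_referencing(rule):
    """A rule is self-referencing if its premise mentions the goal it concludes."""
    return rule["goal"] in rule["mentions"]


def achieve(goal, rules, facts):
    """Try ordinary rules first, then let self-referencing rules reconsider."""
    relevant = [r for r in rules if r["goal"] == goal]
    conclusion = {}

    # Pass 1: rules whose premises do not mention the goal.
    for rule in relevant:
        if not is_self_referencing(rule) and rule["premise"](facts, {}):
            conclusion.update(rule["action"])

    # Pass 2: hold the tentative conclusion fixed and show it to the
    # self-referencing rules, which may then revise it.
    tentative = dict(conclusion)
    for rule in relevant:
        if is_self_referencing(rule) and rule["premise"](facts, tentative):
            conclusion.update(rule["action"])
    return conclusion


# The aerobicity rule of Figure 16: fires only if nothing is known so far.
aerobicity_rule = {
    "goal": "aerobicity",
    "mentions": {"aerobicity", "culture_age_days"},
    "premise": lambda facts, tentative: not tentative and facts["culture_age_days"] > 2,
    "action": {"obligate-aerob": 0.5, "facultative": 0.5},
}

# Hypothetical ordinary rule, not from MYCIN.
growth_rule = {
    "goal": "aerobicity",
    "mentions": {"grows_in_air"},
    "premise": lambda facts, tentative: facts.get("grows_in_air", False),
    "action": {"aerobic": 0.8},
}

if __name__ == "__main__":
    rules = [aerobicity_rule, growth_rule]
    print(achieve("aerobicity", rules, {"culture_age_days": 3}))
    print(achieve("aerobicity", rules, {"culture_age_days": 3, "grows_in_air": True}))
```

Listing the self-referencing rule first in `rules` makes no difference to the result, because the pass it runs in is decided by its form, not by its position.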
We can restate MYCIN's self-referencing rules in domain-independent terms:

-- If nothing has been observed, consider situations that have no visible manifestations. (Example: The aerobicity rule--"if no organism is growing in the culture, it may be an organism that takes a long time to grow (obligate-aerob and facultative organisms).") The self-referencing mechanism makes it possible to state this rule without requiring a long premise that is logically exclusive from the remainder of the rule set.

-- If unable to make a deduction, assume the most probable situation. (Example: "if the gram stain is unknown and the organism is a coccus, then assume that it is gram positive.")

-- If there is evidence for two hypotheses, A and B, that tend to be confused, then rule out B. (Example: "if there is evidence for Tb and Fungal, and you have hard data for Fungal, rule out Tb.")

Like meta-rule001, self-referencing rules provide a useful mechanism for controlling the use of knowledge, but they leave out both the domain-dependent justification and the general, domain-independent reasoning strategy of which they are examples. These rules illustrate that strategy involves more than a search plan; it also takes in principles for reasoning about evidence, what Collins calls "plausible reasoning" [10]. It is not clear that a teacher needs to state these principles explicitly to a student. They tend to be either "common sense" or almost impossible to think about independently of an example. Nevertheless, they are yet another example of strategic knowledge that is implicit in MYCIN's rules.

6 Implications for Modifiability and Performance

MYCIN's rule authors believed that the program achieved good problem-solving performance without the structural, strategic, and support knowledge we have been considering.
However, it is possible to imagine situations in which knowledge of justification and strategy allows one to be a more flexible problem solver, to cope with novel situations. MYCIN cannot solve some kinds of difficult problems, the problems we say only an expert can solve.

Knowing the basis of a rule allows you to know when not to apply it, or how to modify it for special circumstances. For example, knowing that tetracycline won't kill the patient, but the infection might, you may have to dismiss social ramifications and prescribe the drug. You can deliberately break the rule because you understand it.

There will also be problems which cannot be diagnosed using MYCIN's rules. For example, several years ago coccidioides meningitis strangely appeared in the San Francisco Bay Area. We would say that such a case "violates all the rules." To explain what was happening, one has to reason about the underlying mechanisms. The organisms were travelling from the San Joaquin Valley to the Bay Area by "freak Southeastern winds," as the newspapers reported. The basic mechanism of disease was not violated, but this time the patients didn't have to travel to the Valley to come in contact with the disease. A human expert can understand this because he can fit the new situation to his model. Examples like these make us realize that AI systems like MYCIN can only perform some of the functions of an expert.

Regarding modifiability, the process of implementing NEOMYCIN required many hours of consultation with the original rule authors in order to unravel the rules. As shown in this paper, the lack of principles for using the representation makes it difficult to interpret the purposes of clauses and rules. The strategy and overall design of the program have to be deduced by drawing diagrams like Figure 14.
Imagine the difficulty any physician new to MYCIN would have modifying the CSF protein table (Figure 9); clearly he would first need the program to explain why it is correct.

We also need a principled representation to avoid a problem we call concept broadening. When intermediate problem abstractions are omitted, use of goals becomes generalized and weakened. This happened in MYCIN as the meaning of "significance" grew to include both "evidence of infection" and "non-contaminated cultures." (So rules were written in the form "if X then significant disease," rather than "if X then evidence of infection" and "if evidence of infection, then significant disease.") As long as the rule author makes an association between the data and some parameter he wants to influence, it doesn't matter for correct performance that the rule is vague. But vague rules are difficult to understand and modify.

A rule base is built upon and extended like any other program. Extensive documentation and a well-structured design are essential, as in any engineering endeavor. The framework of knowledge types and purposes that we have described would constitute a "typed" rule language that could make it easier for an expert to organize his thoughts. On the other hand, we must realize that this meta-level analysis may impose an extra burden by turning the expert into a taxonomist of his own knowledge--a task that may require considerable assistance.

7 Application of the Framework to Other Systems

To illustrate further the idea of the strategy, structure, support framework and to demonstrate its usefulness for explaining how a program reasons, several knowledge-based programs are described below in terms of the framework. For generality, we will call inference associations like MYCIN's rules "knowledge sources" (KS). This analysis is not concerned with the representational notation used in a program, whether it be frames, production rules, units, and so on.
Instead we are trying to establish an understanding of the knowledge contained in the system: what kinds of inferences are made at the KS level, how these KS's are structured explicitly in the system, and how this structure is used by strategies for invoking KS's.

META-LEVELS OF PROBLEM-SOLVING KNOWLEDGE

DENDRAL (Buchanan et al., 1969). Domain = Chemistry, mass spectrometry analysis
    STRATEGY:    Aggregation heuristics build superatoms and generate all plausible interstitial structures
    STRUCTURE:   Family trees of functional groups (ketones, ethers, etc.)
    KS examples: Identification rules relating functional groups to spectral peaks
    SUPPORT:     Molecular chemistry

HEARSAY II (Lesser et al., 1975). Domain = Speech understanding
    STRATEGY:    Policy KS's control hypotheses to generate and thresholds (data-directed)
    STRUCTURE:   Hierarchy of interpretation levels with links to KS's
    KS examples: Example: KS's hypothesizing words use syllable level for data
    SUPPORT:     Grammar and identification properties of phonemes, syllables and words

AM (Lenat, 1976). Domain = Concept formation, mathematical discovery
    STRATEGY:    Activity heuristics propose tasks (priority agenda and focus heuristic)
    STRUCTURE:   Hierarchy of heuristics associated with most general concept/context to which they apply
    KS examples: Rules to create concepts and fill in facets, e.g., heuristics to fill in the "examples" facet for ANY-SET
    SUPPORT:     Theory of interestingness, chiefly based on generalizing and specializing

MOLGEN (Stefik, 1979). Domain = Molecular genetics, experiment planning
    STRATEGY:    Determine differences, sketch plan, refine steps (message passing)
    STRUCTURE:   Hierarchy of laboratory operation types (used by refinement design operator)
    KS examples: Specific lab techniques: input objects -> molecular changes and byproducts
    SUPPORT:     Molecular biology processes

CENTAUR (Aikins, 1980). Domain = Medical diagnosis, pulmonary function
    STRATEGY:    Hypothesis-directed, top-down refinement (agenda)
    STRUCTURE:   Hierarchy of disease prototypes
    KS examples: Disease component -> evidence for prototype
    SUPPORT:     Disease patterns; biological processes

NEOMYCIN (Clancey, 1981 [9]). Domain = Medical diagnosis, diseases causing neurological symptoms
    STRATEGY:    Grouping and refining hypothesis list; meta-rules (focus, pursue)
    STRUCTURE:   Multiple hierarchies of etiological processes
    KS examples: Data -> evidence for disease process or causal state/category
    SUPPORT:     Same as above

Figure 17. Relation of strategical, structural, and support meta-knowledge to knowledge sources of expert programs

7.1 The Character of Structural Knowledge

One product of this study is a characterization of different ways of structuring KS's for different strategical purposes. In all cases, the effect of the structural knowledge is to provide a handle for separating out what the KS is from when it is to be applied. [Footnote 8] The different ways of structuring KS's are summarized here according to the processing rationale:

-- Organize KS's hierarchically by hypothesis for consistency in data-directed interpretation. In DENDRAL, if a functional group is ruled out, more specific members of the family are not considered during forward-directed, preliminary interpretation of spectral peaks. Without this organization of KS's, earlier versions of DENDRAL could generate a subgroup as a plausible interpretation, while ruling out a more general form of the subgroup, as if to say, "This is an ethyl ketone but not a ketone." [6]

-- Organize KS's hierarchically by hypothesis to eliminate redundant effort in hypothesis-directed refinement. In DENDRAL, the family trees prevent the exhaustive structure generator from generating subgroups whose more general forms have been ruled out. The same principle is basic to most medical diagnosis systems that organize diagnoses in a taxonomy and use a top-down refinement strategy, such as CENTAUR and NEOMYCIN.
-- Organize KS's by multiple hypothesis hierarchies for efficient grouping (hypothesis-space splitting). Besides using the hierarchy of generic disease processes (infectious, cancerous, toxic, traumatic, psychosomatic, etc.), NEOMYCIN groups the same diseases by multiple hierarchies according to disease process features (organ system involved, spread in the system, progression over time, etc.). When hypotheses are under consideration that do not fall into one confirmed subtree of the primary etiological hierarchy, the "group and differentiate" strategy is invoked to find a process feature dimension along which two or more current hypotheses differ. A question will then be asked, or a hypothesis pursued, to differentiate among the hypotheses on this dimension.

Footnote 8: In this section, the term "hypothesis" generally refers to a diagnostic or explanatory interpretation made by a KS (in terms of some model), though it can also be a hypothesis that a particular problem feature is present, as in CRYSALIS.

-- Organize KS's for each hypothesis on the basis of how KS data relates to the hypothesis, for focusing on problem features. In NEOMYCIN, additional relations make explicit special kinds of connections between data and hypotheses, such as "this problem feature is the enabling causal step for this diagnostic process," and meta-rules order the selection of questions (invocation of KS's) by indexing them indirectly through these relations ("if an enabling causal step is known for the hypothesis to be confirmed, try to confirm that problem feature"). The meta-rules that reference these different relations ("enabling step", "trigger", "most likely manifestation") are ordered arbitrarily. Meta-meta-rules don't order the meta-rules because we currently have no theoretical basis for relating the first order relations to one another.
-- Organize KS's into data/hypothesis levels for opportunistic triggering at multiple levels of interpretation. HEARSAY's blackboard levels (sentence, word sequence, word, etc.) organize KS's by the level of analysis they use for data, each level supplying data for the hypothesis level above it. When new results are posted on a given level, KS's that care about that level of analysis are polled to see if they should be given processing time. Policy KS's give coherence to this opportunistic invocation by affecting which levels will be given preference. CRYSALIS takes the idea a step further by having multiple planes of blackboards; one abstracts problem features (the "density plane") and the other abstracts interpretations (the "model plane").

-- Organize KS's into a task hierarchy for planning. In MOLGEN, laboratory operators are referenced indirectly through tasks that are steps in an abstract plan. For example, the planning level design decision to refine the abstract plan step, MERGE, is accomplished by indexing laboratory operators by the MERGE task (e.g., MERGE could be refined to using a ligase to connect DNA structures, mixing solutions, or causing a vector to be absorbed by an organism). Thus, tasks in planning are analogous to hypotheses in interpretation problems.

-- Organize KS's into a context specialization hierarchy for determining task relevance. In AM, relevant heuristics for a task (typically filling in a mathematical concept slot) are inherited from all contexts (concepts) that appear above it in the specialization hierarchy. Thus, AM goes a step beyond most other systems by showing that policy KS's must be selected on the basis of the kind of problem being solved. Lenat's work suggests that this might be simply a hierarchical relationship among kinds of problems.
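The first two ways of organizing KS's listed above share one mechanism: once a hypothesis is ruled out, nothing below it in the hierarchy is considered. A minimal sketch of that pruning follows. It is an illustration only, not DENDRAL's or NEOMYCIN's code; the function `open_hypotheses` and the small `subtypes` table are hypothetical, with the ketone example borrowed from the text.

```python
def open_hypotheses(roots, subtypes, ruled_out):
    """Walk the hierarchy top down, skipping ruled-out nodes and their subtrees."""
    result = []
    stack = list(reversed(roots))
    while stack:
        hypothesis = stack.pop()
        if hypothesis in ruled_out:
            continue  # more specific members of the family are never reached
        result.append(hypothesis)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(subtypes.get(hypothesis, [])))
    return result


if __name__ == "__main__":
    # Hypothetical fragment of a family tree of functional groups.
    subtypes = {
        "functional group": ["ketone", "ether"],
        "ketone": ["ethyl ketone"],
    }
    print(open_hypotheses(["functional group"], subtypes, ruled_out={"ketone"}))
```

Because "ethyl ketone" is reachable only through "ketone", the inconsistent reading "an ethyl ketone but not a ketone" cannot be produced.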
The characterizations of structural knowledge above are a first step towards a vocabulary or language for talking about indirect reference of KS's. It is clear that strategy and structure are intimately related; to make this clearer, we return to our original interest in explanation.

Teaching a strategy might boil down to saying, "Think in terms of such-and-such a structural vocabulary in order to get this strategical task done"--where the vocabulary is the indexing scheme for calling KS's to mind. So we might say, "Think in terms of families of functional subgroups in order to rule out interpretations of the spectral peaks." Or, "Consider process features when diseases of different etiologies are possible." That is, teaching a strategy involves in part the teaching of a perspective for relating KS's hierarchically (e.g., "families of functional subgroups" or "disease process features"), and then showing how these relations provide leverage for managing a large amount of data or number of hypotheses. The explanation of the leverage or structuring effect must be in terms of some task for carrying the problem forward, thus tying the indexing scheme to the overall process of what the problem solver is trying to do (thus we say, "to rule out interpretations" or "to narrow down the problem to one etiological process" or (recalling Figure 6) "to broaden the spectrum of possibilities"). In this way, we give the student a meta-rule that specifies what kind of vocabulary or "cut on the problem" to consider for a given strategical task.

Davis's study of meta-rules [14] suggested a need for a vocabulary of meta-rule knowledge. His examples suggested just a few conceptual primitives for describing refinement (ordering and utility of KS's) and a few primitives for describing object-level knowledge (KS input and output).
All of the strategies in our examples deal with ordering and utility criteria for KS's, so we have nothing to add there. All of the examples given here reference KS's by the data they act upon, the hypotheses they support or the tasks they accomplish, except for AM, which references KS's by their scope or domain of applicability. What is novel about the analysis here is the focus on relations among hypotheses and among data.

From our domain-independent perspective, strategical knowledge selects KS's on the basis of the causal, subtype, process, or scoping relation they bear to hypotheses or data currently thought to be relevant to the problem at hand. Thus, our meta-rules make statements like:

"Consider KS's that would demonstrate a prior cause for the best hypothesis."
"Don't consider KS's that are subtypes of ruled-out hypotheses."
"Consider KS's that abstract known data."
"Consider KS's that distinguish between two competing kinds of processes."
"Consider KS's relevant to the current problem domain."

To summarize, the structural knowledge we have been studying consists of relations that hierarchically abstract data and hypotheses. These relations constitute the vocabulary by which domain-independent meta-rules invoke KS's. The key to our analysis is our insistence on domain-independent statement of meta-rules--a motivation deriving from our interest in explanation and teaching.

7.2 Explicitness of Strategical Knowledge

Another consideration for explanation is whether the strategy for invoking KS's is explicit or encoded indirectly. As we proceed to higher strategical levels, it becomes difficult to represent a strategy declaratively. For example, top-down refinement is "compiled into" CENTAUR's hierarchy itself by the control steps that specify on each level what to do next (e.g., "after confirming Obstructive Airways Disease, determine the subtype of Obstructive Airways Disease").
By separating control steps from disease inferences, Aikins improved the explanation facility, one of the goals of CENTAUR. However, the rationale for these control steps is not represented--it is just as implicit as it was in PUFF's contextual clauses. In contrast, NEOMYCIN's "explore and refine" task clearly implements top-down refinement through domain-independent meta-rules. However, these meta-rules are ordered to give preference to siblings before descendents--an example of an implicit strategy.

One common way of selecting KS's is on the basis of numerical measures of priority, utility, interestingness, etc. For example, CENTAUR, like many medical programs, will first request the data that gives the most weight for the disease under consideration. Thus, the weight given to a KS is another form of indexing by which a strategy can be applied. If we wish to explain these weights, we should ideally replace them by descriptors that "generate" them, and then have the strategy give preference to KS's having certain descriptors. NEOMYCIN's meta-rules for requesting data (described above) are a step in this direction.

MOLGEN's "least-commitment" meta-strategy is a good example of implicit encoding by priority assignment. The ordering of tasks specified by least commitment is: "look first for differences, then use them to sketch out an abstract plan, and finally refine that plan...." This ordering of tasks is implicit in the numerical priorities that Stefik has assigned to design operators (e.g., propose-goal, refine-object, find-features). Therefore, an explanation system for MOLGEN could not explain the least-commitment strategy, but only say that the program performed one task before another because its priority was higher.

7.3 Absence of Support Knowledge

We have little to say about support knowledge in these systems because none of them represent it.
That is, the causal or mathematical models, statistical studies, or world knowledge that justifies the KS's is not used during reasoning. As discussed in Section 6, this limitation calls into question the problem-solving flexibility or "creativeness" of these programs. In any case, the knowledge is not available for explanation.

7.4 Summary

The strategy/structure/support framework can be applied to any knowledge-based system by asking the questions:

What are the KS's in the system (what kinds of recognition or construction operations are performed)?
How are the KS's labeled or organized (by data/constraint or hypothesis/operation)?
Is this indexing used by the interpreter or by explicit strategical KS's, or is it just an aid for the knowledge engineer?
What theoretical considerations justify the KS's? Is this knowledge represented?

With this kind of analysis, it should be clear how the knowledge represented needs to be augmented or decomposed, if an explanation facility is to be built for the system. Quite possibly, as in MYCIN, the representational notation will need to be modified as well.

8 Conclusions

The production rule formalism is often chosen by expert system designers because it is thought to provide a perspicuous, modular representation. But we have discovered that there are points of flexibility in the representation that can be easily exploited to embed structural and strategic knowledge in task rules, context clauses and screening clauses. Arguing from a teacher's perspective, we showed that hierarchies of problem features and diagnoses (allowing rules to be generalized), in addition to domain-independent statement of strategy, are useful to justify a rule and teach an approach for using it. Also, when the rule is causal, satisfying explanations generalize the rule in terms of an underlying process model.
This same knowledge should be made explicit for purposes of explanation after a consultation, ease of modification, and potential improvement of problem-solving ability.

Characterizing knowledge in three categories, we concluded that MYCIN's rules were used like a programming language to embed strategic and structural principles. However, while context and screening clauses are devices that don't precisely capture the paths of expert reasoning, the basic connection between data and hypothesis is a psychologically valid association. As such, the "core rules" represent the experts' knowledge of causal processes in proceduralized form (again, not necessarily compiled into this form, but compiled with respect to causal models which may be incomplete or never even learned). For this reason, support knowledge needs to be represented in a form that is somewhat redundant to the diagnostic associations, while structure and strategy can be directly factored out and represented declaratively.

The lessons of this study apply to other knowledge-based programs, including those which don't use the production rule representation. The first moral is that one cannot simply slap an interactive front end onto a good AI program and expect to have an adequate teaching system. Similarly, an explanation system may have to do more than just read back reasoning steps and recognize questions: it may be useful to abstract the reasoning steps, relating them to domain models and problem-solving strategies.

Other knowledge bases could be studied as artifacts to evaluate the expressiveness of their representation. Is the design of the inference structure explicit? Can it be reasoned about and used for explanation? You must ask: where are the choice points in the representation and what principles for their use have not been represented explicitly? For production systems one should ask: What is the purpose of each clause in the rule and why are clauses ordered this way?
Why is this link between premise and conclusion justified? Under what circumstances does this association come to mind (structure and strategy)?

Finally, future "knowledge engineering" efforts in which human experts are interviewed and their knowledge codified could benefit from first constructing an epistemology along the lines of the strategy-structure-KS-support distinction, and then, relative to that framework, representing knowledge using their chosen notation (rules, units, etc.). Then, when the system fails to behave properly (whether the purpose is teaching or problem solving), changes to either the epistemology or the rules should be entertained. In fact, this is a cyclic process where changes are made to the rules that subtly tear at the framework, and after incorporating a series of changes, a new, better epistemology and revised notation can be arrived at. (So a single MYCIN rule might seem awkward, but a pattern such as 40 rules with the same first 3 clauses suggests the underlying nature of the knowledge.) Thus, a methodology for converging on an adequate epistemology comes in part from constant cycling and re-examining of the entire system of rules.

The epistemology that evolved from attempts to reconfigure MYCIN's rules is NEOMYCIN's etiological taxonomy, multiple disease process hierarchies, data that trigger hypotheses, etc., plus the domain-independent task hierarchy of meta-rules. In our use of terms like "problem feature," we have moved very far from the too abstract "clinical parameter" which did not distinguish between data and hypotheses.
Our epistemology provides an improved basis for interpreting expert reasoning, a valuable foundation for knowledge engineering, as echoed by Swanson:

    Three aspects of the expert's adaptation are especially important to the design of decision support systems: the generative role of basic principles of pathophysiology, the hierarchical structure of disease knowledge, and the heuristics used in coping with information processing demands. [27]

These categories of knowledge provide a framework for understanding an expert. We ask, "What kind of knowledge is he describing?" This framework enables us to focus our questions so that we can separate out detailed descriptions of the expert's causal model from his associations that link symptom to disorder, and his strategies for using this knowledge.

9 Postscript: How the rule formalism helped

Despite the now apparent shortcomings of MYCIN's rule formalism, we must remember that the program was influential because it worked well. The uniformity of representation, so much the cause of the "missing knowledge" and "disguised reasoning" described here, was perhaps an important asset. With knowledge so easy to encode, it was perhaps the simple parameterization of the problem that made MYCIN such a success. The program could be built and tested quickly at a time when little was known about the structure and methods of human reasoning. Again demonstrating the importance of the original research, we can treat the knowledge base as a reservoir of expertise, something never before captured in quite this way, and use it to suggest better representations.

10 Acknowledgments

This paper is based on a chapter of my thesis. Early encouragement and comments were provided by Bruce Buchanan, John Seely Brown and Adele Goldberg. An early draft of this version was read and critiqued by Ted Shortliffe and Jim Bennett. The AI Journal reviewers were particularly helpful during the long gestation period of these ideas.
I have quoted a number of John Brown's suggestions directly. Without the careful, patient explanations of Tim Beckett, my meningitis tutor for a year, this paper would not have been possible.

This research was supported in part by a grant from a joint ONR/ARPA contract (ONR 14-79C-0302). Computing resources were provided by the SUMEX-AIM national resource (NIH grant RR 00786-07).

References

1. Aikins, J. S. Prototypes and production rules: a knowledge representation for computer consultations. PhD thesis, Computer Science Department, Stanford University, (HPP-80-17, STAN-CS-80-814), August 1980.
2. Brachman, R. J. What's in a Concept: Structural Foundations for Semantic Networks. BBN Report No. 3433, October 1976.
3. Brown, J. S., Collins, A., & Harris, G. Artificial intelligence and learning strategies. In O'Neill (Ed.), Learning Strategies. New York: Academic Press, 1977.
4. Buchanan, B. G., Sutherland, G., & Feigenbaum, E. A. HEURISTIC DENDRAL: a program for generating explanatory hypotheses in organic chemistry. In B. Meltzer and D. Michie (Eds.), Machine Intelligence 4. Edinburgh University Press, 1969, 209-254.
5. Buchanan, B. G., Sutherland, G., & Feigenbaum, E. A. Rediscovering some problems of artificial intelligence in the context of organic chemistry. In B. Meltzer and D. Michie (Eds.), Machine Intelligence 5. Edinburgh University Press, 1970, 263-280.
6. Clancey, W. J. Tutoring rules for guiding a case method dialogue. International Journal of Man-Machine Studies, 1979, 11, 25-49.
7. Clancey, W. J. Transfer of Rule-Based Expertise through a Tutorial Dialogue. Computer Science Doctoral Dissertation, Stanford University, STAN-CS-769, August 1979.
8. Clancey, W. J., Shortliffe, E. H., and Buchanan, B. G. Intelligent computer-aided instruction for medical diagnosis. Proceedings of the Third Annual Symposium on Computer Applications in Medical Care, Silver Spring, Maryland, October 1979.
9. Clancey, W. J.
and Letsinger, R. NEOMYCIN: Reconfiguring a rule-based expert system for application to teaching. Seventh International Joint Conference on Artificial Intelligence, 1981, 829-836.
10. Collins, A. Fragments of a theory of human plausible reasoning. TINLAP-2, 1978, 194-201.
11. Davis, R. Applications of meta-level knowledge to the construction, maintenance and use of large knowledge bases (STAN-CS-76-552, HPP-76-7). Stanford University, July 1976.
12. Davis, R. Generalized procedure calling and content-directed invocation. Proceedings of the Symposium on Artificial Intelligence and Programming Languages. SIGPLAN/SIGART Combined Newsletter, August 1977, 46-64.
13. Davis, R. Interactive transfer of expertise: acquisition of new inference rules. Artificial Intelligence, 1979, 12, 121-167.
14. Davis, R. Meta-rules: reasoning about control. Artificial Intelligence, 1980, 15, 179-222.
15. Davis, R., Buchanan, B., & Shortliffe, E. H. Production rules as a representation for a knowledge-based consultation program. Artificial Intelligence, 1977, 8, 15-45.
16. Davis, R., & King, J. J. An overview of production systems. In E. W. Elcock & D. Michie (Eds.), Machine Intelligence 8. New York: Wiley & Sons, 1977, 300-332.
17. Elstein, A. S., Shulman, L. S., & Sprafka, S. A. Medical problem-solving: An analysis of clinical reasoning. Cambridge: Harvard University Press, 1978.
18. Engelmore, R. and Terry, A. Structure and function of the CRYSALIS system. Sixth International Joint Conference on Artificial Intelligence, 1979, 260-266.
19. Fagan, L. M., Kunz, J. C., Feigenbaum, E. A., & Osborn, J. J. Representation of dynamic clinical knowledge: measurement interpretation in the intensive care unit. Sixth International Joint Conference on Artificial Intelligence, 1979, 260-262.
20. Lenat, D. B. AM: An artificial intelligence approach to discovery in mathematics as heuristic search. PhD thesis, Computer Science Department, Stanford University, (CS-STAN-76-670, AIM-286), July 1976.
21. Lesser, V. R., Fennell, R. D., Erman, L. D., Reddy, D. R. Organization of the HEARSAY II speech understanding system. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-23, February 1975, 11-24.
22. McDermott, J. R1: A rule-based configurer of computer systems. Department of Computer Science, Carnegie-Mellon University, CMU-CS-80-119, April 1980.
23. Mizoguchi, F., Maruyama, K., Yamada, T., Kitazawa, K., Saito, M., Kulikowski, C. A. A case study of EXPERT formalism--an approach to a design of medical consultation system through EXPERT formalism. Sixth International Joint Conference on Artificial Intelligence, 1979, 583-585.
24. Shortliffe, E. H. MYCIN: A rule-based computer program for advising physicians regarding antimicrobial therapy selection. Ph.D. dissertation in Medical Information Sciences, Stanford University, 1974. (Also, Computer-based medical consultations: MYCIN. New York: Elsevier, 1976.)
25. Shortliffe, E. H., Buchanan, B. G., and Feigenbaum, E. A. Knowledge engineering for medical decision making: a review of computer-based clinical decision aids. Proceedings of the IEEE, 1979, 67:1207-1224.
26. Stefik, M. J. Planning with constraints. PhD thesis, Computer Science Department, Stanford University, (HPP-80-2, STAN-CS-80-784), January 1980.
27. Swanson, D. B., Feltovich, P. J., & Johnson, P. E. Psychological analysis of physician expertise: implications for design of decision support systems. MEDINFO 77, 1977, 161-164.
28. Szolovits, P., & Pauker, S. G. Categorical and probabilistic reasoning in medical diagnosis. Artificial Intelligence, 1978, 11(1), 115-144.
29. van Melle, W. A domain-independent production-rule system for consultation programs. Computer Science Doctoral Dissertation, Stanford University, in press, 1980.
30. Winograd, T. Frame Representations and the Declarative/Procedural Controversy. In D. G. Bobrow & A.
Collins (Eds.), Representation and Understanding. New York: Academic Press, 1975, 185-210.
31. Winograd, T. Extended inference modes in reasoning by computer systems. Artificial Intelligence, 1980, 13, 5-26.
32. Woods, W. A. What's in a Link: Foundations for Semantic Networks. In D. G. Bobrow & A. Collins (Eds.), Representation and Understanding. New York: Academic Press, 1975, 35-82.
33. Yu, V. L., Buchanan, B. G., Shortliffe, E. H., Wraith, S. M., Davis, R., Scott, A. C., & Cohen, S. N. Evaluating the performance of a computer-based consultant. Computer Programs in Biomedicine, 1979, 9, 95-102. (a)
34. Yu, V. L., Fagan, L. M., Wraith, S. M., Clancey, W. J., Scott, A. C., Hannigan, J. F., Blum, R. L., Buchanan, B. G., & Cohen, S. N. Antimicrobial selection by a computer -- a blinded evaluation by infectious disease experts. Journal of the American Medical Association, 1979, 242, 1279-1282. (b)