
The Evaluation and Impact of NEPER Wheat Expert System 
 

Ahmed Rafea 
Computer Science Dept., AUC 

Email: rafea@aucegypt.edu 
 

Mostafa Mahmoud  
Central Lab. for Agriculture Expert System, ARC 

Email: mostafa@esic.claes.sci.eg
 

Abstract: This paper presents the laboratory and field evaluation results of the NEPER Wheat
expert system. The laboratory evaluation showed that NEPER's performance is comparable
with that of human experts. The field evaluation revealed that NEPER has positive economic and
environmental impacts. The field testing results also showed that NEPER is usable,
applicable and needed. Copyright © 2001 IFAC

 
Keywords: Expert Systems, Diagnosis, Knowledge-based systems, Hierarchical structures, 
Classification, Intelligence. 

 
1. INTRODUCTION 
 

Bread, known as aish, or life, is a vital component of
the Egyptian diet. In 1993, the country produced 4.5
million tons of wheat on 2.2 million feddans. Given
the crucial role wheat plays in Egypt, CLAES
cooperated with the Intelligent Systems Laboratory
(ISL) at Michigan State University in developing the
Egyptian Regional Wheat Management System,
funded by NARP, a United States Agency for
International Development (USAID) project, in the
period from 1992 to 1995. This project integrates an
expert system (ES) with a crop simulation model and
aims to address all aspects of irrigated wheat
management in Egypt. The integrated system is
named NEPER (Kamel et al, 1995). To achieve this
goal, NEPER is designed to perform the
following functions:
• Select the appropriate variety for a specific field  
• Advise the farmer on field preparation  
• Design schedules for irrigation and fertilization  
• Control pests and weeds  
• Manage harvests  
• Diagnose malnutrition  
• Diagnose disorders  
• Suggest treatments  
 
In 1997, another project between CLAES and ISL
was funded by ATUT, which is also a USAID
project. One objective of this project was to conduct
field testing in order to evaluate the economic and
environmental impacts of the ES and to measure its
performance from three aspects: usability,
applicability and need. The results of this testing
were also used to enhance the user interface and
extend the knowledge base of NEPER. The project
also includes a component for evaluating a new,
enhanced version of NEPER that covers the whole
range of agricultural operations. This new version has
been developed according to the results and
recommendations of the field testing presented
in this paper.
 
In this paper, the technical background of NEPER is
presented in section 2. The laboratory evaluation is
summarized in section 3. The field experiments are
described in section 4. The economic and
environmental impacts of the ES are summarized in
sections 5 and 6, respectively. The ES performance,
measured in terms of usability, applicability, and
need, is presented in section 7. The ES enhancements
resulting from those experiments are presented in
section 8.
 

2. BACKGROUND 
 
In developing NEPER, the Generic Task approach to
ES development proposed by Chandrasekran
(Chandrasekran, 1986) was used. The idea
behind the Generic Task approach is that the way a
problem is solved depends largely on its type,
e.g., diagnosis, design, or planning. Consequently,
problems of the same type can share a generic
problem solver. According to the Generic Task
methodology, the approach to a diagnosis problem
is therefore inherently the same regardless of the
domain in which the problem is addressed. The
classical example of a problem solver that can be
applied to a diagnosis problem is Hierarchical
Classification (Gomez & Chandrasekran, 1981;
Chandrasekran, 1983), and it is this problem solver
that has been used in implementing the Wheat
disorders ES, which is a component of NEPER.



This system component has been implemented using
a Generic Task tool developed at Michigan State
University (MSU). In this tool, the knowledge base is
created as a hierarchy of nodes. In each node, the
knowledge is represented in a table, where each
entry represents either a database variable or a
variable pointing to another table.
Each database variable is associated with a question.
The user is presented with that question only if
the database variable has not yet been assigned a
value. The combinations of possible inputs to the
questions denote different rules and matching
patterns. If a combination of inputs results in a
match value greater than a given threshold, the node
is said to be established. By asking the user a series
of questions, the system pursues or rules out
paths in the classification hierarchy, whose leaves
represent disorders. In essence, if an established path
from the root to a leaf exists, the disorder at that leaf
is presented as the diagnosis.
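
The establish-and-refine behaviour described above can be illustrated with a minimal Python sketch. It is not the MSU Generic Task tool: the `Node` structure, the match-value scale of -3 to 3, the default of -3 when no table row matches, and the small symptom hierarchy are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Answers = Dict[str, str]                      # database variable -> value
Pattern = Tuple[Dict[str, str], int]          # (required answers, match value)


@dataclass
class Node:
    name: str
    questions: Dict[str, str]                 # database variable -> question text
    patterns: List[Pattern]                   # rows of the node's table
    threshold: int = 0                        # assumed scale: match values in -3..3
    children: List["Node"] = field(default_factory=list)


def value_of(var: str, node: Node, answers: Answers, ask: Callable[[str], str]) -> str:
    # Ask only if the variable has never been assigned a value.
    if var not in answers:
        answers[var] = ask(node.questions[var])
    return answers[var]


def match_value(node: Node, answers: Answers, ask: Callable[[str], str]) -> int:
    # The first table row whose required answers all hold decides the match value.
    for required, value in node.patterns:
        if all(value_of(v, node, answers, ask) == want for v, want in required.items()):
            return value
    return -3                                 # assumed default: no row matched


def classify(node: Node, answers: Answers, ask: Callable[[str], str]) -> List[str]:
    if match_value(node, answers, ask) <= node.threshold:
        return []                             # not established: rule out this path
    if not node.children:
        return [node.name]                    # established leaf: a diagnosis
    return [d for child in node.children for d in classify(child, answers, ask)]


# Illustrative hierarchy only; the symptom rules are not agronomic knowledge.
Q = {"chlorosis": "Is leaf chlorosis present?", "spindly": "Are the stems spindly?"}
leaf = Node("nitrogen deficiency", Q, [({"chlorosis": "yes", "spindly": "yes"}, 3)])
group = Node("malnutrition", Q, [({"chlorosis": "yes"}, 2)], children=[leaf])
root = Node("wheat disorder", {}, [({}, 3)], children=[group])
```

Calling `classify(root, {}, ask)` with an `ask` function that answers "yes" to both questions returns the single leaf, and each question is put to the user only once because answers are cached in the shared dictionary.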
 

3. Laboratory Evaluation 
 
Laboratory evaluation is conducted before
disseminating the ESs in the field. The laboratory
evaluation methodology consists of three main
procedures, namely verification, validation, and
evaluation. Verification is defined as the
demonstration of consistency, completeness, and
correctness of software (Adrion et al, 1982). O’Keefe
et al. (1987, 1989, and 1990) have defined
verification as "building the system right", that is,
making sure that the implemented system
functionally matches the proposed design and is free
of semantic and syntactic errors. Validation is the
process whereby the system is tested to show that its
performance matches the original requirements of
the proposed system. It is defined as the
determination of the correctness of the final program
or software produced from a development project
with respect to the user needs and requirements
(Adrion et al, 1982). As noted by O’Keefe et al.
(1987, 1989, and 1990), "validation means building
the right system". Evaluation is the process whereby
we ensure the usability, quality, and utility of the ES
(O’Keefe et al. 1987, 1989, and 1990). A complete
testing cycle is performed in iterations, through
which the ES is updated and refined.
 
The verification process evolves through two main
stages during the development of the ES: the
development stage and the examination stage. In the
development stage, the developer exercises the
different functions of the implemented system,
looking for potential errors. This is accomplished
using two broad techniques: non case-based and
case-based. Non case-based techniques include
tracing, spying and other traditional debugging
techniques. Case-based verification techniques are
applied by preparing "typical cases". These cases
should be selected to cover the requirements spelled
out in the requirement specification. In the
examination stage, the ES is tested to make sure that
it runs properly, by exercising all the functions of
the system and examining its performance in
different situations. The output of this
stage is the verification report, a document that lists
the differences between the system design and its
implementation. This report is used to update the
design document and the implementation.
 
The validation step is carried out by holding
meetings with the domain experts who provided the
knowledge, to check that the right system has been
developed. This is done by going through the
generated test cases during the meetings with the
domain experts. Their comments on the content and
the user interface are considered, and the design and
implementation are updated as necessary.
 
The evaluation step assesses the quality, usability,
and utility of the ES from the point of view of human
experts other than the domain expert who
participated in the system development. Typical cases
are created and distributed to three domain experts
in the specialty of a specific subsystem. If a
subsystem includes more than one specialty, cases are
distributed to the experts in each specialty. For
example, the remediation subsystem covers three
specialties: plant pathology, entomology, and
nutrition. Therefore 9 experts participated in
the evaluation of this subsystem. For each specialty,
an evaluator is selected to blindly assess the
responses of the three human experts and the ES.
After the evaluation, the domain expert who
participated in the development, the evaluator, and
the evaluated domain experts take part in an
evaluation meeting together with the knowledge
engineer to discuss the evaluation results until they
reach a consensus.
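
One way to picture the blind assessment is sketched below in Python. The paper does not describe how blinding or scoring was carried out in practice, so the neutral labels (`R1`, `R2`, ...), the per-case scores in [0, 1], and the function names are all hypothetical.

```python
import random
from statistics import mean


def blind(responses, seed=0):
    """Replace respondent names with neutral labels before the evaluator sees them."""
    names = sorted(responses)
    random.Random(seed).shuffle(names)
    key = {f"R{i + 1}": name for i, name in enumerate(names)}
    return {label: responses[name] for label, name in key.items()}, key


def mean_scores(scores_by_label, key):
    """Map the evaluator's per-case scores back to respondents and average them."""
    return {key[label]: mean(scores) for label, scores in scores_by_label.items()}


def ratio_to_best_expert(scores, system="ES"):
    """Express the system's score as a fraction of the best human expert's score."""
    best = max(value for name, value in scores.items() if name != system)
    return scores[system] / best
```

The evaluator only ever sees the labelled responses returned by `blind`; the `key` is kept aside and used after scoring to recover which label was the ES and which were the human experts.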

Applying this methodology to NEPER, verification
and validation were completed successfully. The
remainder of this section presents the evaluation
results of the diagnosis and treatment subsystems.
Figure 1 shows the evaluation scores of the NEPER
diagnosis subsystem. The NEPER diagnosis subsystem
outperforms the human experts in the insect and
malnutrition specialties, and its score in disease
diagnosis (86%) is equivalent to that of the best
human expert.

Figure 1 NEPER Diagnosis evaluation result
(Bar chart comparing the Expert System with three human experts for
Diseases Diagnosis, Insect Identification, and Malnutrition Diagnosis.)

 
The evaluation scores of the NEPER treatment
subsystem are shown in figure 2. NEPER treatment
outperforms the human experts in disease treatment,
and its scores in insect and malnutrition
treatment are 0.95 and 0.85, respectively, of those of
the best expert group. After this experiment, NEPER
has been trained to reach the scores of the best experts.

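Figure 2 NEPER Treatment evaluation result
(Bar chart comparing the Expert System with three human experts for
Diseases Treatment, Insect Treatment, and Malnutrition Treatment.)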
 

4. Field Test Experiment Description 
 
Several experiments have been conducted on the
NEPER ESs in the last few years. The objectives of
those experiments were to validate the system in the
field and to measure the impact of using it. The
experiments were conducted in different locations
by selecting two fields of the same area at each
location: one is cultivated using the NEPER Wheat
recommendations without any interference from the
agriculture engineer or any specialist, and the other
is cultivated as usual and serves as a control field.
 
In order to get the best results from the experiment, 
the following issues and activities were considered 
and followed: 

♦ Formal training on the usage of NEPER was
conducted for the staff who were going to use the
system.
♦ Computer engineers from CLAES were
responsible for supporting the site staff in the
usage of the system and for troubleshooting
hardware and software problems.
♦ A number of wheat researchers from the Field
Crop Research Institute (FCRI) were assigned to
supervise the fields, i.e., one researcher for each
site.
♦ Periodic field visits were conducted by
researchers from CLAES and FCRI.

 
Three experiments were conducted in three different
seasons for the NEPER ES. The first two
experiments each comprised 32 fields carefully
selected for conducting the experiment. The
third comprised 44 fields. The fields were divided
equally, so that 16 fields in each of the
first two experiments and 22 fields in the third
were managed using NEPER, while the other fields
were managed according to the
usual practice and acted as controls. The selected fields
were located in four different geographical areas,
namely Noubaria, Gemiza, Sharkia, and Decerns. In
Noubaria, two sites were selected to cover the
different types of soil in that area. One of those sites
is located in Bostan and the other in
Banger El-Sokar.
 
The first experiment covered only the diagnosis and
treatment part of the NEPER Wheat ES, including
Weed Identification. The second experiment included the
strategic part and the tactical part. Its strategic part includes
six subsystems: Variety Selection, Pre-cultivation
Pest Control, Tillage, Planting, Irrigation
& Fertilization, and Harvest. The third experiment also
included the strategic part and the tactical part. Its strategic
part includes six subsystems: Variety
Selection, Planting, Land Preparation, Irrigation,
Fertilization, and Harvest. The tactical part includes two
subsystems, Diagnosis and Weed Identification,
each of which includes the treatment function.
 

5. Economical Impact 
 
In the first experiment (CLAES, 1996), the averages
of treatment cost, yield, and straw per feddan were
calculated for both the NEPER and the control fields.
From these averages, the average net
income per feddan was found to be 2049.85 LE for
the ES fields and 1600.05 LE for the control fields;
consequently, the net increase was 449.8 LE per
feddan. This represents a 26.78% increase in
production.
 
In the second and third experiments (CLAES, 1999,
CLAES, 2001), the complete system was tested.
Tables (1) and (2) summarize the results of those
experiments in the newly reclaimed area and the Delta
area. The following remarks were observed:
• In both the newly reclaimed and the Delta area, there
was an increase in production and net profit,
consistently in the two consecutive seasons.
• The percentage increase in net profit in
the newly reclaimed area is greater than the percentage
increase in net profit in the Delta area.
• The production in the newly reclaimed area is
less than in the Delta area because of the lack of expertise
in the reclaimed area. Hence expertise transfer in
this area has led to a relatively high impact.
 

6. Environmental Impact 
The conservation of natural resources has two
aspects. The first pertains to the management of
these resources on the macro level, such as
controlling the expansion of urban development in
order not to lose agricultural land. The second is
concerned with the management of these resources
on the micro level, such as adding chemical
fertilizers to the soil. In this paper, the focus is
on the status of water and land resources because
they are the two main resources related to our work
on crop management ESs.
 
Water is the scarcest resource in Egypt, since its
supply is nearly fixed while water demand from different
sectors is continuously increasing. The water supply
can be classified into three categories: surface water,
ground water, and water reused after treatment,
from either agricultural drainage or domestic use.
The decision makers concerned with water resource
management in Egypt are challenged by how to
balance the limited water supply against an increasing
future water demand, since water is the major
constraint on the land expansion needed for food
self-sufficiency. Another challenge is how to reduce the
water pollution resulting from the use of chemical
fertilizers and pesticides. After water, land is the
major limiting factor for sustainable agricultural
development (Rafea, 1996).
 
Decision-makers face two problems in conserving
water resources: the efficient
utilization of water, and the pollution
resulting from the use of chemical fertilizers and
pesticides. Regarding soil conservation, there are two
main problems: urban expansion, and the
soil degradation resulting from excessive use of
fertilizers and other poor agricultural practices.
Therefore, the main contribution of ESs to soil and
water conservation is to transfer agricultural
practices that follow a certain strategy or
combination of strategies, namely environmental
sustainability, economic sustainability and/or social
sustainability. In the ESs that have been built so far,
we are concerned primarily with economic sustainability,
with environmental sustainability taken into
consideration in second place. In other words, we
try to acquire the recommendations that
optimize the output relative to the agricultural
inputs. As a consequence, environmental
conservation is achieved, because no extra input
such as water, fertilizers or pesticides is applied
without a return in the yield.
 
The results of the experiments conducted for the ES
agree with the goals of environmental conservation.
The fields managed by the ES used fewer
resources in terms of fertilizers and pesticides than
the control fields and hence conserve the environment.
Cost is an indicator of the increase or decrease in
the use of chemicals in general. Hence, we have used
cost as a proxy for the quantity of
fertilizers and pesticides used.
 
The average cost of pesticides used in the NEPER
Wheat fields in the first experiment was higher than
in the control fields by 15.7 LE per feddan,
but production increased by 449.8 LE
per feddan. Note that the increase in cost in this
experiment is negligible. In the second experiment,
the average cost of fertilizers and pesticides used in the
NEPER Wheat fields was lower than in the control fields
by 5.57 LE per feddan, and production
increased by 247.4 LE per feddan. In the
third experiment, the average cost of fertilizers and
pesticides used in the NEPER Wheat fields was lower
than in the control fields by 1.2 LE per feddan,
and production increased by 287.12 LE per
feddan. This indicates that changes to
the NEPER Wheat system have made it more
compliant with the goals of resource management and
environmental conservation.
 
In the second experiment of the NEPER Wheat ES, the
average water quantity used to produce one Ardab of
wheat in the ES fields was 112.74 m3, while
in the control fields the farmers used 152.58 m3
on average to produce the same quantity of
wheat. That is, the control fields used about 35% more
irrigation water per Ardab than the ES fields, which
corresponds to a saving of about 26% relative to the
control fields.
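
The size of the water saving depends on which figure is taken as the baseline. The short Python sketch below works through both readings using the two averages quoted above; the function and variable names are illustrative.

```python
def relative_change(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


es_water = 112.74        # m3 of water per Ardab in the ES fields (from the text)
control_water = 152.58   # m3 of water per Ardab in the control fields (from the text)

# Saving of the ES fields, measured against the control fields.
saving_vs_control = -relative_change(es_water, control_water)

# Extra water used by the control fields, measured against the ES fields.
excess_vs_es = relative_change(control_water, es_water)

print(f"ES fields use {saving_vs_control:.1f}% less water than the control fields")
print(f"Control fields use {excess_vs_es:.1f}% more water than the ES fields")
```

With these inputs the first figure is roughly 26% and the second roughly 35%, so a figure of 35% corresponds to taking the ES fields as the baseline.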

Table 1: Results of the experiments in the newly reclaimed area
                      Season 97/98                      Season 98/99
Item                  ES    Control  Difference  %      ES    Control  Difference  %
Average production    1701  1506     195         13     1647  1431     216         15
Average cost          747   861      -114        -13    468   603      -135        -22
Average net profit    954   645      309         48     1179  828      351         42

Table 2: Results of the experiments in the Delta area
                      Season 97/98                      Season 98/99
Item                  ES    Control  Difference  %      ES    Control  Difference  %
Average production    2130  1830     300         16     2117  1759     358         20
Average cost          631   597      34          5      445   422      23          5
Average net profit    1499  1233     265         22     1672  1337     335         25
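
The Difference and % columns appear to be the ES value minus the control value, and that difference expressed as a percentage of the control value. This reading is inferred from the tabulated numbers rather than stated in the text, and a few entries in the Delta table differ from it by one unit, presumably because of rounding in the underlying averages. A small Python sketch using the season 97/98 figures for the newly reclaimed area:

```python
def compare(es, control):
    """Return (difference, percentage of control) for one table row."""
    diff = es - control
    pct = round(diff / control * 100)
    return diff, pct


# Season 97/98, newly reclaimed area: item -> (ES, Control), copied from Table 1.
season_9798 = {
    "Average production": (1701, 1506),
    "Average cost": (747, 861),
    "Average net profit": (954, 645),
}

rows = {item: compare(es, control) for item, (es, control) in season_9798.items()}

for item, (diff, pct) in rows.items():
    print(f"{item:20s} difference={diff:5d}  percent={pct:4d}")
```

The net profit row is also consistent with production minus cost in both columns (1701 - 747 = 954 and 1506 - 861 = 645).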



7. Expert System performance 
The expert system performance has been measured
from three aspects, namely usability, applicability,
and the need for the ES.

7.1 Expert system usability
In order to measure the usability of the ES, the
developers at CLAES re-ran the system on the
cases reported in the forms of the fields managed by
NEPER and compared the conclusions with the
results recorded in the field books by the
researchers and extension agriculture engineers in
the different locations. In the first experiment (CLAES,
1996), examination of the comparison results
showed that the trained researchers used the system
correctly in 86% of the cases, while this
percentage dropped to 38% for untrained
researchers. This indicates the importance of
training in the usage of the ES. It is worth noting
that there was no great difference between the
researchers and the extension officers in using the
system, as the difference was only 4%, although the
system was in English. This demonstrates the importance
of the ES: it raised the performance of extension officers
to the level of researchers in the underlying domain
of NEPER.
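
The usability percentages above are agreement rates between the conclusions obtained by re-running the system and those recorded in the field books. A minimal Python sketch of such a measure follows; the case labels are hypothetical and exact string equality is an assumed matching rule, since the paper does not say how two conclusions were judged to agree.

```python
def agreement_rate(rerun_conclusions, field_book_records):
    """Percentage of cases in which the re-run conclusion equals the recorded one."""
    if len(rerun_conclusions) != len(field_book_records):
        raise ValueError("both lists must describe the same cases")
    if not field_book_records:
        raise ValueError("at least one case is required")
    matches = sum(a == b for a, b in zip(rerun_conclusions, field_book_records))
    return 100.0 * matches / len(field_book_records)


# Hypothetical cases: conclusions from the developers' re-run vs. the field book.
rerun = ["disorder A", "disorder B", "no disorder", "disorder C"]
recorded = ["disorder A", "disorder B", "disorder B", "disorder C"]
rate = agreement_rate(rerun, recorded)
print(f"system used correctly in {rate:.0f}% of the cases")
```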
 
In the second experiment there was a discrepancy
between the ES recommendations and the agricultural
practices documented in the field books. When this
discrepancy was discussed with the ES users, we
found that it was due to their rejection
of the ES recommendations and not to incorrect use of
the system. Therefore, we concluded that the usability
of the system in the second experiment was high.

7.2 Expert system applicability 
Applicability is measured by the extent to which the
ES users applied the ES recommendations. Any
discrepancy counted against applicability must not be
due to poor usability of the system.

In the first experiment, the discrepancy between the
ES results and the practices applied by the users
was due to incorrect use of the system.
 
In the second experiment (CLAES, 1999), it was
difficult to quantify the comparison results, as
the recommendations were sometimes
applied only partially. Hence qualitative measures were
found more appropriate, especially in the strategic
part. The applicability of the modules
Pre-Cultivation Pest Control, Planting, and Weed
Control was found to be low: the users in the Noubaria
fields did not accept the ES recommendations of the
pre-cultivation pest control and planting
modules, and in the weed control module the actual
practice differs from the ES advice. The
applicability of the Tillage and
Fertilization modules is moderate, as the users in the
ES fields did not accept the ES recommendations in about 50%
of the cases. The applicability of the
Diagnosis and Treatment modules was above moderate, as
the advice of the ES field supervisors matches the
advice generated by the ES in 80 to
87.5% of the cases in diagnosis and 50% of the cases in
treatment. The applicability of the Variety
Selection, Irrigation, and Harvest modules is high. In
Variety Selection, the ES recommendations are
compatible with the actual varieties cultivated in the
ES fields. In Irrigation, there is only a 10%
difference in the Delta area. In Harvest, the ES
recommendations are compatible with the actual
practice in the ES fields.

7.3 Need of Expert System  
In order to measure the need for NEPER, a
comparison was made between the advice given
by the researchers and extension workers supervising
the control fields in the experiment locations and the
advice that would have been generated if NEPER had been used.
In the first experiment (CLAES, 1996), examination of
the comparison results showed that the ES
performed better in 76% of the cases, and hence
there is a great need for the ES.
 
In the second experiment (CLAES, 1999), it was
found that there is a high need for the ES modules
Tillage, Irrigation, Fertilization, Diagnosis, and
Treatment. In the Tillage module, the
performance of the ES is better, as none of the control
field supervisors applied laser and plowers
appropriately. In the Irrigation module, it was found
that the ES recommends less water than was
recorded in the control field books. In the
Fertilization module, the ES is
better, as it recommends adequate quantities of
phosphorus and potassium fertilizers whereas some
control fields did not apply these types of fertilizers
at all. In the Diagnosis and Treatment modules, the
performance of the ES is better, as the advice of the
control field supervisors matches the advice generated
by the ES in only 37.5% of the diagnosis cases and
20% of the treatment cases. The experiment
showed that there is a need for such a module,
especially if the treatment part is modified to be
more applicable.
 

8. Expert System Enhancements 
 
According to the results obtained from field testing,
the following enhancements were made:
• Arabic language support was introduced.
• The irrigation module was revised to make it
acceptable to users.
• The user interface became more flexible.
• Basic information about the field and the
environment has been included in the reasoning
(i.e. drainage system, previous crops, water source,
length and width of the field, etc.)
• The variety selection module has been enhanced
to produce the most suitable variety for each field
and a justification for this selection.
• Basin recommendation has been revised
completely.
• The harvest module has been enhanced to
generate real advice about the suitable date to start
the harvest.
 
The following enhancements were also suggested
and will be included in the ES:
• Most of the users were unable to understand
what was meant by some operations, so more
explanation, such as video clips, should be provided.
• Some of the terms are difficult to understand,
e.g., “spindly stem” and “leaf chlorosis”.
Consequently, pictures are necessary for symptoms at
different plant growth stages.
• Currently, the ES is capable of diagnosing severe
nutrition deficiency. However, it is not equally
capable of detecting early stages of nutrition
deficiency. This should be rectified. A very good
example of this is nitrogen deficiency.
• Drought and water logging should be covered
by the system, especially since their symptoms coincide
with the symptoms of nitrogen deficiency and potassium
deficiency.
 

9. CONCLUSION 
 
The work done in this project has revealed and
emphasized the effectiveness and importance of the ES
as a decision support tool for extension services. It
was very clear that the advice given by the ES and that
given by the extension agriculture engineers differ in
quality and consistency.
 
In the meantime, the field experiments showed that
use of the ES has an economic and environmental
impact. Currently there are efforts to disseminate
NEPER nationwide and to make it available on the Internet.
 
The field testing was found to be very useful, as many
aspects of usability, applicability, and need could
not have been identified without it.
NEPER was found to be user friendly and can be
used by both researchers and extension workers. The
recommendations generated by NEPER were
applicable in most of the cases. The cases that were
not accepted by the researchers and extension
workers conducting the experiment were discussed,
and the right recommendations were included in the
successor version. Most of the NEPER modules were
found to be needed. The modules that were found
not to be needed were examined. The result was that
they were not needed by the researchers and extension
workers conducting the experiment, but they are
badly needed by growers and extension workers
in remote locations.
 

REFERENCES 
Adrion, W., Branstad, M., Cherniavsky, J. (1982)
"Validation, Verification and Testing of
Computer Software", ACM Computing Surveys,
Vol. 14, No. 2, 1982.

Chandrasekran, B. (1986). Generic Tasks in 
Knowledge-Based Reasoning: High-Level Building 
Blocks for expert system design. IEEE Expert, 1(3), 
23-30. 

Chandrasekran, B. (1983). Towards a Taxonomy of 
Problem Solving Types. AI Magazine, 4(1),  9-17. 

CLAES (1996) "Validating NEPER Wheat Expert 
System and CERES Wheat Simulation Model", 
Technical report, No: 
TR/CLAES/ATUT(1)/3/96.12, 1996. 

CLAES (1999) "Validating NEPER Wheat Expert 
System - Field testing for season 97/98", Technical 
report, No: TR/CLAES/ATUT(w4)/5/99.2, 1999. 

CLAES (2001) "Validating NEPER Wheat Expert 
System - Field testing for season 98/99", Technical 
report, No: TR/CLAES/ATUT(w4)/10/2001.3, 
2001. 

Nazareth, D. (1989) "Issues in the Verification of
Knowledge in Rule-Based Systems", International
Journal of Man-Machine Studies, Vol. 30, 1989,
pp. 255-271.

O'Keefe, R.M. (1990) "Consultant Report", Report
No-CR-88-024-08, the Expert Systems for Improved
Crop Management project, Project No EGY/88/024,
August 1990.

O'Keefe, R.M., Balci, O., and Smith, E.P. (1987)
"Validating Expert System Performance", IEEE
Expert, Vol. 2, No. 4, Winter 1987, pp. 81-90.

O'Leary, D., O'Keefe, R. (1989) "Verifying and
Validating Expert Systems", Tutorial: MP4,
IJCAI, 1989.

Rafea, A. (1996) "Natural Resources Conservation 
and Crop Management Expert Systems", Workshop 
on Decision Support Systems for Sustainable 
Development, UNU/IIST, Macau. 26 February - 8 
March, 1996.  

Gomez, F., & Chandrasekran, B. (1981). Knowledge
Organization and Distribution for Medical
Diagnosis. IEEE Transactions on Systems, Man,
and Cybernetics, SMC-11(1), 34-42.

Kamel, A., Schroeder, K., Sticklen, J., Rafea, A.,
Salah, A., Schulthess, U., Ward, R. and Ritchie, J.
(1994). Integrated Wheat Crop Management
System Based on Generic Task Knowledge Based
Systems and CERES Numerical Simulation. AI
Applications 9(1): 17-27.