Navigating Online Menus: A Quantitative Experiment

Karen Eliasen, Jill McKinstry, Beth Mabel Fraser, and Elizabeth P. Babbitt

Karen Eliasen is Access Services/Reference Librarian in the Undergraduate Library at the University of Washington Libraries; e-mail: eliasen@u.washington.edu. Jill McKinstry is in the Networked Information Library at the University of Washington Libraries; e-mail: jillmck@u.washington.edu. Beth Mabel Fraser is Universal Access Project Librarian in the Reference and Research Services Division of the University of Washington Libraries; e-mail: bamf@u.washington.edu. Elizabeth P. Babbitt is Reference Librarian at Bellevue Regional Library in the King County (Washington) Library System; e-mail: ebabbitt@kcls.org.

A uniform interface to multiple, varied databases enhances searching and reduces the need for end-user training. The uniformity, however, can make it more difficult to select appropriate resources. This article describes a quantitative study at the University of Washington Libraries that examined the effect of terminology and screen layout on students’ ability to correctly select databases from an introductory screen, the Session Manager. Results indicate a significant improvement in students’ ability to navigate menu screens where terminology is expanded and selections are grouped by type. Implications for other projects, including Web design decisions, are suggested.

As online information proliferates, library users have more difficulty choosing among electronic resources. Welcome screens or menus help users make these decisions, but like all new tools, they require constant, iterative refinement in order to remain useful. This article reports the results of an empirical study at the University of Washington (UW) Libraries that examined two variables affecting the efficient navigation of the libraries’ initial online interface, the Session Manager. Results are evaluated and used as a framework for further investigation. Implications for other projects, including Web-based design decisions, also are suggested.

Background

Willow, the uniform graphical user interface, is the hallmark of online searching at UW.1 The single-search template and the ease with which a single query can be searched in multiple informational databases are popular with library users and have reduced the need for individual database training. Although the uniformity in searching and database appearance has made it easier to access multiple databases and to interpret results, it has made it more difficult to distinguish among electronic resources and to guide users in database selection. Although the Willow interface was a welcome alternative to separate catalog and database interfaces, it was soon evident that for walk-up public terminal use, an introductory screen or program launcher was needed to help users navigate in a graphical environment and to describe the databases available from that terminal. The Session Manager was created to serve this function (see figure 1).

The arrangement of databases on the Session Manager was not a problem initially because there were only three databases: UW Libraries Catalog, MEDLINE, and ERIC. Each database had a separate button. But as databases were added, limited screen space required creating a hierarchy and grouping databases under a single arrangement. The design that evolved maintained separate buttons for the catalogs but grouped the bibliographic databases by subject.
By 1997, the number of databases provided through the uniform Willow interface had grown to more than thirty.

FIGURE 1 The University of Washington Libraries Session Manager

Problem Statement

With the addition of so many new databases to the campus online system, many students were having difficulty locating the database they needed. At the same time, the role of the Session Manager had evolved. The increased importance of the Session Manager as a selection tool made it a part of the navigation process itself. Most users knew about MEDLINE and ERIC, but fewer knew what they would find in Avery, INSPEC, or Grolier’s. Librarians reported users having problems ranging from misinterpretation of the information presented to a lack of understanding of library concepts, such as the difference between a catalog database and an index database. Typical complaints reported by the staff included these three examples:

1. “I have answered many, many questions at the information desk from people who are using Books in Print to look for UW Libraries’ holdings. They just don’t seem to be able to tell the difference between the catalog and Books in Print.”

2. “About half of the questions and misunderstandings we address are based on confusion of the UW catalog with the various computerized indexes. We often have to tell students that the citations they have printed out have no direct relationship to UW holdings, or that they will not find a book on the shelves by browsing the indexes.”

3. “Today, a student wanted to look up books in the UW libraries. Because she was looking for books in the ‘Arts and Humanities,’ that is what she chose in the main menu.”

The first example shows the difficulty students have in differentiating between the content of the two databases, Books in Print and the UW Libraries Catalog. Librarians speculated that students looking for a book latched onto the term books, which is not listed anywhere else in the Session Manager. The second example illustrates a similar problem: students have difficulty understanding the relationship between an abstracting and indexing database and the library catalog. The third example indicates the problem that arises from a subject arrangement of databases. Students looking for UW Libraries books on social science topics might choose the Social Science Databases button instead of the UW Libraries Catalog. In another typical example of confusion, students expect the All Databases button to allow them to search all the libraries’ databases simultaneously. That seems logical, but in fact this button merely takes them to a list of all databases.

Librarians were convinced that the Session Manager could be altered in a way that would prevent students from making such mistakes. The anecdotal information made it clear that the Session Manager often provided little navigation assistance to students. Along with the complaints came many suggestions from library staff for changes to the Session Manager. However, the stories did not give much insight into what changes would be most effective. Reluctant to use patrons as guinea pigs with a constantly changing interface, the libraries decided to gather information that would give a clearer direction for improvement. The Session Manager Evaluation Committee was created to collect data on how students interact with the Session Manager.
Design of the Session Manager Study

After reviewing the current literature related to the problem and meeting with other public services librarians to gather more details and hypotheses, the committee began the process of designing an experiment and creating a questionnaire to test hypotheses about the Session Manager.2 Most of the perceived problems of navigation could be grouped into two large categories: terminology and layout. With an eye toward future interface evolution, the group decided to focus on these two issues of underlying significance. This decision guided development of the questionnaire and the research method.

In designing the research outline, the committee concentrated on a single problem statement: What is the effect of terminology and screen layout/grouping on students’ ability to correctly select databases from a Session Manager? Table 1 shows the variables and levels defined and identified by the group.

TABLE 1
Research Design Showing Variables and Their Levels

Variable Type            Variable Name(s)                 Levels or Attributes
Independent              (1) Terminology                  (1) Existing terms, new terms
                         (2) Layout/grouping              (2) Existing layout, new layout
Moderator or co-variate  Training level                   Library instruction (yes/no)
Dependent                Success in selecting databases   Scores on a 15-point
                         from the Session Manager         identification test
Control                  170 undergraduate students

The committee created low-fidelity paper prototypes of the Session Manager. In order to judge the effects of terminology, the information on each button was increased to become more descriptive of the content. The text surrounding the buttons also was altered to be more descriptive. To test layout/grouping, information on the initial selection screen was grouped according to type of information (catalogs, indexes, etc.) rather than subject category. In addition, training level was introduced as a variable in order to judge the impact of library instruction on participants. A clearly quantifiable dependent variable was chosen: the students’ success in selecting databases from the libraries’ Session Manager, where “success” was determined by scores on a fifteen-point identification test.

FIGURE 2 Low-fidelity Prototype of Existing Session Manager, Version A

The committee also tried to anticipate threats to design validity in a way that would help to avoid common statistical errors. Similar empirical study designs were reviewed to judge, for example, whether the low-fidelity paper prototypes would provide an appropriate test instrument for an online interface. To ensure internal validity, the committee selected questions that would test the use of the Session Manager rather than reflect outside knowledge of the topics. This meant that the questions had to elicit basic, navigational decisions. Because the questions were based on actual public-service examples, this was easier to manage. A more difficult consideration was that in order to avoid a ceiling effect, questions had to be chosen that were not too simple. Fortunately, there was a group of questions that were common, but complex, enough to meet both requirements. To make the results more applicable outside the existing Session Manager structure, and thus to preserve external validity, underlying design concepts rather than isolated interface problems were targeted. To refine the questionnaire, a sample test was administered to a group of twelve undergraduate students.
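Read as a specification, Table 1 amounts to a small design matrix: two manipulated screen attributes, one moderator, and a single numeric outcome. A minimal sketch of that structure follows; the dictionary keys are this sketch’s own shorthand, not terminology from the study materials.

```python
# Illustrative encoding of the research design in Table 1.
# Key names are this sketch's shorthand, not study terminology.
DESIGN = {
    "independent": {
        "terminology": ["existing terms", "new terms"],
        "layout_grouping": ["existing layout", "new layout"],
    },
    "moderator": {
        "training_level": ["library instruction", "no library instruction"],
    },
    "dependent": "score on a 15-point identification test",
    "subjects": 170,  # undergraduate students
}
```

As the next section explains, each questionnaire version altered at most one of the independent variables, so the study compared a baseline screen against two single-factor alterations rather than crossing all levels in a full factorial design.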
Questionnaire

The test questions were drawn from a group of actual information requests reported by public-service staff: being able to distinguish library catalog databases from index databases; selecting appropriate databases from subject categories; locating campus information sources such as directories and class schedules; and accessing the Internet.

The questionnaire was divided into three parts. The first part provided information about the subjects to make sure that the person filling out the questionnaire matched the authors’ established user profile (undergraduate students) and to determine whether he or she had received any library instruction in the past two years. The questionnaire also asked whether students spoke English as their first language (although this factor was later determined to be insignificant in the overall results). The next part of the questionnaire tested the Welcome screen of the Session Manager, where initial navigation decisions are made. The final part tested the selection of databases from the All Databases screen, where secondary database selection occurs.

Three variations of the Session Manager introductory and secondary screens were developed to be used with the standard set of questions. Version A duplicated the current Session Manager, version B maintained the current layout but expanded the terminology, and version C maintained the current terminology but changed the layout by grouping resources according to content (see figures 2, 3, and 4).

FIGURE 3 Low-fidelity Prototype of Existing Session Manager with Expanded Terminology, Version B

Administration of the Questionnaire

The questionnaire was administered to 170 undergraduate students. With the helpful approval of two faculty members, large undergraduate social science lecture classes were targeted. As students entered the classrooms, they were randomly assigned version A, B, or C of the questionnaire and were given ten minutes to complete it. The testing took place during class time; no compensation was given to participants. Twenty-five subjects who were outside the established user profile, or who filled out less than one-third of the questionnaire, were eliminated from the study. This created unequal subject cell sizes for the three versions of the questionnaire: A = 55, B = 48, C = 42. Also, because subjects had been randomly assigned, there was an uneven distribution of students who had participated in library instruction. The choice and analysis of the statistics take these inequalities into account. Moreover, a test was run to be sure there was no significant difference between the test results of the two classrooms of students tested.

When scoring student questionnaires, the committee recognized that some questions had more than one correct answer. For every question, the committee decided ahead of time on the “best” answer and all the “right” answers. The “best” answer was defined as the most direct. The “right” answers would all lead to appropriate information but would require more steps; “right” answers were less efficient than “best” answers. “Wrong” answers were defined as choices that would not retrieve information useful for answering the question. In scoring the questionnaire, the “best” and “right” answers were tallied and compared. Students did not select significantly more “right” answers than “best” answers on any version of the questionnaire.
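To make the tallying concrete, here is a minimal sketch of the scoring scheme described above. The answer keys are hypothetical stand-ins (the article does not publish the actual questions or keys), and the sketch assumes that every “best” answer also counts as a “right” answer, consistent with the definitions given.

```python
# Sketch of the best/right/wrong scoring described above.
# BEST_ANSWER and RIGHT_ANSWERS are hypothetical keys for illustration;
# the study's actual questions and keys are not published in the article.
BEST_ANSWER = {
    1: "UW Libraries Catalog",
    2: "MEDLINE",
}
RIGHT_ANSWERS = {  # includes the "best" choice plus less direct routes
    1: {"UW Libraries Catalog", "All Databases"},
    2: {"MEDLINE", "Health Sciences Databases"},
}

def score(responses):
    """Tally "best" and "right" answers for one questionnaire.

    `responses` maps question number -> the button the student selected.
    Anything outside RIGHT_ANSWERS counts as "wrong" and earns no points.
    """
    best = sum(1 for q, a in responses.items() if a == BEST_ANSWER.get(q))
    right = sum(1 for q, a in responses.items() if a in RIGHT_ANSWERS.get(q, set()))
    return best, right

# Example: the catalog is the most direct route for question 1, while the
# subject-group button is a workable but slower route for question 2.
print(score({1: "UW Libraries Catalog", 2: "Health Sciences Databases"}))
# -> (1, 2)
```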
All results discussed in this article are based on the “right” answer scores.

Results and Discussion

A statistical analysis of variance (ANOVA) was used to determine whether screen image had any effect on the competency test scores. The results showed a main effect for screen image. A statistical (Scheffe) test showed that the difference in scores on screen image testing was significant in the case of both terminology and layout/grouping; students who filled out questionnaires with altered layout or altered terminology scored significantly better than those who used the existing Session Manager.3 There was no significant difference between the scores of the students filling out versions B and C of the questionnaire; layout/grouping and terminology affected scores similarly and positively.

FIGURE 4 Low-fidelity Prototype of Existing Session Manager with Altered Groupings/Layout, Version C

The results of the ANOVA also showed a main effect for training. The subjects who had attended a UW Libraries instruction class in the past two years had a significantly higher mean score. These students not only scored higher on the existing Session Manager but also outscored the students without library instruction on the two altered versions. The results of the study upheld the committee’s hypothesis that grouping resources and assigning concrete, descriptive labels help undergraduates, especially those with basic library instruction, to make more efficient navigation decisions. Concretely, this means that in order for undergraduates to make efficient choices in a complex information environment, more descriptive text is necessary. In addition, grouping online resources by content rather than by subject provides a more effective navigational structure. These results already have been used in designing the new version of the database selection screen for the uniform interface and for current Web development. Because hypertext navigation consists of grouping information, and often leads to design decisions about the length and explicitness of accompanying text, the results of the Session Manager study have helped guide the creation of Web-based tools.

In gathering the data necessary to address the specific problem statement, the committee also hoped to find practical answers to the following questions: Where are students making the most mistakes using the current Session Manager? What effect do screen changes have on initial and secondary navigation choices? What design elements and terminology choices deserve future investigation? Students consistently made errors on certain questions. For example, only 21.8 percent of the students using the existing Session Manager could correctly find the location of the New York Times. Even fewer, 18.4 percent, could determine where to look for a book at the Health Sciences Library. Students also showed a greater ability to make primary, rather than secondary, screen navigation decisions. Although the average score on the first part of the existing Session Manager test was 77 percent, the average score on the second part, dealing with secondary database selection, was only 54 percent. Deciding why these results occurred lies outside the scope of the present research design, but the results point to areas for further examination.
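For readers who want to run a comparable analysis, the sketch below reproduces the shape of the screen-image comparison using SciPy with synthetic scores, since the study’s raw data are not published. SciPy offers no Scheffe test, so Tukey’s HSD (available in recent SciPy versions) stands in for the post-hoc comparison; the committee’s actual analysis used StatView and SPSS and also included training level as a factor, which this one-way sketch omits.

```python
# One-way ANOVA across the three questionnaire versions, with a post-hoc
# pairwise comparison. Scores are synthetic placeholders drawn to match the
# study's cell sizes (A = 55, B = 48, C = 42), not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1997)
version_a = rng.normal(loc=9.0, scale=2.0, size=55)   # existing screen
version_b = rng.normal(loc=11.0, scale=2.0, size=48)  # expanded terminology
version_c = rng.normal(loc=11.0, scale=2.0, size=42)  # regrouped layout

# Main effect of screen image on the 15-point test scores.
f_stat, p_value = stats.f_oneway(version_a, version_b, version_c)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4g}")

# Pairwise post-hoc comparison (Tukey's HSD here; the study used Scheffe).
print(stats.tukey_hsd(version_a, version_b, version_c))
```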
Conclusions and Recommendations

Now that the committee has outlined two basic ways in which the presentation and description of databases can be improved, and has identified places where undergraduates are making mistakes using the current Session Manager, it has begun to focus on finding out more about why these mistakes occur. A rigorous usability study, using think-aloud protocols, timed-task analysis, and a posttest debriefing, will help explain the current results. Sample testing already has revealed interesting patterns of user navigation behavior. In future quantitative and qualitative studies, the authors plan to continue to test and observe user behavior in the evolving online environment.

Notes

1. For more information on the early development of Willow, see Ketchell, Fuller, Freedman, and Lightfoot, “Collaborative Development of a Uniform Graphical Interface,” Proceedings of the Annual Symposium on Computer Applications in Medical Care (1992): 251–55; and Freedman, “The University of Washington Information Looker-Upper Layered over Windows,” X Resource 14 (Apr. 1995): 13–31. To view Willow and find out more about its current development, see http://www.washington.edu/willow/.

2. Relevant research articles were found by searching the literature of many different disciplines, including library science, technical communication, cognitive psychology, human factors, and computer science. Related bibliographic information can be found in Karen Eliasen, “Research Methodologies: A Complementary Approach in Library and Information Science” (thesis, University of Washington, 1996).

3. The test scores were examined and compared using StatView statistics software on a Macintosh computer and SPSS statistics software on a PC. The main effects reported here are based on ANOVA results with significance values p < .01 and comparison of means with standard deviations < 2.12. The Scheffe test was significant at p < .05.