Communications

Robert N. Bland and Mark A. Stoffan

Returning Classification to the Catalog

The concept of a classified catalog, or using classification as a form of subject access, has been almost forgotten by contemporary librarians. Recent developments indicate that this is changing as libraries seek to enhance the capabilities of their online catalogs. The Western North Carolina Library Network (WNCLN) has developed a “classified browse” feature for its shared online catalog that makes use of Library of Congress classification. While this feature is not expected to replace keyword searching, it offers both novice and experienced library users another way of identifying relevant materials.

Classification to modern librarians is almost exclusively a tool for organizing and arranging books (or other physical media) on shelves. The role of classification as a form of subject access to collections through the public catalog (the concept of the classified catalog) has been almost forgotten. From a review of the literature, it does not appear that any major U.S. library has supported a classified catalog since Boston University Libraries closed its classified catalog in 1973.1 To be sure, nearly all online catalogs nowadays have some form of what is called a “call number search” or a “shelf list browsing capability” that is based on classification, but this is a humble and little-used feature because it requires that a call number (or at least a call number stem) be known and entered by the user, when no verbal index to the classification is available online. This search methodology provides nothing in the way of the systematic and hierarchical arrangement and display of subject classes, complete with accompanying verbal descriptions, that the classified catalog seeks to provide.
As Karen Markey put it in her recent review of classification and the online catalog, “To this day, the only way in which most end users experience classification online is through their online catalog’s shelf list browsing capability.”2

There are signs that this situation is changing. The recently released Endeca-based catalog at North Carolina State University Libraries uses Library of Congress Classification (LCC) in a prominent way to provide for browsing of the collection without the user needing to enter any search terms at all.3 The LCC outline is presented on the main search entry screen with verbal captions describing the classes, allowing users to navigate through several layers of the outline and to retrieve, with a click of the mouse, bibliographic records for materials assigned to those classes. Conversely, the new online catalog being developed by the Florida Center for Library Automation uses LC classification as a kind of back end to keyword searching. Following a keyword search, a user can limit the results set by confining it to a designated LCC range, again chosen from an online display of the LCC outline.4 Both of these catalogs use three levels of the LCC outline: the most general single-letter classes (Q for sciences, for example), the two-letter classes for more specific subjects (QC for physics, QD for chemistry), and, at an even finer granularity, designated numeric ranges within the two-letter classes that identify specific subdisciplines (QD241–QD441 for organic chemistry).
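The two uses of the outline described above (drilling down through its three levels, and confining a keyword result set to a class range) can be illustrated with a minimal Python sketch. It is not the NCSU or FCLA implementation. The `OUTLINE` structure, the function names, and the sample titles and call numbers are all hypothetical, and only the classes named in the text are included.

```python
import re
from typing import List, NamedTuple, Optional


class LccClass(NamedTuple):
    letters: str   # class letters, e.g. "QD"
    number: float  # class number, e.g. 273.0


# Leading letters and number of an LC call number; Cutter and date are ignored.
_CLASS_RE = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")

# Hypothetical in-memory outline holding only the classes named in the text.
OUTLINE = {
    "Q": {
        "caption": "Sciences",
        "subclasses": {
            "QC": {"caption": "Physics", "ranges": []},
            "QD": {"caption": "Chemistry",
                   "ranges": [(241, 441, "Organic chemistry")]},
        },
    },
}


def parse_class(call_number: str) -> Optional[LccClass]:
    """Return the class letters and number of a call number, or None."""
    match = _CLASS_RE.match(call_number.upper())
    if match is None:
        return None
    return LccClass(match.group(1), float(match.group(2)))


def outline_path(call_number: str) -> List[str]:
    """Return the outline captions (up to three levels) covering a call number."""
    parsed = parse_class(call_number)
    if parsed is None:
        return []
    top_letter = parsed.letters[0]
    top = OUTLINE.get(top_letter)
    if top is None:
        return []
    path = [f"{top_letter} {top['caption']}"]
    sub = top["subclasses"].get(parsed.letters)
    if sub is None:
        return path
    path.append(f"{parsed.letters} {sub['caption']}")
    for low, high, caption in sub["ranges"]:
        if low <= parsed.number <= high:
            path.append(f"{parsed.letters}{low}–{parsed.letters}{high} {caption}")
            break
    return path


def in_range(call_number: str, letters: str, low: float, high: float) -> bool:
    """True if the call number falls inside the given class range."""
    parsed = parse_class(call_number)
    return (parsed is not None and parsed.letters == letters
            and low <= parsed.number <= high)


# Hypothetical keyword-search hits as (title, call number) pairs.
hits = [
    ("Organic chemistry handbook", "QD251.2 .S65 2005"),
    ("Quantum physics primer", "QC174.12 .G75 1995"),
    ("Organic electrochemistry", "QD273 .O73 2001"),
]

# Limit the result set to the organic chemistry range of the outline.
organic = [title for title, call_no in hits if in_range(call_no, "QD", 241, 441)]
path = outline_path("QD273 .O73 2001")
```

Here `organic` keeps only the two QD titles, and `path` lists the three outline levels that a user would click through to reach QD273.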
Robert N. Bland (bland@unca.edu) is Associate University Librarian for Technical Services, University of North Carolina at Asheville. Mark A. Stoffan (mstoffan@fsu.edu) is Associate Director for Library Technology at Florida State University, Tallahassee.

Information Technology and Libraries | June 2008 / September 2008

Figure 1. Level 1 of LC Classification in WNCLN WebPac

The Western North Carolina Library Network (WNCLN) has been experimenting with classification as a retrieval tool in the public catalog for some time,5 and it has just implemented the first version of what we call a Classified Catalog Browse in our Innovative Millennium system.6 Like the two catalogs just mentioned, the Classified Catalog Browse is based on software that is external to the ILS software and integrated with that software through linking and webpage designs. Also like those catalogs, it is based on scanning and incorporating into the catalog the LCC outlines as published by the Library of Congress. The WNCLN catalog goes a step further, however, in bringing the entire LC classification online down to the individual class number level, or at least that portion of the classification that is actually used in our catalog. This is done by extracting class numbers and associated subject headings from bibliographic and authority records in our catalog and building an online classification display with descriptive captions (a verbal index) from these records. The result is a hierarchical display (to continue the example from above) not only of QD241–QD441 for organic chemistry but, within this range, of QD271 for chromatographic analysis, QD273 for organic electrochemistry, and so on.
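The extraction step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WNCLN program. It reads bibliographic records only (the article says authority records are analyzed as well), it assumes each record lists its primary subject heading first, and it applies the rule stated later in the article that the heading used most often as a primary subject generally becomes the caption. The record layout, the function names, and the sample records are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Leading class letters and number of an LC call number; the Cutter is ignored.
_CLASS_RE = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def class_number(call_number):
    """Return the class number stem (e.g. "QD271") or None if unparseable."""
    match = _CLASS_RE.match(call_number.upper())
    return f"{match.group(1)}{match.group(2)}" if match else None


def build_verbal_index(records):
    """Map each class number in use to a caption and an item count.

    Each record is a dict with 'call_number' and 'subjects'; the first
    entry in 'subjects' is assumed to be the primary subject heading.
    """
    primary_headings = defaultdict(Counter)
    item_counts = Counter()
    for record in records:
        cls = class_number(record["call_number"])
        if cls is None:
            continue
        item_counts[cls] += 1
        if record["subjects"]:
            primary_headings[cls][record["subjects"][0]] += 1

    index = {}
    for cls, headings in primary_headings.items():
        # The most frequently used primary heading becomes the caption.
        caption, _ = headings.most_common(1)[0]
        index[cls] = {"caption": caption, "items": item_counts[cls]}
    return index


# Hypothetical bibliographic records.
records = [
    {"call_number": "QD271 .A1 1990", "subjects": ["Chromatographic analysis"]},
    {"call_number": "QD271 .B2 1998", "subjects": ["Chromatographic analysis",
                                                   "Gas chromatography"]},
    {"call_number": "QD271 .C3 2003", "subjects": ["Gas chromatography"]},
    {"call_number": "QD273 .O73 2001", "subjects": ["Organic electrochemistry"]},
]

index = build_verbal_index(records)
```

With these sample records, QD271 receives the caption "Chromatographic analysis" (two primary uses against one) and an item count of 3. Classes that never appear on a record never enter the index, which matches the article's point that only the portion of the classification actually used in the catalog is displayed.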
The design of our interface presents these individual class numbers as a fourth level to which the user can “drill” down, beginning with Q for sciences, then QD for chemistry, QD241–QD441 for organic chemistry, and finally QD271 for chromatographic analysis (figures 1–4). From this fourth level, the user can click an associated link to execute a search of the catalog by the class number in question using the call number search function of the ILS (figure 5); a second link for that class number will present the same list of titles but sorted by “Most Popular” (i.e., the items that have been checked out most frequently) from a separate but linked external database (figure 6); a third link will search the catalog by the associated subject heading for the class (figure 7); and finally a fourth link will show other subject headings that have been used in the catalog with this specific class number (figure 8).

Figure 2. Level 2 of LC Classification in WNCLN WebPac

Figure 3. Level 3 of LC Classification in WNCLN WebPac

What does having the LC classification online in our catalog accomplish for our users? Part of the point of our project is to answer this very question. Chan and others7 have theorized that incorporation of the classification system into the catalog as a retrieval tool can provide enhanced subject access that is not possible through standard alphabetical subject headings and keyword searching alone. Early studies by Markey and others at OCLC seem to have confirmed this with an online version of the Dewey Decimal Classification.8 Since (as far as we know) the Library of Congress classification has not really been tested as an online retrieval tool in a live catalog up to now, our implementation will serve as a kind of test bed for this hypothesis. How actual users in fact exploit this feature is, of course, something that only experience will tell.
A cursory look, however, would seem to indicate definite advantages to this approach. First of all, many studies indicate that two of the major sources of failure with subject retrieval in online systems are misspellings and poor choice of search terms by users. No matter how far we may try to go with keyword searching and relevance ranking, no online library retrieval system is likely to do much with “Napolyan’s fites” when what the user is looking for are books on the military campaigns of the Emperor Napoleon. With the classification system and verbal index online, most of these problems are eliminated, since users can navigate to a subject of choice without ever entering a search term. Moreover, given the design of the verbal index based on Library of Congress subject headings, the user is led to actual subject headings used in the catalog, which should provide for precise retrieval beyond what is ordinarily possible with keywords even when entered correctly, and (importantly) a retrieval set that is always greater than zero. The infamous and frustrating problem of “no hits” is eliminated.

Figure 4. Level 4 of LC Classification in WNCLN WebPac

Figure 5. Call number search display in WNCLN

Secondly, the great attraction of the classified catalog approach is that it arranges subjects in a hierarchical fashion based on integral connections among the topics in a way that cannot be accommodated in an alphabetic subject approach because of the vagaries of spelling. The topics “Violence,” “Social Conflict,” and “Conflict Management,” for example, obviously spread out in an alphabetical subject list, are collocated in the classified catalog under the class “HM1106–HM1171 Interpersonal Relations” (figure 9), allowing the user to find references to materials all in one place in the catalog, just as the classification system arranges the books on these subjects all in one place on the library shelves.
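The collocation effect can be shown with a toy Python comparison of the two orderings. The three headings and the HM1106–HM1171 range come from the text. The individual class numbers assigned to each heading, and the two unrelated headings added for contrast, are assumptions made only for illustration.

```python
# (subject heading, class letters, class number); the class numbers are
# illustrative assumptions, not values taken from the WNCLN catalog.
entries = [
    ("Violence", "HM", 1116),
    ("Chemistry, Organic", "QD", 251),
    ("Social conflict", "HM", 1121),
    ("Physics", "QC", 21),
    ("Conflict management", "HM", 1126),
]

topics = {"Violence", "Social conflict", "Conflict management"}


def is_collocated(ordering, wanted):
    """True if the wanted headings occupy consecutive positions in the ordering."""
    positions = [i for i, entry in enumerate(ordering) if entry[0] in wanted]
    return len(positions) == len(wanted) and positions[-1] - positions[0] + 1 == len(wanted)


# Alphabetical subject list: sorted by the wording of the heading.
alphabetical = sorted(entries, key=lambda entry: entry[0].lower())

# Classified list: sorted by class letters, then class number.
classified = sorted(entries, key=lambda entry: (entry[1], entry[2]))

scattered = not is_collocated(alphabetical, topics)
together = is_collocated(classified, topics)
```

In the alphabetical ordering the three headings are interrupted by an unrelated entry, while in the classified ordering they sit side by side under the HM range.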
Alphabetical subject indexes, of course, attempt to ameliorate this scattering of related topics by means of cross references, but there is clearly a limit to how far one can go with this approach.

Finally, the classified catalog provides an efficient way for collection development staff to review specific subject areas and to make better informed purchasing decisions regarding the collections. In the WNCLN design, the classes at the bottom level of the hierarchy are linked to the catalog by call number and subject headings, and each class carries an indication of the number of items assigned that class number. The classes are also linked to an external database that shows the frequency of circulation of items in the class as well as title and date of publication. A quick review of this list can inform a bibliographer of circulation rates as well as the currency of materials in the class.

As mentioned, the captions that are displayed with the LCC hierarchy in the WNCLN catalog are extracted from subject headings and authority records present in our catalog. Readers familiar with LC MARC record services may wonder why we took this approach to building the verbal index rather than using the information available in the LC MARC classification records. Machine-readable records for LC classification are now available in MARC format. These files include records for each individual class number with a corresponding verbal caption. While we did experiment with using these files, cost and complexity led us to go in another direction. The LC classification files are huge, containing hundreds of thousands of classification numbers that we do not now and probably never would use in our WNCLN catalog, simply because we (unlike LC) have no materials on these subjects.
While these records could be filtered out by matching against LC class numbers that are found in our catalog and discarding non-matches, this would add yet another level of processing to an already complex process, as would handling the LC table subdivisions that are used in the LC schedules and that are separate from the standard class numbers. Secondly, the LC MARC classification files require a subscription costing several thousand dollars per year, as well as a substantial payment for the retrospective file needed to begin building the database of class numbers.

On the other hand, extracting the verbal index from subject headings and authority records in our own catalog adds no cost to our processing. These headings and authority records are created and maintained, of course, as a standard part of the cataloging process, and accordingly only headings and authority records that match materials owned by our libraries are included. The description or caption that is finally assigned to a class number is determined by a computer program that analyzes both authority records and bibliographic records found in our catalog that are assigned the class number in question, with the subject heading that is used most frequently as a primary subject generally being the one selected as the caption for the class. These class numbers with associated subject headings are then processed by another program, which builds HTML files representing the classification with links to the catalog and to the external “Most Used” database mentioned above. These standard HTML files, along with the files representing the first three levels of the LCC outline, are then loaded onto our Web server to display the classification system online.

Figure 6. Most used titles display

Figure 7. Subject search display in WNCLN

Figure 9.
Collocation of terms in the classified catalog

Figure 8. Related subjects display in WNCLN

A second advantage of this approach is that using the actual subject heading as the caption or description for the class makes it possible to use that caption as a direct link to a subject search in the catalog, as shown in the illustration in figure 4. A disadvantage is that the captions from the LCC files are designed to retain the hierarchy that is represented visually in the printed schedules by formatting and indenting. Captions derived from subject headings do not retain this feature. We have tried to accommodate this in our display of the schedules by replicating the class number ranges from the outline in the appropriate place in the full display of the schedules, thereby building a hierarchy from these ranges as genus and the individual class numbers as species. This does not manage to retain the full hierarchy of the LC schedules as shown in the printed schedules or as represented in LC’s online Classification Web product, but it is, we hope, an adequate surrogate for the purpose intended. In fact, in most cases, the captions derived from the extracted subject and authority headings match quite nicely the captions included in the actual LCC schedules, as shown in a comparison from the psychology classification of the hierarchy as it appears in our Classified Catalog Browse and as it appears online in LC’s Classification Web product (figures 10 and 11). What is missing in our representation of the classification is not so much the subject content of the classes but the notes and information about literary form that are included in the actual LCC schedules. Thus, our LCC online is not a strict image of the LCC as it would appear in printed or electronic form based on the hierarchies and captions devised by the LC.
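The genus and species arrangement described above can be sketched as a simple nesting step. This is a minimal illustration, not the WNCLN program, which produces HTML files rather than the plain text lines shown here. The function names and tuple layouts are hypothetical. The QD range and captions come from the article, and the QC entry is an assumed extra class included to show what happens when no outline range covers a class number.

```python
def nest_under_ranges(ranges, classes):
    """Place each class number under the outline range that contains it.

    ranges:  list of (letters, low, high, caption) taken from the outline
    classes: list of (letters, number, caption) taken from the catalog
    Returns (tree, unplaced), where tree is a list of dicts with 'range'
    and 'classes' keys and unplaced holds classes that no range covers.
    """
    tree = [{"range": r, "classes": []}
            for r in sorted(ranges, key=lambda r: (r[0], r[1]))]
    unplaced = []
    for cls in sorted(classes, key=lambda c: (c[0], c[1])):
        for node in tree:
            letters, low, high, _ = node["range"]
            if cls[0] == letters and low <= cls[1] <= high:
                node["classes"].append(cls)
                break
        else:
            # No outline range contains this class number.
            unplaced.append(cls)
    return tree, unplaced


def render(tree):
    """Render the hierarchy as indented text: range as genus, class as species."""
    lines = []
    for node in tree:
        letters, low, high, caption = node["range"]
        lines.append(f"{letters}{low}–{letters}{high} {caption}")
        for cls_letters, number, cls_caption in node["classes"]:
            lines.append(f"    {cls_letters}{number} {cls_caption}")
    return lines


ranges = [("QD", 241, 441, "Organic chemistry")]
classes = [
    ("QD", 273, "Organic electrochemistry"),
    ("QD", 271, "Chromatographic analysis"),
    ("QC", 21, "Physics"),  # assumed class, outside every range above
]

tree, unplaced = nest_under_ranges(ranges, classes)
display = render(tree)
```

The rendered lines show the range caption followed by its member classes in class number order. As the article notes for its own display, a nesting of this kind has only two levels and does not reproduce the deeper indentation of the printed schedules.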
Despite our terminology, our Classified Catalog Browse is also not a true classified catalog, since only one classification (that used in the call number) is assigned to each item, whereas in a true classified catalog multiple classifications may be assigned to an item. It is nevertheless an online presentation of the LCC with links to our catalog that seeks to enhance subject access by exploiting the power of the classification system to organize materials by integral subject classes and to show relationships among subjects by a hierarchical arrangement of classes as genus, species, and subspecies. And, perhaps just as importantly, it is an implementation that requires no additional cataloging effort on the part of our staff, nor any additional costs for data or processing other than the investment we have made in development of the software and the small amount of time required weekly to update the files.

We do not expect that the Classified Catalog Browse will replace keyword or subject searching as the primary means of subject access to our collections. We do believe that it promises to be a powerful and effective complement to our standard ILS searches that may improve subject searching for both the novice and the experienced user.

References

1. Margaret Hindle Hazen, “The Closing of the Classified Catalog at Boston University,” Library Resources and Technical Services 18 (1974): 221–26.
2. Karen Markey, Joan S. Mitchell, and Diane Vizine-Goetz, “Forty Years of Classification Online: Final Chapter or Future Unlimited?” Cataloging and Classification Quarterly 42 (2006): 1–63.
3. North Carolina State University Libraries, “NCSU Libraries Online Catalog,” North Carolina State University, www.lib.ncsu.edu/catalog (accessed Mar. 23, 2007).
4.
Florida Center for Library Automation, “State University Libraries of Florida–Endeca,” Board of Governors, State of Florida, http://catalog.fcla.edu (accessed Mar. 23, 2007).
5. The Western North Carolina Library Network is a consortium consisting of the libraries of Appalachian State University, the University of North Carolina at Asheville, and Western Carolina University.
6. Western North Carolina Library Network, “Library Catalog,” Western North Carolina Library Network, http://wncln.wncln.org (accessed Mar. 23, 2007).
7. Lois Mai Chan, “Library of Congress Classification as an Online Retrieval Tool: Potentials and Limitations,” Information Technology and Libraries 5 (1986): 181–92.
8. Karen Markey and Anh Demeyer, Dewey Decimal Classification Online Project: Evaluation of a Library Schedule and Index Integrated into the Subject Searching Capabilities of an Online Catalog: Final Report to the Council on Library Resources (Dublin, Ohio: OCLC, 1986), Report no. OCLC/OPR/RR-86/1.

Figure 10. Class captions in the WNCLN WebPac

Figure 11. Class captions in LC’s Classification Web